Progressive Virtual Full Backups – The Best of Most Worlds!

Data is one of the most critical assets an organization needs to survive and grow. Therefore, it’s vital that the information that matters most is reliably backed up in a way that makes it quickly available should something happen to the primary source. Progressive Virtual Full (PVF) Backups provide an efficient way to safeguard data more reliably, easily, and cost-effectively than most traditional methods.

It should be noted that backup strategies come in many flavors, and there is no “one size fits all” solution. There are trade-offs based on a number of variables such as budget, speed requirements, retention periods, and restore objectives. Also, be sure not to confuse the snapshots your cloud provider might offer with a comprehensive file-based backup strategy. Before we begin, let’s take a look at some commonly used methodologies.

Full Periodic Backups

This is the most basic technique. One just stores an entire (full) copy of a dataset to another storage device or media at the desired frequency. For example, you would completely back up your data every week or every day. Full backups are helpful on the restore side because all the data is in one place. The disadvantages are long backup times, the storage space needed, and inflexibility compared to other methods.

Differential Incremental

Often referred to simply as incremental backups, this method starts with a full backup. From then on, only the data that changes each following day (or desired period) gets copied. For example, if you were on a weekly rotation, a full backup would be done on Sunday night. On Monday night, an incremental backup would be performed that only captures the changes for that day. On Tuesday night, another backup would be done to reflect changes on Tuesday, and so on. At the end of the week, you would have the full backup, plus separate “incrementals” for each day from Monday through Saturday evening. An issue with this approach is that a restore needs the full backup plus every incremental in order, so if any one of the backups becomes damaged, you can only do a full restore to the point in time just before it.
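To make the Sunday-to-Tuesday example concrete, here is a minimal Python sketch. It assumes a dataset can be modeled as a dictionary of file names and versions, which is a simplification for illustration only; the function name and the sample files are hypothetical, and deleted files are ignored.

```python
def differential_incremental(previous_state, current_state):
    """Return only the files that changed since the previous backup,
    whether that backup was the full or an earlier incremental.
    (Hypothetical helper; deletions are not tracked in this sketch.)"""
    return {
        path: version
        for path, version in current_state.items()
        if previous_state.get(path) != version
    }

# Illustrative dataset states at the end of each day.
sunday = {"a.txt": "v1", "b.txt": "v1"}
monday = {"a.txt": "v2", "b.txt": "v1"}
tuesday = {"a.txt": "v2", "b.txt": "v2", "c.txt": "v1"}

full_backup = dict(sunday)                               # Sunday night full
inc_monday = differential_incremental(sunday, monday)    # Monday's changes only
inc_tuesday = differential_incremental(monday, tuesday)  # Tuesday's changes only
```

Each incremental here is compared against the previous day, so it stays small, but no single incremental is useful without the ones before it.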

Cumulative Incremental

Similar to the differential process, an initial full backup is created. From there, data is backed up in a “building” manner on each subsequent day (or desired period). Let’s use our weekly model again and assume we started with a full backup Sunday evening. The changes in the data on Monday would be backed up. However, on Tuesday, we would make a backup containing the data that changed on that day, plus Monday’s changes. On Wednesday, we would create another backup to include all the changes for Monday, Tuesday, and Wednesday. At the end of the week, you would then have the full backup, plus separate backups for each day from Monday through Saturday evening – each of these backups includes all the changes that came before it.
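The “building” behavior can be sketched in a few lines of Python. As an assumption for illustration, the dataset is modeled as a dictionary of file names and versions; the function name and sample files are hypothetical, and deleted files are ignored.

```python
def cumulative_incremental(full_state, current_state):
    """Return every file that differs from the last FULL backup,
    so each backup also contains the changes from earlier days.
    (Hypothetical helper; deletions are not tracked in this sketch.)"""
    return {
        path: version
        for path, version in current_state.items()
        if full_state.get(path) != version
    }

# Illustrative dataset states at the end of each day.
sunday = {"a.txt": "v1", "b.txt": "v1"}
monday = {"a.txt": "v2", "b.txt": "v1"}
tuesday = {"a.txt": "v2", "b.txt": "v2", "c.txt": "v1"}

cum_monday = cumulative_incremental(sunday, monday)    # Monday's changes
cum_tuesday = cumulative_incremental(sunday, tuesday)  # Monday's + Tuesday's changes
```

Because every comparison is made against Sunday's full backup, Tuesday's backup repeats Monday's change to a.txt, which is why these backups grow through the week.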

For both types of incremental procedures, sophisticated software using time-stamping or other methods manages the changes to ensure proper restores can be done. Because incrementals deal with less data and require less storage, they can be run more often. However, the software overhead to recompile these backups goes up over time and restores become increasingly slow: a differential chain needs the full backup plus every incremental taken since, while a cumulative chain needs the full backup plus a latest incremental that keeps growing.
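The restore side can be sketched the same way. This is a simplified Python model, assuming each backup is a dictionary of file names and versions; the `restore` helper and the sample data are hypothetical and are not taken from any particular backup product.

```python
def restore(full_backup, incrementals):
    """Rebuild a dataset by replaying incrementals, oldest first,
    on top of a copy of the full backup. (Hypothetical helper.)"""
    state = dict(full_backup)
    for incremental in incrementals:
        state.update(incremental)
    return state

full_backup = {"a.txt": "v1", "b.txt": "v1"}

# Differential chain: one small backup per day, all of them needed.
differential_chain = [
    {"a.txt": "v2"},                  # Monday
    {"b.txt": "v2", "c.txt": "v1"},   # Tuesday
]

# Cumulative: only the most recent backup is needed, but it is larger.
latest_cumulative = {"a.txt": "v2", "b.txt": "v2", "c.txt": "v1"}

from_differential = restore(full_backup, differential_chain)
from_cumulative = restore(full_backup, [latest_cumulative])

# If Tuesday's differential backup is lost, the restore stops at Monday.
after_damage = restore(full_backup, differential_chain[:1])
```

Both routes rebuild the same Tuesday dataset. The differential route has more pieces that can go missing, while the cumulative route moves more data in its last step.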

For each of the methods described above, when the retention period ends, the cycle begins again and the original data is effectively overwritten. If you want to keep full backups over a longer period of time, archiving is a good option.

Clusterlogics Evolution to Progressive Virtual Full Backups

Our original backup solution started as a process whereby we created three network-based full backups per retention period (start, middle, and end). Next, we enhanced the process and our technology to create virtual full backups to avoid congestion over the various networks. In this scenario, we ran an original network-based full (seed) backup, and then “forever incrementals” over the network. Future required “fulls” were then created directly on a backup storage drive using the last full backup plus all the incrementals. This resulted in three “fulls” per retention period (start, middle, and end), but only the original full backup was ever taken over the network. The final step in the evolution was our PVF Backup offering, which we will explain next.

How do Progressive Virtual Full Backups Work?

First, an initial full backup (seed) is created and sent over the network to storage.
Next, ongoing cumulative incremental backups are taken at selected time intervals. For example, these would be daily if we use our week-long model as the retention period. The interval frequency can be virtually anything you want. Ideal retention periods are typically a few months or less.
Once the retention point has been reached, the process changes by taking the last full backup, adding the incremental changes during the initial retention period, and then creating a new PVF Backup directly on the backup storage device.
From that point on, data that changes during the chosen interval period (daily for example) will continually be recompiled with the PVF Backup to create a new, up-to-date version of itself.
In essence, we always maintain a current copy of a recompiled full backup based on the original data. This provides a very recent and entire dataset that is accurate and extremely fast to restore.
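The steps above can be sketched as a toy Python model. This is one possible reading of the process, not the actual product implementation: the `PVFStore` class, its method names, and the choice to keep a rolling window of recent incrementals are all assumptions made for illustration, and backups are modeled as dictionaries of file names and versions.

```python
from collections import deque

class PVFStore:
    """Toy model of a progressive virtual full backup store (hypothetical)."""

    def __init__(self, seed, retention_intervals):
        self.full = dict(seed)               # the single full copy on storage
        self.retention = retention_intervals
        # Assumption: only the most recent incrementals are retained.
        self.incrementals = deque(maxlen=retention_intervals)
        self.progressive = False             # True once the retention point is reached

    def add_incremental(self, changes):
        """Record one interval's changes (daily, for example)."""
        self.incrementals.append(dict(changes))
        if self.progressive:
            # Later intervals are recompiled straight into the full backup.
            self.full.update(changes)
        elif len(self.incrementals) == self.retention:
            # Retention point reached: build the new full on the storage
            # device from the last full plus the incrementals, with no
            # second full backup sent over the network.
            for incremental in self.incrementals:
                self.full.update(incremental)
            self.progressive = True

    def restore_latest(self):
        """Return the most recent dataset the store can rebuild."""
        state = dict(self.full)
        if not self.progressive:
            for incremental in self.incrementals:
                state.update(incremental)
        return state

store = PVFStore({"a.txt": "v1", "b.txt": "v1"}, retention_intervals=2)
store.add_incremental({"a.txt": "v2"})   # before the retention point
before_retention = dict(store.full)      # still the original seed
store.add_incremental({"b.txt": "v2"})   # retention point reached
store.add_incremental({"c.txt": "v1"})   # recompiled into the full
```

In this sketch, once the retention point has passed, a restore of the latest dataset reads only the recompiled full backup, which mirrors the fast-restore behavior described above.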

We believe PVF Backups offer some of the best principles from past policies, combined with an innovative new approach. The approach ensures that the original seed backup never ages, as it’s technically re-created each day. It also conserves disk space on the backup storage device, since only one full copy is stored at a time, with incrementals making up the rest of the retention period.

Our new PVF Backup capability offers our clients and partners a way to ensure the highest level of backup and restore integrity with cost-effective ease and efficiency.