On Thursday 01 July 2010 21:46:50 Howard Thomson wrote:
> Hi Kern,
>
> On Thursday 01 July 2010, Kern Sibbald wrote:
> > Hello Howard,
> >
> > What does "chunked" backup mean exactly?  I am not sure what the high
> > level concept is here.  Bacula can already backup multi-gigabyte virtual
> > disks, so obviously you are thinking about something different.
>
> The concept that I am calling 'chunked backup' is sub-file incremental
> backup.
>
> Currently, for a 10Gb Virtualbox virtual disk, a Full-backup will backup
> the whole file.
>
> Subsequent incremental backups, where perhaps only 1Mb of the virtual-disk
> has changed, will backup the entire [10Gb] single file, because it has
> changed.
>
> Bacula currently records a hash-value for the entire file, whereas I am
> intending, in addition and for appropriately large files, to record a
> hash-value for sub-file chunks, to be able to selectively not backup those
> chunks when doing an incremental / differential backup.
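The per-chunk hashing idea quoted above can be sketched roughly as follows. This is only an illustration, not Bacula code: the names `chunk_hashes` and `changed_chunks`, the 1 MiB `CHUNK_SIZE`, and the choice of SHA-256 are all assumptions made for the example, and it works on in-memory bytes where a real implementation would stream the file.

```python
import hashlib

# Hypothetical chunk size for this sketch; not a Bacula setting.
CHUNK_SIZE = 1024 * 1024


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Return one SHA-256 hex digest per fixed-size chunk of data."""
    return [
        hashlib.sha256(data[off:off + chunk_size]).hexdigest()
        for off in range(0, len(data), chunk_size)
    ]


def changed_chunks(old_hashes, data, chunk_size=CHUNK_SIZE):
    """Map chunk index -> chunk bytes for each chunk that is new or whose
    hash differs from the hash recorded at the previous backup."""
    changed = {}
    for index, digest in enumerate(chunk_hashes(data, chunk_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            off = index * chunk_size
            changed[index] = data[off:off + chunk_size]
    return changed
```

With hashes stored at Full-backup time, an incremental run would only need to save the chunks returned by `changed_chunks`, so a 1Mb change in a 10Gb file would cost a chunk or two rather than the whole file.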
OK, now I understand.  This is a feature that we are working on -- it is
actually a form of deduplication.  Before implementing it, there are a number
of things that need to be decided and some important changes in Bacula that
need to be made.

1. By the way, I call these "deltas", that is, each one is a change to the
   originally backed up image that must be applied.  A delta differs from an
   Incremental in two ways:
   1. only a part of the file is saved.
   2. *all* the deltas must be restored (not just the most recent, as
      happens for incremental backups).

2. From the above, you can see that we need some way of marking these as
   deltas rather than incremental.  Perhaps it could simply be called a
   "delta" backup level rather than Incremental.

3. We need to decide how the "deltas" are going to be generated -- there
   needs to be something to figure out what has changed, which means, in
   general, you need access to the previous backups or some form of hashing
   done by deduplication code.

4. Determine how the deltas are going to be stored -- actually, IMO, that is
   trivial: it just needs a very small amount of code that looks much like
   the sparse file handling code -- we may even be able to use the same code.

>
> I want to use Bacula to do full + incremental backups of my own system, to
> disk, without separating out virtual-disks into separate backups, with
> different recycle criteria for space constraint reasons.
>
> Current [admittedly] simple-minded incremental backups of my file-tree are
> much larger than they need to be ...

Yes, much larger.  We have some Bacula Systems scripts that help with this
for VirtualBox, but they are not integrated with Bacula as deltas would be.
This whole subject is non-trivial.

Best regards,

Kern
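To illustrate why *all* the deltas must be restored, here is a minimal sketch, assuming each delta is simply a mapping from chunk index to the new bytes of that chunk. The function name `apply_deltas` and this delta format are hypothetical, not anything that exists in Bacula, and the sketch ignores files that shrink between backups.

```python
def apply_deltas(full_image, deltas, chunk_size):
    """Rebuild a file from a full backup plus a list of deltas.

    deltas is a list of dicts {chunk_index: chunk_bytes}, oldest first.
    Every delta is applied in order, because each one holds only the
    chunks that changed since the backup before it.
    """
    image = bytearray(full_image)
    for delta in deltas:
        for index, chunk in sorted(delta.items()):
            off = index * chunk_size
            end = off + len(chunk)
            if end > len(image):
                # The file grew since the full backup: pad with zeros first.
                image.extend(bytes(end - len(image)))
            image[off:end] = chunk
    return bytes(image)
```

For example, with a full image `b"AAAABBBB"`, a chunk size of 4, and deltas `[{1: b"CCCC"}, {0: b"DDDD"}]`, applying both deltas gives `b"DDDDCCCC"`, while applying only the most recent one gives `b"DDDDBBBB"` and silently loses the earlier change.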
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
