Hi all,

I am currently backing up several similar systems onto one volume, as most people probably do. To save time and space, I have the following suggestion:
In principle it is unnecessary to store the same content twice on the same volume. I assume that, during a restore, a volume can either be read completely or is lost completely, so there is no immediate benefit in storing the same content twice on a single volume.

It would be nice if Bacula never put the same file *content* twice on the same volume. This could happen transparently: Bacula has a database with all files stored on each volume. When a request to store a file arrives, Bacula could easily determine whether this file has already been written to the volume (by looking up the file size and MD5 sum in all database entries for this volume). It could then store something like a "backlink on the tape" together with the filename (which may differ from that of the other file with the same content) instead of the file itself.

I have the following implementation in mind: before transmitting a file, the FD could first send the path/name, size and MD5 sum. The SD could then decide whether it still needs the data or can simply store this "hardlink" instead. The SD would probably have to consult the director for the database lookup.

Perhaps this idea could be merged with the "basefiles" concept: a file would become a basefile automatically when the same content arrives a second time on the same volume. With my idea, however, there would never be a need to store the "basefiles" separately, so no new backup level and/or strategy would be needed.

Of course, restores would become more complicated: instead of working through one single physical session, multiple sessions would have to be read.

This concept would make volume handling much more flexible. A volume would never overflow because the same set of files was stored multiple times. Full backups would automatically shrink when inadvertently done twice on the same tape. Logically, each volume would still contain the same data, just in a different form.
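To make the idea a bit more concrete, here is a minimal sketch in Python of how the lookup and the "backlink" could work. It is only an illustration and not Bacula code: the `Volume` class, the record layout and the functions `store_file` and `restore_file` are made up for this example, the catalog is a plain in-memory dictionary, and the size and MD5 sum are computed from the data at hand instead of being sent ahead by the FD. It also assumes that two files with the same size and MD5 sum have the same content.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Toy stand-in for one volume plus its catalog entries (hypothetical)."""
    name: str
    records: list = field(default_factory=list)  # what is "on tape", in write order
    index: dict = field(default_factory=dict)    # (size, md5) -> record number


def store_file(volume, path, data):
    """Write the file content, or only a backlink if the volume already holds it."""
    key = (len(data), hashlib.md5(data).hexdigest())
    if key in volume.index:
        # Same size and MD5 already on this volume: store a reference only.
        record = {"path": path, "backlink": volume.index[key]}
    else:
        record = {"path": path, "data": data}
        volume.index[key] = len(volume.records)
    volume.records.append(record)
    return record


def restore_file(volume, path):
    """Return the content of the newest record for path, following a backlink."""
    for record in reversed(volume.records):
        if record["path"] == path:
            if "backlink" in record:
                # Backlinks always point at a record that carries the data.
                record = volume.records[record["backlink"]]
            return record["data"]
    raise KeyError(path)


vol = Volume("Full-0001")
store_file(vol, "/hostA/etc/services", b"same content")
store_file(vol, "/hostB/etc/services", b"same content")  # stored as a backlink
store_file(vol, "/hostB/etc/hostname", b"hostB")
```

In this sketch the second copy of the file occupies only a small reference record, and restoring `/hostB/etc/services` means reading the record that was written for `/hostA/etc/services`. That is the extra restore work mentioned above: a restore may have to read records from an earlier session on the same volume.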
Effectively this would just be an "abstraction layer" above the physical volume. Most procedures would therefore not change much, and there is no need for a new concept (like "static files" or the like).

There are some more complications, e.g. when considering spooled catalog updates while streaming to a fast tape device, but these should not be insurmountable.

What do you think about this?

Regards
--Marcel