Would it be possible to store rdiffs of the file instead of the whole file?

I mean:
In the pool/cpool directory, instead of naming the file by its md5 sum, a 
directory with that name could be created and the file would be named cur and 
rdiffs would be named 0 1 2 3 4 ....

Each time a new version is stored, the directory would of course be renamed 
to the new md5sum, and if a directory with that name already exists, the two 
could be merged (the case where you hold V0, V1 has already been backed up 
from elsewhere, and then you receive V1).
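
To make the idea a bit more concrete, here is a rough Python sketch of the 
store step. It is only an illustration, not BackupPC code: store_version and 
make_delta are hypothetical names, the pool is a flat directory keyed by a 
plain MD5 of the content (the real pool layout and hashing are different), 
and I assume the numbered rdiffs rebuild older versions from cur. The merge 
case is left out.

```python
import hashlib
import os


def md5_hex(data: bytes) -> str:
    """Plain MD5 of the whole content (a simplification of real pool hashing)."""
    return hashlib.md5(data).hexdigest()


def store_version(pool_root, old_md5, new_data, make_delta):
    """Store new_data as 'cur' and demote the previous 'cur' to a numbered rdiff.

    make_delta(new, old) is a hypothetical callable returning the bytes needed
    to rebuild `old` from `new`; any rdiff-like encoder could be plugged in.
    """
    new_md5 = md5_hex(new_data)
    new_dir = os.path.join(pool_root, new_md5)
    if os.path.isdir(new_dir):
        # Content already pooled (possibly from another host); merging the
        # two histories is not handled in this sketch.
        return new_dir
    if old_md5 is None:
        # First version of this file: just create the directory.
        os.makedirs(new_dir)
    else:
        old_dir = os.path.join(pool_root, old_md5)
        with open(os.path.join(old_dir, "cur"), "rb") as f:
            old_data = f.read()
        # Next free rdiff number: 0, 1, 2, ...
        index = sum(1 for name in os.listdir(old_dir) if name.isdigit())
        with open(os.path.join(old_dir, str(index)), "wb") as f:
            f.write(make_delta(new_data, old_data))
        # The directory always carries the md5sum of the current version.
        os.rename(old_dir, new_dir)
    with open(os.path.join(new_dir, "cur"), "wb") as f:
        f.write(new_data)
    return new_dir
```

Called twice for V0 and then V1 of the same file, this leaves a single 
directory named after the md5sum of V1, holding cur (V1) and 0 (the rdiff 
back to V0).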

At restoration time, the BackupPC_zcat process would check the header of the 
file (which is a BackupPC compressed file with its own format) and:
- if the file is marked as complete, simply restore it
- if the file is marked as rdiff, check the header for the parent file, and 
so on up the chain until a complete file is found.
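
The restore logic above could look roughly like the Python sketch below. 
Again this is only an assumption of how it might work: the Stored record, 
its kind/parent fields and the copy/data delta operations are made up for 
the example and are not the real BackupPC header or the rdiff format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Stored:
    """Toy stand-in for a pool file and its header (hypothetical fields)."""
    kind: str                           # "complete" or "rdiff"
    parent: Optional[str] = None        # key of the parent entry, for rdiffs
    data: bytes = b""                   # full content, for complete files
    ops: Optional[List[Tuple]] = None   # delta operations, for rdiffs


def apply_delta(base: bytes, ops: List[Tuple]) -> bytes:
    """Rebuild a file from its parent.

    ("copy", offset, length) reuses bytes from the parent;
    ("data", payload) inserts literal bytes.
    """
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        elif op[0] == "data":
            out += op[1]
        else:
            raise ValueError("unknown delta operation: %r" % (op[0],))
    return bytes(out)


def restore(store: Dict[str, Stored], key: str) -> bytes:
    """Follow parent links up to a complete file, then replay the rdiffs."""
    chain = []
    seen = set()
    while True:
        if key in seen:
            raise ValueError("cycle in parent chain at %r" % key)
        seen.add(key)
        entry = store[key]
        if entry.kind == "complete":
            content = entry.data
            break
        if entry.kind != "rdiff" or entry.parent is None:
            raise ValueError("bad header for %r" % key)
        chain.append(entry)
        key = entry.parent
    # Apply the deltas starting with the one closest to the complete file.
    for entry in reversed(chain):
        content = apply_delta(content, entry.ops or [])
    return content
```

The cost of a restore grows with the length of the chain, so a real 
implementation would probably want to cap the number of rdiffs per file.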

Of course this is easier to say than to code, and there are surely some 
cases that I've missed. Moreover, this feature would mainly be useful for 
PST-like files, and as BackupPCd is not yet ready (so open files cannot be 
backed up), it is not a priority. But IMHO, in the not so long term, this 
would be a wonderful feature. (Altiris has this feature.)

As BackupPCd will compute the diff before transfer, the diff file is already 
created and the file does not need to be read again to compute it. RPM has 
recently adopted this strategy with RPM diff (though IMHO that is far less 
difficult to implement).

Best Regards,

Olivier.

On Wednesday 14 September 2005 06:53, Craig Barratt wrote:
> Jean-Christophe Pinoteau writes:
> > I am a newbie on BackupPC and I am planning to use it to back up users'
> > data on several hosts.
> >
> > There is a question I couldn't find an answer for in the manual: how
> > does BackupPC deal with large files like databases? When BackupPC
> > creates a new increment, will it store the whole file or just a delta of
> > what has changed? The consideration is quite important with regard to
> > space planning and bandwidth usage.
> >
> > I was thinking to use rsync over ssh (if that makes any difference).
>
> With rsync just the changes in the file will be transferred.
> So the bandwidth requirements, after the first time, only
> need to handle the changes.
>
> But the complete file is reconstructed and stored on the
> server, even for an incremental.  So a very large file
> with just a small change requires a new complete copy
> to be stored on the server (compressed if enabled).
>
> Because of pooling, only one copy of each unique file
> needs to be stored on the server.
>
> Craig
>
>
> -------------------------------------------------------
> SF.Net email is sponsored by:
> Tame your development challenges with Apache's Geronimo App Server.
> Download it for free - -and be entered to win a 42" plasma tv or your very
> own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/backuppc-users
> http://backuppc.sourceforge.net/

--
        Olivier LAHAYE
        CRM IT lab manager
        Saclay, FRANCE

