On Thu 15 Apr 2010, Richard wrote:

> > wouldn't it be possible to touch the file to update the modification
> > time prior to running dirvish?
> I'll have to try something like that. I avoided that because I didn't
> want to update the entire file.

Eh, you will *always* have to update the entire file! I would be majorly
pissed off if rsync decided to e.g. only update the first half. Even if
only one byte of a 500GB file is changed, rsync will still have to update
the entire file. Of course, its delta algorithm will prevent it from
*transferring* the entire file.

> Speaking of updating the entire file, I found a major-major gotcha.
> When rsync copies a file without leaving the local system, it NEVER does
> a delta copy. In its infinite knowledge and wisdom it just does a
> straight copy-replace. When backing up a local filesystem to a local

That's by design and a very good design decision it is too.

To use the delta algorithm rsync needs the entire source file *and*
destination file to be read to determine what blocks need updating. Then,
while creating the new version, it will read the destination file *again*
to extract those blocks that haven't changed.

In short, there is *way* less IO load by simply doing a local copy from
the source to the destination.

Rsync is designed to decrease network traffic at the expense of more IO.
When the additional IO makes no sense (ie. there is no network traffic)
then the delta algorithm is disabled.

> encrypted dirvish vault, it is important that the vault be set to a
> remote ip address such as *cough* 127.0.0.1 so that rsync will use
> deltas. I saw a reduction from 30gig a night of data change to 6 gig a
> night of delta changes. (1/5th the size)

How did you measure this? Did you also check how the total duration of
the run was influenced? I would never contemplate forcing rsync into delta
mode by using 127.0.0.1 as the "remote" IP.


Paul

_______________________________________________
Dirvish mailing list
http://www.dirvish.org/mailman/listinfo/dirvish
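
The IO argument in the reply above can be put into rough numbers. What
follows is only a back-of-the-envelope sketch, not anything taken from
rsync itself: the function names are hypothetical, and the model assumes
exactly the reads and writes described in the message (source read once,
old destination read once for checksums, unchanged blocks read a second
time, new file written in full). It ignores the page cache, block
granularity, and CPU cost.

```python
def whole_file_io(src_size: int) -> int:
    """Disk IO in bytes for a plain local copy: read source, write destination."""
    return src_size + src_size


def delta_io(src_size: int, dst_size: int, unchanged: int) -> int:
    """Disk IO in bytes under the simplified delta model described above.

    Reads:  the whole source, the whole old destination (for checksums),
            and the unchanged part of the old destination a second time.
    Writes: the complete new file.
    """
    if not 0 <= unchanged <= min(src_size, dst_size):
        raise ValueError("unchanged must fit inside both files")
    reads = src_size + dst_size + unchanged
    writes = src_size
    return reads + writes


if __name__ == "__main__":
    GB = 10**9
    size = 500 * GB        # the 500GB file from the example above
    unchanged = size - 1   # only one byte differs (hypothetical byte-level granularity)
    print("whole-file copy:", whole_file_io(size) // GB, "GB of IO")
    print("delta copy:     ", delta_io(size, size, unchanged) // GB, "GB of IO")
```

Under these assumptions the delta path does close to twice the disk IO of
the straight copy for a nearly unchanged file, which is the trade-off the
reply describes: it only pays off when it saves network traffic.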
