jw schultz wrote:
>
> I was thinking more in terms of no block relocation at all.
> Checksums only match if at the same offset.  The receiver simply
> discards (or never gets) info about blocks that are
> unchanged.  It would just lseek and write with a possible
> truncate at the end.
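
To make the fixed-offset idea above concrete, here is a rough sketch in Python. It is not rsync code and not how rsync is implemented. The names `block_digests` and `sync_in_place`, the 8192-byte default block size, and the use of SHA-256 are all assumptions made for illustration. Both files are read locally here, whereas in a real transfer the receiver would send its per-block digests over the wire.

```python
import hashlib
import os

BLOCK_SIZE = 8192  # assumed value; ideally a multiple of the database block size


def block_digests(path, block_size=BLOCK_SIZE):
    """Return one digest per fixed-offset block of the file at `path`."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).digest())
    return digests


def sync_in_place(src_path, dst_path, block_size=BLOCK_SIZE):
    """Rewrite only the blocks of dst that differ from src at the same offset.

    Returns the number of blocks written to dst.
    """
    dst_exists = os.path.exists(dst_path)
    # Stand-in for the digest list the receiver would normally send.
    dst_digests = block_digests(dst_path, block_size) if dst_exists else []
    written = 0
    with open(src_path, "rb") as src, \
            open(dst_path, "r+b" if dst_exists else "w+b") as dst:
        index = 0
        while True:
            block = src.read(block_size)
            if not block:
                break
            changed = (index >= len(dst_digests)
                       or hashlib.sha256(block).digest() != dst_digests[index])
            if changed:
                # Same offset on both sides: seek and overwrite, no relocation.
                dst.seek(index * block_size)
                dst.write(block)
                written += 1
            index += 1
        # Drop any tail left over if dst was longer than src.
        dst.truncate(os.path.getsize(src_path))
    return written
```

Each source block is compared against exactly one destination digest, so the work is linear in file size and nothing is searched for at other offsets. The trade-off is that inserted or shifted data is never detected, which is acceptable only for block-oriented files.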
This would seem to help a lot on larger database files. Why take a 700-byte block of data from a source file and try to find a matching block by scanning block checksums at every offset in an 8G destination datafile, and then do it again for every 700 bytes? (I read the rsync technical paper, but I might be confused.)

In the case of Oracle datafiles, the only place a meaningful, syncable delta will occur is at the same offset. Yes, this is a special case, but it has the potential to really help when rsyncing Oracle datafiles during a hot backup or when syncing from a snapshot to nearstore storage.

This approach should be faster than the -W option for very large Oracle datafiles (which often have only a small number of changed blocks). It should also be faster than deleting the destination files and resending them whole (-W), as has been suggested.

> > You can imagine a smarter algorithm that does non-sequential writes to
> > the output so as to avoid writing over blocks that will be needed
> > later.  Alternatively, if you assume some amount of temporary storage,
> > then it might be possible to still produce output as a stream.
>
> I really doubt it is worthwhile doing to rsync.  This
> principally applies to block oriented files such as devices
> and database files.  For the most part rsync handles these
> fine.

Agreed. The original post still raises an interesting issue: it should not be faster to remove destination files before running rsync. That runs counter to one of the main purposes of rsync, which is to efficiently detect and send only the deltas.

eric