https://bugzilla.samba.org/show_bug.cgi?id=13735

--- Comment #3 from Sébastien Béhuret <sbehu...@gmail.com> ---
Thank you for suggesting the patches repo. An improved checksum/maybe-checksum
algorithm would be great, but it appears to require a lot of work.
Checksums are very handy for special cases (e.g. to detect and fix data
corruption) but are still relatively slow and prone to collisions, or require
specific patches as you suggested. Ideally, we want a way to enforce
the synchronization of files that are more recent on the sending side when
mtime and size are identical on both sides. This would improve the reliability
of system backup software that is based on rsync, and could be implemented as
a new option that alters the behavior of the quick-check algorithm.

Overall, rsync lacks a solid way to detect and transfer back-dated files. I
feel like the importance of dealing with back-dated files is underestimated:

In a file system, file back-dating may occur during software updates without
malicious intent and without users being aware of it. An example of file
back-dating is found in the Firefox package in Debian-based distributions. Some
JS files in the /usr/share/firefox/browser/defaults/preferences/ directory are
always dated 2010-01-01 00:00:00. When changes in these files are small (e.g. a
version string, a fixed-size series of characters such as a timestamp, hash or
key), the files end up with the same size and mtime, and the changes won’t be
detected by rsync’s quick-check algorithm. Backup software relying on rsync for
incremental updates will eventually produce incorrect backups unless it uses
the --checksum option, but this is sub-optimal (and sometimes buggy) and most
backup systems don’t even allow the user to add this option.
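
To make the failure mode concrete, here is a minimal Python sketch that
reproduces it outside of rsync. It is only an illustration: the file names,
the helper names (quick_check_differs, make_backdated, demo) and the
preference strings are made up, and the comparison is a simplified stand-in
for the quick check, not rsync's actual code.

```python
import os
import tempfile

FIXED_MTIME = 1262304000  # 2010-01-01 00:00:00 UTC, as in the Firefox example


def quick_check_differs(src, dst):
    """Simplified stand-in for the quick check: compare size and mtime only."""
    a, b = os.stat(src), os.stat(dst)
    return a.st_size != b.st_size or int(a.st_mtime) != int(b.st_mtime)


def make_backdated(path, content, mtime):
    """Write content, then force the mtime back to a fixed date."""
    with open(path, "w") as f:
        f.write(content)
    os.utime(path, (mtime, mtime))


def demo():
    """Return (content_differs, quick_check_differs) for two back-dated files."""
    with tempfile.TemporaryDirectory() as d:
        dst = os.path.join(d, "backup.js")   # hypothetical old copy
        src = os.path.join(d, "updated.js")  # hypothetical updated copy
        make_backdated(dst, 'pref("version", "1.0");\n', FIXED_MTIME)
        make_backdated(src, 'pref("version", "1.1");\n', FIXED_MTIME)
        with open(src) as f1, open(dst) as f2:
            content_differs = f1.read() != f2.read()
        return content_differs, quick_check_differs(src, dst)


if __name__ == "__main__":
    print(demo())
```

The two files differ in content, yet the size/mtime comparison reports no
difference, which is exactly the case an incremental backup would miss.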

Quick fix suggestion:

This may be a bit of an oversimplification, but assuming that the current rsync
quick-check algorithm looks like this:

synchronize(source, dest) IF [ mtime(source) != mtime(dest) OR size(source) !=
size(dest) ]

Then a new option (e.g. --use-ctime or --ignore-times-if-newer) could alter it
in the following way:

synchronize(source, dest) IF [[ ctime(source) > ctime(dest) ] OR [
mtime(source) != mtime(dest) OR size(source) != size(dest) ]]

(Notice the use of ‘greater than’ rather than ‘not equal’ to compare ctimes.)
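
The proposed rule can also be written as a small Python sketch. The FileMeta
type and both function names are hypothetical, and the logic is the same
simplification as above rather than rsync's real implementation. The quick
check is written so that a file is transferred when mtime or size differs,
which is how rsync's documentation describes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical container for the three attributes being compared."""
    size: int
    mtime: float
    ctime: float


def quick_check(src, dst):
    """Simplified quick check: transfer when mtime or size differs."""
    return src.mtime != dst.mtime or src.size != dst.size


def quick_check_with_ctime(src, dst):
    """Proposed variant: also transfer when the source ctime is newer."""
    return src.ctime > dst.ctime or quick_check(src, dst)


# Back-dated update: same size and mtime, but the source inode changed later.
old_copy = FileMeta(size=100, mtime=1262304000, ctime=1000)
updated = FileMeta(size=100, mtime=1262304000, ctime=2000)

print(quick_check(updated, old_copy))             # the plain check misses it
print(quick_check_with_ctime(updated, old_copy))  # the ctime rule catches it
```

The ‘greater than’ comparison matters because ctime cannot be set by the
receiver: once a file has been written on the destination, its ctime there is
the time of the transfer, which (assuming reasonably synchronized clocks) is
later than the source ctime. A ‘not equal’ test would therefore re-transfer
every file on every run.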

This would do the trick and ensure that files that were back-dated are properly
detected and synchronized during incremental updates. I think that such an
option is a must-have for reliable backup software, and could even be enabled
by default since atime updates do not alter ctime.

