https://bugzilla.samba.org/show_bug.cgi?id=12570
Bug ID: 12570
Summary: Problems with --checksum --existing
Product: rsync
Version: 3.1.1
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P5
Component: core
Assignee: way...@samba.org
Reporter: a...@smasher.org
QA Contact: rsync...@samba.org

Problem:
I've got an SD card with some movies, a few of which are corrupted. I want to copy only the files that don't match the good copies.

Command:
    rsync --checksum --existing -vhriP /movies/ /media/128-SD/Movies/

The problem here is that *all* files in "/movies/" are hashed before anything else happens. This can be verified with lsof: "lsof +D /movies".

I've got <100GB in "/media/128-SD/Movies/" and >1.5TB in "/movies/", so hashing all of the files in "/movies/" is a huge waste of time and system resources.

When "--existing" and "--checksum" are both used, the algorithm should first make a list of candidate files, then start hashing. It should *not* start hashing everything on the sending side and only then figure out which files might be needed.

Workaround for me:
    diff -r /movies/ /media/128-SD/Movies/ | grep differ | awk '{print "pv " $3" > "$5}' | sh

N.B. That workaround requires "pv" and only works with file names that do not contain spaces, but for me it's a quick and easy way to see progress while files are being copied. "cp" would work fine in place of "pv".

On my system, that workaround saved me about 1-2 days of hashing and completed in less than an hour.
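The candidate-first ordering requested above can be sketched as a small POSIX shell function. This is only an illustration under stated assumptions, not how rsync works internally: `sync_existing` is a hypothetical name, and a byte-for-byte comparison with `cmp` stands in for checksums. Unlike the `diff | awk` workaround, it passes file names as arguments, so names containing spaces should be handled.

```shell
# Hypothetical helper (not part of rsync): refresh files that already exist
# under DST from SRC, looking only at those candidates.
# Usage: sync_existing SRC_DIR DST_DIR
sync_existing() {
    # Strip one trailing slash so joined paths stay tidy.
    # Note: plain POSIX sh has no "local", so these are global variables.
    src=${1%/}
    dst=${2%/}
    # Walk only the (small) destination tree. -exec hands names over as
    # arguments, so spaces and other odd characters survive intact.
    find "$dst" -type f -exec sh -c '
        src=$1; dst=$2; shift 2
        for target in "$@"; do
            # Path of this file relative to the destination root.
            rel=${target#"$dst"/}
            # Mirror --existing: ignore files the source does not have.
            [ -f "$src/$rel" ] || continue
            # cmp -s is silent and returns non-zero when contents differ.
            if ! cmp -s "$src/$rel" "$target"; then
                printf "copying %s\n" "$rel"
                cp "$src/$rel" "$target"
            fi
        done
    ' sh "$src" "$dst" {} +
}
```

With the paths from the report, this would be invoked as `sync_existing /movies /media/128-SD/Movies`. Only files present on the destination are ever read, so the work is bounded by the size of the SD card's contents rather than by the whole of "/movies/".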