Linda Walsh wrote:
> Bob Proulx wrote:
> > Meanwhile...  I would be one of those suggesting that perhaps you
> > should try using rsync instead of cp.  The cp command is lean and
> > mean by comparison to rsync (and should stay that way).  But rsync
> > has many attractive features for doing large copies.
>
> ---- fwiw...---
> Like large execution times... from the latest snapshot on my system --
> I use rsync to only move differences between yesterday and "today[whenever
> new snap is taken]"... it was a larger than normal snap -- most only
> take 75-90 minutes...but rsync (these are the script messages) with some
> debugging output still turned on... even an rm over the resulting diff
> took 101 seconds... then cp comes along.. even w/a sync it would
> still be under a minute.

Wow.  Just to be clear, an rsync copy took 75 to 90 minutes but a cp
copy took less than 1 minute?  I find that very suspicious.  I never
see that much difference between them.  Are you sure the difference
wasn't that the data was cached into RAM by the rsync, so that the
second run with cp just ran with a warmed-up cache?  With a large data
set and a lot of RAM that is plausible.

> I.e. rsync copied just the diffs to "/home.diff", then
> find with "-empty -delete" is used to get rid of empty dirs (rsync
> creates many of these). then a static partition is created to hold
> the "diff" output -- and cp took walked and copied the tree in 12s.
> (output wasn't flushed, but it's not that long.. <a minute...).

It appears that you are using features of rsync that do not exist in
cp.  Therefore the two tasks are not doing equivalent work.  In that
case it is probably quite reasonable for rsync to be slower than cp.

Also consider that if cp were to acquire all of the enhancements that
have been requested for it as time has gone by then cp would be just
as featureful (bloated!) as rsync and likely just as slow as rsync
too.  This is something to consider every time someone asks for a
creeping feature in cp, especially if they say they want the feature
in cp because cp is faster than rsync.  The natural progression is
that cp would become rsync.

> If rsync wasn't so slow at local I/O...*sigh*....

The advantage of rsync is that it can be interrupted and restarted and
the restarted process will efficiently avoid doing work that is
already done.  An interrupted and restarted cp will perform the same
work again from start to finish.

If I am doing a simple copy from A to B then I use 'cp -av A B'.  If I
am doing it the second time then I will use 'rsync -av A B' to avoid
repeating previously done work.  If I want progress indication...  If
I want placement of backup files in a particular directory...
If I want other fancy features that are provided by rsync then it is
worth it to use rsync.

  $ du -s coreutils
  238920  coreutils

  $ find coreutils -type f | wc -l
  15013

  $ rm -rf junk/coreutils
  # echo 3 > /proc/sys/vm/drop_caches
  $ time cp -a coreutils junk/
  real    1m2.137s
  user    0m0.140s
  sys     0m1.724s

  $ rm -rf junk/coreutils
  $ time cp -a coreutils junk/
  real    0m2.492s
  user    0m0.060s
  sys     0m1.064s

  $ rm -rf junk/coreutils
  # echo 3 > /proc/sys/vm/drop_caches
  $ time rsync -a coreutils junk/
  real    1m5.473s
  user    0m1.280s
  sys     0m2.112s

  $ rm -rf junk/coreutils
  $ time rsync -a coreutils junk/
  real    0m3.215s
  user    0m1.184s
  sys     0m1.536s

For normal use cp is a little faster than rsync.  Or rather rsync is a
little slower than cp.  But not enough to make a difference for
typical operations.  Having the file system cache warmed up makes a
*HUGE* difference.  Much larger than any other difference.

For copies that take hours to run I am probably going to value the
restart ability more than raw speed.  YMMV.

Bob
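
P.S. For anyone who wants to repeat the cold versus warm cache
comparison above on their own tree, here is a minimal sketch.  The
helper names timed_copy and drop_caches are made up for illustration
and are not part of coreutils or rsync.  It assumes Linux (for
/proc/sys/vm/drop_caches, which needs root) and a date command that
understands +%s, so it only resolves whole seconds.

```shell
#!/bin/sh
# Hypothetical helpers for comparing copy commands with a cold or warm cache.

# timed_copy LABEL SRC DEST_PARENT COMMAND [ARGS...]
# Removes any previous copy, runs COMMAND ARGS SRC DEST_PARENT/ and
# prints the elapsed wall-clock time in whole seconds.
timed_copy() {
    label=$1
    src=$2
    dest_parent=$3
    shift 3
    # Remove the previous copy so that every run does the full work.
    rm -rf "$dest_parent/$(basename "$src")"
    start=$(date +%s)   # assumption: date supports +%s (GNU and BSD do)
    "$@" "$src" "$dest_parent/"
    end=$(date +%s)
    echo "$label: $((end - start))s"
}

# drop_caches: flush dirty pages, then ask the kernel to drop the page
# cache, dentries and inodes.  Linux only, must be run as root.
drop_caches() {
    sync
    echo 3 > /proc/sys/vm/drop_caches
}
```

A cold run would then be 'drop_caches; timed_copy "cp cold" coreutils
junk cp -a', and calling timed_copy again straight afterwards gives the
warm number.  Swap in 'rsync -a' for 'cp -a' to compare.  Give the
source without a trailing slash so that cp and rsync both create
junk/coreutils.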