Matt McCutchen wrote:
On Fri, 2006-02-24 at 18:40 -0500, Linus Hicks wrote:
A few months back I did something similar to what lsk is doing, I believe with
rsync 2.6.5. I wrote a script to query the database for all the datafiles and
rsync'ed them individually by specifying the full path to each file. What I found
was that unless I used --no-whole-file, rsync operated in whole-file mode. I
was not doing local transfers, so is there some other condition that causes it
to default to whole-file mode?
Not that I know of. But according to the OLDNEWS file in the
distribution, a bug causing whole-file mode to be the default even for
remote transfers was fixed between 2.5.4 and 2.5.5. Is it possible that
the rsync on one or both ends was 2.5.4 or older?
I re-checked the version of rsync on several systems and they all say 2.6.5, and
I'm pretty sure the SAs have not upgraded it.
(For reference: rsync considers a transfer between two paths in a
computer's filesystem local even if NFS or a similar network filesystem
implements one or both ends. This makes sense because limiting "disk"
I/O (really network filesystem I/O) is more important than limiting
network I/O (the fast loopback interface).)
I have used rsync with an NFS mount before and noticed the difference. The case
I am talking about was not a local transfer.
The case where --inplace is not used, and rsync atomically moves a temporary
file over the original, is complicated by --temp-dir. lsk has not raised the
issue of not having enough room for a second copy of any of his datafiles, so
he probably isn't using --temp-dir. However, the statement you made earlier in
this thread (quoted below) needs to be extended to account for the case where
the --temp-dir resides on a different partition:
"Not exactly: if --inplace is not used, rsync will write a temporary file
and atomically move it over the original. --inplace uses less disk
space but does not provide atomicity and, according to the man page,
reduces the efficiency of the incremental transfer algorithm."
The behavior of rsync with a temp dir on a different partition changed
in 2.6.7. See this request for enhancement:
https://bugzilla.samba.org/show_bug.cgi?id=3461
The man page of CVS rsync 2.6.7 now has a detailed discussion of the
issue. You can read the man page here:
http://cvs.samba.org/cgi-bin/cvsweb/rsync/rsync.yo
Cool!
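
To illustrate why the partition matters, here is a minimal Python sketch of
the general "write a temporary file, then rename it over the original"
pattern. It is not rsync's own code, and the names atomic_write and
same_filesystem are made up for this example. The point it shows is that a
rename can only be atomic when the temporary file and the destination live on
the same filesystem.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """Return True if both existing paths live on the same device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def atomic_write(dest_path, data, temp_dir=None):
    """Write data to a temporary file, then rename it over dest_path.

    Hypothetical helper for illustration only. By default the temporary
    file is created next to the destination so the final rename stays on
    one filesystem.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=temp_dir or dest_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Atomic on POSIX when both paths are on the same filesystem.
        # If temp_dir is on another partition this call typically fails
        # with OSError, and a copy would be needed instead, which is no
        # longer atomic.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With temp_dir left unset, readers of dest_path see either the old contents or
the new contents, never a partial file. Passing a temp_dir on another
partition gives up that guarantee, which is the case the statement quoted
above does not cover.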
And a performance question: would it be faster to pass the complete list of
datafiles to rsync in one fell swoop, for instance using --files-from rather
than running rsync individually on each one?
It would be somewhat faster to pass the entire list because you incur
the overhead of setting up the rsync process triangle once, not for
every file. Furthermore, the rsync protocol is pipelined. If you have
a network with high bandwidth but considerable latency, calling rsync
once will take advantage of the pipelining while calling it for each
file will wait for several network round trips per file.
Thanks Matt. I thought that might be the case.
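
For anyone scripting this, the two approaches compared above might look
roughly like the following Python sketch. The function names, the host name
"backup", and the example paths are all hypothetical; the only rsync options
used are -a, --no-whole-file, and --files-from. Note that --files-from implies
--relative, so the batch form preserves the listed paths under the
destination, while the per-file form copies each file by its base name.

```python
import subprocess
import tempfile


def build_per_file_commands(paths, dest):
    """One rsync command per file: setup cost and round trips repeat each time."""
    return [["rsync", "-a", "--no-whole-file", p, dest] for p in paths]


def build_batch_command(list_path, src_root, dest):
    """A single rsync command that reads all file names from list_path.

    Names in the list are interpreted relative to src_root.
    """
    return ["rsync", "-a", "--no-whole-file",
            "--files-from=" + list_path, src_root, dest]


def sync_batch(paths, src_root, dest):
    """Write the file list to a temporary file and run rsync once (assumes rsync is installed)."""
    with tempfile.NamedTemporaryFile("w", suffix=".lst") as listing:
        listing.write("\n".join(paths) + "\n")
        listing.flush()
        subprocess.run(build_batch_command(listing.name, src_root, dest),
                       check=True)


if __name__ == "__main__":
    # Hypothetical datafile paths and destination host, for illustration only.
    datafiles = ["u01/oradata/system01.dbf", "u01/oradata/users01.dbf"]
    print(build_batch_command("/tmp/datafiles.lst", "/", "backup:/"))
    for cmd in build_per_file_commands(["/" + p for p in datafiles],
                                       "backup:/u01/oradata/"):
        print(cmd)
```

The batch form starts the rsync processes once and lets the pipelined
protocol overlap work across files, which is where the saving on a
high-latency link comes from.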