> From: [email protected] [mailto:discuss-
> [email protected]] On Behalf Of Steve Harris
> 
> 1) Using a tar pipeline will (should) always be slower than a single
> process (e.g., cp, cpio -p, rsync), because of the overhead of the two
> processes and the system buffering for the pipe.

Depends on the bottleneck.  If you have very fast IO, then the overhead of the 
extra RAM-to-RAM copy through the pipe before hitting the IO could actually 
hurt performance.  But that's a rather unusual situation.  Usually your RAM is 
so much faster than the IO that it doesn't matter.

Which is slower: being stuck behind a granny using a walker in your Toyota, or 
being stuck behind a granny using a walker in your Maserati?
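
To make point 1 concrete, here's the tar pipeline next to its single-process 
equivalent (the /tmp paths are just for illustration):

```shell
# Two processes plus a pipe buffer: data is copied RAM-to-RAM between them.
mkdir -p /tmp/demo-src /tmp/demo-dst
echo "hello" > /tmp/demo-src/file.txt
tar -C /tmp/demo-src -cf - . | tar -C /tmp/demo-dst -xf -

# Single-process equivalent, preserving permissions and timestamps:
# cp -a /tmp/demo-src/. /tmp/demo-dst/
```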


> 2) Copying to an NFS-mounted filesystem is likely to be less efficient than
> alternatives (e.g., rsync) because of the NFS overhead -- it looks like a
> local filesystem but in fact there is a lot of network processing happening
> behind the scene.

Ah - Agreed - but there's a distinction to make here: "rsync" the application 
versus "rsync" the protocol.  You can use the rsync application to copy from 
local disk to an NFS mount, and obviously you incur the NFS overhead.  This is 
the situation the OP has been discussing.  It will work fine, but if you want 
to optimize performance ...  You're right, you can run rsync in daemon mode on 
the receiving system and use the rsync:// protocol.  In this mode, the rsync 
client and server are able to significantly reduce the amount of data that 
needs to cross the wire: each system checks file statistics, checksums, etc, 
locally, and only the smallest relevant data is sent.  With rsync from a local 
fs to a locally mounted remote fs, the rsync application has to perform all 
those operations itself, *across* the wire.  I honestly *do* believe that 
using the rsync protocol, with the rsync daemon enabled on the receiving 
system, will be faster than the NFS option, assuming the network is your 
rate-limiting factor.
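
For reference, the two modes look like this (hostnames, module name, and paths 
are hypothetical; the rsyncd.conf lines are a minimal illustrative fragment, 
not a complete config):

```shell
# Mode 1: rsync application writing to an NFS mount.
# Every stat/compare the delta algorithm needs goes across the wire via NFS.
rsync -a /data/ /mnt/nfs/backup/

# Mode 2: rsync protocol against a daemon on the receiving host.
# Minimal /etc/rsyncd.conf on the receiver (illustrative):
#   [backup]
#       path = /srv/backup
#       read only = false
# Each end checks its own files locally; only deltas cross the wire:
rsync -a /data/ rsync://backuphost/backup/
```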


> 3) I'm not an expert on rsync, but wasn't it (initially) written in a
> client-server mode to achieve very high efficiency copying files over a
> network?  Especially when updating (large) files which may have changed
> slightly.

"High efficiency" is a relative term.  Rsync does a good job of skipping 
files that don't need to be sent, and of only sending the chunks of files that 
have changed, but in order to do that it needs to crawl the local and remote 
filesystems (if using daemon mode, the remote daemon still needs to crawl its 
whole filesystem locally), do a bunch of comparisons between local and remote, 
and search for and calculate all those differences.  This is NOT comparable to 
the performance of ZFS or BTRFS incremental sends, which already know which 
blocks changed.  But in the absence of a COW (or equivalent) underlying 
filesystem, rsync is the fastest thing that I know of.
_______________________________________________
Discuss mailing list
[email protected]
http://lists.blu.org/mailman/listinfo/discuss
