Re: rsync error on large directories?

John Summerfield Thu, 30 Aug 2007 18:01:43 -0700

Nathan Moore wrote:

Hi,


I've been using rsync as a primitive backup tool on a small cluster of SL45
and SL5 machines.  Lately, there is an intermittent error when I run rsync
to back up a large (20GB) directory of mixed file types.  The error isn't a
loud failure, but rather just that the file transfer stalls and the node the
files are being copied from locks up (the lockup is complete - the "server"
node is unavailable via NIS, ssh, or console login - it has to be
power-cycled)

Is there a known bug in rsync?  Is there a way to trouble-shoot my "server"
machine?

Volume of data isn't the only measure of "large" - the number of files is important too.

Some time ago (Debian Woody + RHL 7.3) I had a problem with rsync timing out when backing up (most of) my Woody filesystem over ADSL.

I took up the issue on the rsync list, where the folk were very helpful. The thread was "reliability and robustness problems" about Oct 04.

By default, rsync does not time out, so one really needs to specify a timeout value.
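
A minimal sketch of what that looks like, assuming rsync's --timeout option (an I/O timeout in seconds). The function name, the paths and the 600-second default are placeholders of mine, not values from the original setup:

```shell
# Hypothetical wrapper: run rsync in archive mode with an I/O timeout.
# $1 = source, $2 = destination, $3 = timeout in seconds (default 600).
backup_with_timeout() {
    src=$1
    dest=$2
    secs=${3:-600}
    # --timeout makes rsync exit if no data moves for $secs seconds
    rsync -a --timeout="$secs" "$src" "$dest"
}
```

It would be called as something like `backup_with_timeout /data/ backuphost:/backup/data/ 900`. Too small a value may trip while rsync is still building its file lists, so it probably needs tuning per job.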


Then I found it timing out too readily.

It also used an enormous amount of RAM: it's the only program I know that can cause Linux to use swap (many times real RAM) and not cause thrashing.

As best I can figure it, rsync was building a filtered view of the target files area and (no doubt) the source files area, and neither side talks to the other while this is happening. I think this was taking an hour or so, but this _was_ a few years ago.

The rsync gurus opined that it was better to back up this way than to back up a single file, but my experience suggests otherwise; I now create a filtered filesystem image and use rsync to update that.

While rsync is building its lists of what to transfer, systems at both ends can get rather busy, particularly if something else is running interference on the use of RAM.

This, of course, can cause a bit of distress to both computers, but if they really are locked up as opposed to being seriously overtaxed, then you have either a kernel bug or a hardware problem. Nothing rsync can do should cause the system to actually lock up.

I think I would start by directing syslog (kernel messages at least) to another box, or to a printer on the parallel port. Look for signs of the OOM killer at work.
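
As far as I know, with the stock syslogd a line such as "kern.*  @loghost" in /etc/syslog.conf forwards kernel messages to another machine (loghost is a hypothetical name here, and the receiving syslogd has to be started with -r to accept them). Once the messages are landing somewhere safe, a rough way to look for the OOM killer is to grep for it. The patterns below are my assumption about typical kernel wording and may need adjusting for your kernel version:

```shell
# Print log lines that look like OOM-killer activity.
# $1 = log file to search, e.g. /var/log/messages on the receiving box.
# The patterns are assumptions about typical kernel wording.
find_oom_messages() {
    grep -i -E 'out of memory|oom-killer' "$1"
}
```

Run it as `find_oom_messages /var/log/messages` after the next lockup. No output only means that nothing matching those patterns was logged.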


You might also do something as crude as adapting and running this:
 # log the currently running processes once a minute
 while :
  do
    ps xar | logger -i
    sleep 60
  done
while making sure the logged messages go to Somewhere Else.



--

Cheers
John

-- spambait
[EMAIL PROTECTED]  [EMAIL PROTECTED]

Please do not reply off-list
