Re: problems encountered in 2.4.6
I've had rsync hangs when transferring huge filesystems (~80Gb) over the network, but since I suppressed the -v option from my command line there are no more hangs. The -v option under 2.4.6 is buggy; try multiplying the v's and the hangs will increase too.

( rsync -axWP --rsync-path=/usr/local/bin/rsync --stat --delete source target )

David Bolen wrote:

> [EMAIL PROTECTED] [[EMAIL PROTECTED]] writes:
>
> > Actually, the lack of -W isn't helping me at all. The reason is that
> > even for the stuff I do over the network, 99% of it is compressed with
> > gzip or bzip2. If the files change, the originals were changed and a
> > new compression is made, and usually most of the file is different.
>
> Just to clarify, when you say "over the network" you mean in true
> client/server rsync (or across an rsh/ssh stream) and not just using
> one rsync with references using network mount points, right? In the
> latter case, not having -W is hurting you, never helping.
>
> But yes, any format (e.g., encryption, compression) that effectively
> distributes changes randomly over a file is going to be a killer for
> rsync.
>
> For the case of gzip'd files when a client and server rsync are in
> use, you may want to look back through the archives of this list -
> there was a reference to a patch for the gzip sources that created
> rsync-friendly gzip's. Not as great as the non-gzip'd version, but
> far better than normal gzip.
>
> Ah yes - here was the URL:
>
> http://antarctica.penguincomputing.com/~netfilter/diary/gzip.rsync.patch2
>
> At the time when I tried it (1/2001), here were some test results:
>
> For comparison, here's a database file (delta between one day and the
> next), both uncompressed and gzip'd (normal and -9). For the
> uncompressed I also transferred with a fixed 1K blocksize since I know
> that's the page size for the database - the others are default
> computations (I tried the 1K with the gzip'd version but it was
> worse, as expected).
>
>               Normal    Normal+1K  gzip      gzip-9
> Size          54206464  54206464   21867539  21845091
> Wrote         2902182   1011490    3169864   3214740
> Read          60176     317648     60350     60290
> Total         2962358   1329138    3230214   3275030
>
> Speedup       18.30     40.78      6.77      6.67
> Compression   1.00      1.00       2.479     2.481
> Normalized    18.30     40.78      16.78     16.54
>
> And in terms of size:
>
> As Rusty's page comments, they are slightly larger, but not
> tremendously so. In my one case:
>
> Normal gzip:          21627629
> gzip --rsyncable:     21867539
> gzip -9 --rsyncable:  21845091
>
> So about a 1-1.1% hit in compressed size.
>
> Personally, here we end up just leaving the major stuff we transfer
> uncompressed - as we're using slow analog lines, the cost recovery was
> easily worth the cost in disk space, particularly in cases like our
> databases where knowledge of the page size and method of change goes a
> long way.
>
> > It definitely helped for transferring ISO images where the whole image
> > would be changed if some files changed. I set the chunk size to 2048
> > for that. Why it defaults to 700 seems odd to me.
>
> Not sure - perhaps some early empirical work. When I'm moving files
> that I know something about I definitely control the block size
> myself, so for example, when moving databases with a 1K page size, I
> always use a multiple of that (since I know a priori that's how the
> database "dirties" the file), and then I scale that up a bit based on
> database size, to get a reasonable tradeoff between block overhead and
> extra transfer upon a change detection.
>
> -- David

--
@ Remi LAPORTE                   @
@ TEXAS INSTRUMENTS UNIX SUPPORT @
@ [EMAIL PROTECTED]              @
RE: problems encountered in 2.4.6
> I haven't used the --bwlimit option and don't really know how it works.
> I remember when somebody contributed it that I was skeptical about how
> well it could work. I'm especially not surprised that it has no impact
> on local-to-local transfers.

I have used --bwlimit and it works a treat. It is in a specialised situation where I am using 20 simultaneous rsyncs (don't ask...) where each rsync is limited to 128 Kbyte/sec. The 30Mbit/sec link I am sending over sits exactly at 20Mbit/sec, which is what I wanted.

Cheers
Mark
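Mark's figure is right where the arithmetic puts it: 20 streams capped at 128 Kbyte/sec each comes to just under 21 Mbit/sec. A quick sanity-check sketch (this assumes --bwlimit counts kilobytes as 1024 bytes, which matches the observed rate):

```python
# Back-of-the-envelope check of the 20-rsync setup described above.
streams = 20
kbytes_per_sec = 128                       # per-stream --bwlimit cap
bits_per_sec = streams * kbytes_per_sec * 1024 * 8
print(round(bits_per_sec / 1e6, 2))        # -> 20.97 (Mbit/sec)
```

So a link "sitting at 20Mbit/sec" is exactly what 20 such streams should produce once protocol overhead is ignored.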
Re: problems encountered in 2.4.6
On Tue, May 29, 2001 at 12:02:41PM -0500, Phil Howard wrote:
> Dave Dykstra wrote:
>
> > On Fri, May 25, 2001 at 02:19:59PM -0500, Dave Dykstra wrote:
> > ...
> > > Use the -W option to disable the rsync algorithm. We really ought to make
> > > that the default when both the source and destination are local.
> >
> > I went ahead and submitted a change to the rsync CVS to automatically turn
> > on -W when the source and destination are both on the local machine.
>
> So how do I revert that on the command line?
>
> I've been trying with -W doing my disk to disk backups, and I've had
> to go back to not using -W. Will -c do that?

There's currently no way to revert it. I thought it wouldn't be necessary, and I'm not sure how to do it cleanly, because there's currently no precedent in rsync for a general undoing of options that have different defaults depending on the situation. Another one that comes to mind is --block-io.

The latest rsync in CVS is now using the "popt" package to process options instead of getopt. Does anybody know if that package has a standard way to negate options, for example prefixing a "no" (like --no-block-io) or something like that? I took a quick look through the man page and it wasn't obvious.

> The reason is the load
> on the machine gets so high, nothing else can run. This is not CPU
> load, but rather, buffering/swapping load. CPU load just slows other
> things down. But buffering/swapping load brings other things to a
> grinding halt. I suspect Linux's tendency to want to keep everything
> that anything writes in RAM, even if that means swapping out all other
> processes, is impacted by this. So I'll need a way to not have the
> effect of -W to use rsync for disk to disk backups.

Wow. Rsync is just going too fast for it I guess. The -W makes it do a lot of unnecessary disk I/O which must be enough to throttle its progress. Sure seems like leaving out -W is the wrong solution.
Maybe -W has to turn off more of rsync's pipelining since it is no longer performing the rsync algorithm.

> The fact that rsync loads so much into VM probably makes the problem
> a bit worse in this case. I saw 1 process at 35M and 2 processes at
> 70M (total 175M used by rsync, in addition to all the buffered writes).

Does -W have an impact on that? I would think that if anything -W would lessen that effect.

> I'm wondering if rsync is even a good choice for disk to disk backup
> duty. Is there some option I missed that disables pre-loading all
> the file names into memory?

Maybe it isn't. There is no such option.

> I also tried the --bwlimit option and it had no effect, not even on
> the usual download synchronizing over a dialup that I do. I could
> not get it to pace the rate below the dialup speed no matter what
> I would specify.

I haven't used the --bwlimit option and don't really know how it works. I remember when somebody contributed it that I was skeptical about how well it could work. I'm especially not surprised that it has no impact on local-to-local transfers.

- Dave Dykstra
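For what it's worth, the "--no-" negation Dave is asking popt about later became a standard idiom in other option parsers. Purely as an illustration of the concept (this is Python's argparse standing in for popt, not rsync's actual option handling), a flag whose default depends on the situation can be made negatable like so:

```python
import argparse

# A hypothetical --whole-file flag: default None means "decide later
# from context" (e.g. turn it on only for local-to-local transfers),
# and the user can force it off with --no-whole-file.
parser = argparse.ArgumentParser()
parser.add_argument("--whole-file",
                    action=argparse.BooleanOptionalAction,
                    default=None)

print(parser.parse_args(["--whole-file"]).whole_file)     # True
print(parser.parse_args(["--no-whole-file"]).whole_file)  # False
print(parser.parse_args([]).whole_file)                   # None
```

(BooleanOptionalAction requires Python 3.9+.) The three-valued default is the key trick: the parser records only what the user explicitly asked for, and the program applies the situational default afterwards.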
Re: problems encountered in 2.4.6
Dave Dykstra wrote:
> On Fri, May 25, 2001 at 02:19:59PM -0500, Dave Dykstra wrote:
> ...
> > Use the -W option to disable the rsync algorithm. We really ought to make
> > that the default when both the source and destination are local.
>
> I went ahead and submitted a change to the rsync CVS to automatically turn
> on -W when the source and destination are both on the local machine.

So how do I revert that on the command line?

I've been trying with -W doing my disk to disk backups, and I've had to go back to not using -W. Will -c do that?

The reason is the load on the machine gets so high, nothing else can run. This is not CPU load, but rather, buffering/swapping load. CPU load just slows other things down. But buffering/swapping load brings other things to a grinding halt. I suspect Linux's tendency to want to keep everything that anything writes in RAM, even if that means swapping out all other processes, is impacted by this. So I'll need a way to not have the effect of -W to use rsync for disk to disk backups.

The fact that rsync loads so much into VM probably makes the problem a bit worse in this case. I saw 1 process at 35M and 2 processes at 70M (total 175M used by rsync, in addition to all the buffered writes).

I'm wondering if rsync is even a good choice for disk to disk backup duty. Is there some option I missed that disables pre-loading all the file names into memory?

I also tried the --bwlimit option and it had no effect, not even on the usual download synchronizing over a dialup that I do. I could not get it to pace the rate below the dialup speed no matter what I would specify.

--
| Phil Howard - KA9WGN | Dallas     | http://linuxhomepage.com/ |
| [EMAIL PROTECTED]    | Texas, USA | http://phil.ipal.org/     |
Re: problems encountered in 2.4.6
On Fri, May 25, 2001 at 02:19:59PM -0500, Dave Dykstra wrote:
...
> Use the -W option to disable the rsync algorithm. We really ought to make
> that the default when both the source and destination are local.

I went ahead and submitted a change to the rsync CVS to automatically turn on -W when the source and destination are both on the local machine.

- Dave Dykstra
rsync 3 (was Re: problems encountered in 2.4.6)
> There is a feature I would like, and I notice that even with -c this
> does not happen, but I think it could based on the way rsync works.
> What I'd like to have is when a whole file is moved from one directory
> to another, rsync would detect a new file with the same checksum as an
> existing (potentially to be deleted) file, and copy, move, or link, as
> appropriate. In theory this should apply to anything anywhere in the
> whole file tree being processed.

See the note I posted on May 17th; the title is "Storing updates" and it includes a Tcl script I run on rsync -n output to spot obvious renames and gzip'ings of files, and take evasive action. It would be excellent if rsync could do this sort of thing for me. The basic principle is that if you are using --delete, then when a file is missing a good place to look is in the list of deletions.

I spoke to Rusty Russell last November when he was visiting Dublin and he mentioned there had been some thinking about an "rsync 3". One feature being considered was allowing users to supply arbitrary rules for what to do when a file is missing, based on file suffix etc. Did anyone follow up these ideas?

John
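The deletion-list idea is easy to prototype outside rsync. This is only a sketch of the technique (the helper names are made up, and MD5 here is used purely for content matching): pair each newly appeared file with a same-content file that --delete is about to remove, so a local move or link could replace a re-transfer.

```python
import hashlib

def file_digest(path, bufsize=65536):
    """Whole-file checksum, used only to match identical content."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.digest()

def recover_renames(new_paths, doomed_paths):
    """Map each new file to a to-be-deleted file with identical
    content; each hit is a probable rename."""
    by_digest = {file_digest(p): p for p in doomed_paths}
    renames = {}
    for p in new_paths:
        d = file_digest(p)
        if d in by_digest:
            renames[p] = by_digest[d]
    return renames
```

A wrapper script could feed this from rsync -n output (new files on one side, deletions on the other) and issue mv/ln commands before running the real transfer.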
RE: problems encountered in 2.4.6
[EMAIL PROTECTED] [[EMAIL PROTECTED]] writes:

> Actually, the lack of -W isn't helping me at all. The reason is that
> even for the stuff I do over the network, 99% of it is compressed with
> gzip or bzip2. If the files change, the originals were changed and a
> new compression is made, and usually most of the file is different.

Just to clarify, when you say "over the network" you mean in true client/server rsync (or across an rsh/ssh stream) and not just using one rsync with references using network mount points, right? In the latter case, not having -W is hurting you, never helping.

But yes, any format (e.g., encryption, compression) that effectively distributes changes randomly over a file is going to be a killer for rsync.

For the case of gzip'd files when a client and server rsync are in use, you may want to look back through the archives of this list - there was a reference to a patch for the gzip sources that created rsync-friendly gzip's. Not as great as the non-gzip'd version, but far better than normal gzip.

Ah yes - here was the URL:

http://antarctica.penguincomputing.com/~netfilter/diary/gzip.rsync.patch2

At the time when I tried it (1/2001), here were some test results:

For comparison, here's a database file (delta between one day and the next), both uncompressed and gzip'd (normal and -9). For the uncompressed I also transferred with a fixed 1K blocksize since I know that's the page size for the database - the others are default computations (I tried the 1K with the gzip'd version but it was worse, as expected).

              Normal    Normal+1K  gzip      gzip-9
Size          54206464  54206464   21867539  21845091
Wrote         2902182   1011490    3169864   3214740
Read          60176     317648     60350     60290
Total         2962358   1329138    3230214   3275030

Speedup       18.30     40.78      6.77      6.67
Compression   1.00      1.00       2.479     2.481
Normalized    18.30     40.78      16.78     16.54

And in terms of size:

As Rusty's page comments, they are slightly larger, but not tremendously so.
In my one case:

Normal gzip:          21627629
gzip --rsyncable:     21867539
gzip -9 --rsyncable:  21845091

So about a 1-1.1% hit in compressed size.

Personally, here we end up just leaving the major stuff we transfer uncompressed - as we're using slow analog lines, the cost recovery was easily worth the cost in disk space, particularly in cases like our databases where knowledge of the page size and method of change goes a long way.

> It definitely helped for transferring ISO images where the whole image
> would be changed if some files changed. I set the chunk size to 2048
> for that. Why it defaults to 700 seems odd to me.

Not sure - perhaps some early empirical work. When I'm moving files that I know something about I definitely control the block size myself, so for example, when moving databases with a 1K page size, I always use a multiple of that (since I know a priori that's how the database "dirties" the file), and then I scale that up a bit based on database size, to get a reasonable tradeoff between block overhead and extra transfer upon a change detection.

-- David

 David Bolen | FitLinxx, Inc. | E-mail: [EMAIL PROTECTED]
 860 Canal Street, Stamford, CT 06902 | Phone: (203) 708-5192 | Fax: (203) 316-5150
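David's rule of thumb can be written down as a small sketch. This is an illustration of his heuristic, not rsync's actual default block-size computation, and target_blocks is an invented tuning knob for the "scale up a bit based on size" step:

```python
def pick_block_size(file_size, page_size=1024, target_blocks=2000):
    # Grow the block with file size so per-block checksum overhead
    # stays bounded (target_blocks is an assumed knob, not from rsync)...
    ideal = max(page_size, file_size // target_blocks)
    # ...then round up to a multiple of the page size the application
    # "dirties", so block boundaries line up with changed regions.
    return ((ideal + page_size - 1) // page_size) * page_size

# For the 54 MB database file from the table above:
print(pick_block_size(54206464))  # -> 27648 (27 * 1024)
```

The rounding step is the important part of the heuristic: if the application rewrites whole 1K pages, a block size that is an exact multiple of 1K keeps one dirty page from invalidating two blocks.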
Re: problems encountered in 2.4.6
David Bolen wrote:
> The discovery phase will by default just check timestamps and sizes.
> You can adjust that with command line options, including the use of -c
> to include a full file checksum as part of the comparison, if for
> example, files might change without affecting timestamp or size.
>
> Once rsync knows what it needs to transfer, then it works its way
> through the file list, and for each file it performs a transfer. By
> default, that transfer is the rsync protocol - which involves the full
> process of dividing the file into chunks with both a strong and
> rolling checksum, and doing the computations to figure out what parts
> to send and so on.

This is where the docs were a bit confusing. There was no clear distinction of checksum types related to the -c option. This implied to me that w/o -c there would be no checksum at all, and what I thought the behaviour would be was what I now understand it to be with -W.

> That's why the -W option is really the only logical thing to use with
> a single rsync and "local" (on-system or network share/mount) copies.
> Under such circumstances, the rsync protocol isn't going to help at
> all, and will probably slow things down and take more memory instead.
> With -W rsync becomes an intelligent copier (in terms of figuring out
> what changed), but that's about it.

Actually, the lack of -W isn't helping me at all. The reason is that even for the stuff I do over the network, 99% of it is compressed with gzip or bzip2. If the files change, the originals were changed and a new compression is made, and usually most of the file is different.

It definitely helped for transferring ISO images where the whole image would be changed if some files changed. I set the chunk size to 2048 for that. Why it defaults to 700 seems odd to me.

There is a feature I would like, and I notice that even with -c this does not happen, but I think it could based on the way rsync works. What I'd like to have is when a whole file is moved from one directory to another, rsync would detect a new file with the same checksum as an existing (potentially to be deleted) file, and copy, move, or link, as appropriate. In theory this should apply to anything anywhere in the whole file tree being processed.

--
| Phil Howard - KA9WGN | Dallas     | http://linuxhomepage.com/ |
| [EMAIL PROTECTED]    | Texas, USA | http://phil.ipal.org/     |
RE: problems encountered in 2.4.6
[EMAIL PROTECTED] [[EMAIL PROTECTED]] writes:

> Dave Dykstra wrote:
>
> > That's two different kinds of checksums. The -c option runs a whole-file
> > checksum on both sides, but if you don't use -W the rsync rolling checksum
> > will be applied.
>
> So the chunk-by-chunk checksum always is used w/o -W? I guess the docs are
> more confusing than I originally thought.

It might help if you think of it as two phases - discovery of what files need to be transferred, and then the transfer itself.

The discovery phase will by default just check timestamps and sizes. You can adjust that with command line options, including the use of -c to include a full file checksum as part of the comparison, if for example, files might change without affecting timestamp or size.

Once rsync knows what it needs to transfer, then it works its way through the file list, and for each file it performs a transfer. By default, that transfer is the rsync protocol - which involves the full process of dividing the file into chunks with both a strong and rolling checksum, and doing the computations to figure out what parts to send and so on.

Now, normally this process is divided so that the copy of rsync that does the I/O is local to the file - e.g., for discovery both client and server rsync identify file timestamp/sizes independently (and optionally compute the checksums locally) and then exchange that information. For transfer both rsyncs build up the rolling and chunk checksums and exchange them and then decide what file data to send.

But when you are copying with a single rsync (and in particular when one of the files is on the network), then that rsync has to do all the work. That means that during discovery it either 'stat's all files or optionally computes checksums. To do the checksum it has to read the file, so both source and destination get read fully - if either are on the network you will have already spent the network traffic to pull the complete files back to the local machine.
Likewise for the transfer - under the rsync protocol, rsync has to compute the checksums for both source and destination files. Now, it'll only do this for those that it wants to transfer, but in those cases it effectively pulls back complete files from the network just to compute the checksums, only to then start transferring them. Even if the rsync protocol yields a very small amount of difference, anything beyond that point is already more than the full file with respect to the network activity that takes place.

That's why the -W option is really the only logical thing to use with a single rsync and "local" (on-system or network share/mount) copies. Under such circumstances, the rsync protocol isn't going to help at all, and will probably slow things down and take more memory instead. With -W rsync becomes an intelligent copier (in terms of figuring out what changed), but that's about it.

-- David

 David Bolen | FitLinxx, Inc. | E-mail: [EMAIL PROTECTED]
 860 Canal Street, Stamford, CT 06902 | Phone: (203) 708-5192 | Fax: (203) 316-5150
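The "rolling" half of the strong/rolling checksum pair David mentions is what makes the transfer phase cheap: when the comparison window slides one byte, the weak sum is updated in O(1) instead of recomputed over the whole block. A sketch in the style of rsync's weak checksum (the real implementation differs in details such as a character offset):

```python
def weak_sum(block):
    # Adler/Fletcher-style sum: s1 = sum of bytes,
    # s2 = sum of the running sums (weights n, n-1, ..., 1).
    s1 = sum(block) & 0xFFFF
    s2 = sum((len(block) - i) * b for i, b in enumerate(block)) & 0xFFFF
    return (s2 << 16) | s1

def roll(old, out_byte, in_byte, blocksize):
    """Slide the window one byte: drop out_byte, add in_byte, O(1)."""
    s1 = old & 0xFFFF
    s2 = (old >> 16) & 0xFFFF
    s1 = (s1 - out_byte + in_byte) & 0xFFFF
    s2 = (s2 - blocksize * out_byte + s1) & 0xFFFF
    return (s2 << 16) | s1
```

In the protocol, one side sends weak + strong sums per block; the other rolls this cheap sum across its whole file and computes the expensive strong checksum only on weak-sum hits, which is how matches are found at arbitrary byte offsets.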
Re: problems encountered in 2.4.6
On Fri, May 25, 2001 at 04:33:28PM -0500, Phil Howard wrote:
> Dave Dykstra wrote:
>
> > > One possibility here is that I do have /var/run symlinked to /ram/run
> > > which is on a ramdisk. So the lock file is there. The file is there
> > > but it is empty. Should it have data in it? BTW, it was in ramdisk
> > > in 2.4.4 and this max connections problem did not exist, so if there
> > > is a ramdisk sensitivity, it's new since 2.4.4.
> >
> > I don't know if it will show up with data in it or not, I've never tried it.
> > You'll probably need to do some straces.
>
> Where is the count of number of current connections supposed to be kept?
> It's obviously not actually being kept in this file, at least not when on
> a ramdisk. But if it's supposed to be, that's the problem. OTOH, it is
> easy to get the count out of sync this way, too. If a process is killed
> or otherwise just dies, the count is higher than real. When I do
> multi-process servers with controlled process counts, I like to have the
> parent track the number of children running. Of course that precludes
> using inetd.

It locks different ranges of bytes of the file rather than keeping a count in it. I guess the idea with that is if a process dies the operating system will automatically remove the lock.

- Dave Dykstra
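That byte-range scheme can be sketched with POSIX record locks. This is an illustration of the technique, not rsync's actual code, and the path is made up: each connection tries to take an exclusive lock on one distinct byte of a shared lock file, so a crashed process's slot is released by the kernel, and the file itself can legitimately stay empty (which matches the empty file Phil observed).

```python
import fcntl, os

LOCK_FILE = "/tmp/rsyncd.lock.demo"   # hypothetical path, not rsync's

def claim_slot(fd, max_connections):
    """Try to lock one byte per allowed connection; the byte we win
    is our slot. Kernel-held locks vanish if the process dies."""
    for slot in range(max_connections):
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, slot)
            return slot
        except OSError:
            continue                  # that slot is held; try the next
    return None                       # -> "@ERROR: max connections reached"

fd = os.open(LOCK_FILE, os.O_RDWR | os.O_CREAT, 0o600)
print("slot:", claim_slot(fd, 16))    # first free slot, or None
```

(Unix-only; POSIX locks are per-process, so a real daemon would run this once per connection-handling child.)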
Re: problems encountered in 2.4.6
On Fri, May 25, 2001 at 03:39:31PM -0500, Phil Howard wrote:
> Dave Dykstra wrote:
>
> > > 2 =
> > > When synchronizing a very large number of files, all files in a large
> > > partition, rsync frequently hangs. It's about 50% of the time, but
> > > seems to be a function of how much work there was to be done. That
> > > is, if I run it soon after it just ran, it tends to not hang, but if
> > > I run it after quite some time (and lots of stuff to synchronize) it
> > > tends to hang. It appears to have completed all the files, but I
> > > don't get any stats. There are 3 rsync processes sitting idle with
> > > no files open in the source or target trees.
> > >
> > > At last count there were 368827 files and 8083 symlinks in 21749
> > > directories.
> > >
> > > df shows:
> > > /dev/hda4  42188460  38303916  3884544  91%  /home
> > > /dev/hdb4  42188460  38301972  3886488  91%  /mnt/hdb/home
> > >
> > > df -i shows:
> > > /dev/hda4  2662400  398419  2263981  15%  /home
> > > /dev/hdb4  2662400  398462  2263938  15%  /mnt/hdb/home
> > >
> > > The df numbers are not exact because change is constantly happening
> > > on this active server. Drives hda and hdb are identical and are
> > > partitioned alike.
> > >
> > > The command line is echoed from the script that runs it:
> > >
> > > rsync -axv --stats --delete /home/. /mnt/hdb/home/. 1>'/home/root/backup-hda-to-hdb/home.log' 2>&1
> >
> > Use the -W option to disable the rsync algorithm. We really ought to make
> > that the default when both the source and destination are local.
>
> I don't want to copy everything every time. That's why I am using
> rsync to do this in the first place. I don't understand why this
> would be what's hanging.

I'm talking about the per-file changes. Even with -W it will only copy the whole files that changed. However, it will copy whole files rather than pieces of files.
This turns out to be much faster when you're mounting a remote filesystem than trying to go through the per-file rsync algorithm, because that trades off extra "local" disk access to save bandwidth between the two machine endpoints. If you were to rsync to hdb:/home rather than /mnt/hdb/home, that would make a big difference.

> > > A deadly embrace? It seems possible.
> >
> > No, the receiving side of an rsync transaction splits itself into two
> > processes for the sake of pipelining: one to generate checksums and one to
> > accept updates. When you're sending and receiving to the same machine then
> > you've got one sender and 2 receivers.
>
> Right. But what I was suggesting was a deadly embrace in that the
> process killed was waiting for something, and the parent was waiting
> for something.
>
> I'm not using the "c" option, so why would checksums be generated?

That's two different kinds of checksums. The -c option runs a whole-file checksum on both sides, but if you don't use -W the rsync rolling checksum will be applied.

> > > I'm also curious why 26704 has no fd 1.
> >
> > I don't know. When I tried it all 3 processes had an fd 1.
>
> Were you looking at it after it hung? Or is it not hanging for you?

It didn't hang for me; I didn't try it over a remote filesystem mount.

> I am curious if the lack of fd 1 is related to the hang. It is being
> started with 1> and 2> redirected to a log file _and_ the whole thing
> is being run via the "script" command for a "big picture" logfile.
> It was set up this way with the intent to run it from cron, although
> I haven't actually added it to crontab, yet, due to the problems.

I doubt it.

> > > 3 =
> > > @ERROR: max connections (16) reached - try again later
> > >
> > > This occurs after just one connection is active. It behaves as if
> > > I had specified "max connections = 1".
> > > On another server I set it
> > > to 40, and it showed:
> > >
> > > @ERROR: max connections (40) reached - try again later
> > >
> > > so it obviously is parsing and keeping the value I configure, but it
> > > isn't using it correctly.
> > >
> > > Also, if I ^C the client, then I get this error every time until I
> > > restart the daemon (running in standalone daemon mode, not inetd).
> > > So it seems like it counts clients wrong. But I can't get more
> > > than 1 right after restarting the server, so it's a little more
> > > than that somewhere.
> >
> > I don't know, I never used max connections. Could indeed be a bug.
> > The code looks pretty tricky. It's trying to lock pieces of the file
> > /var/run/rsyncd.lock in order for independent processes to coordinate.
> > Are you running as root (the lsof above suggests you are)? If not, you
> > probably need to specify another file that your daemon has access to in the
> > "lock file" option. Otherwise it would probably help for you to run some
Re: problems encountered in 2.4.6
Dave Dykstra wrote:
> > 2 =
> > When synchronizing a very large number of files, all files in a large
> > partition, rsync frequently hangs. It's about 50% of the time, but
> > seems to be a function of how much work there was to be done. That
> > is, if I run it soon after it just ran, it tends to not hang, but if
> > I run it after quite some time (and lots of stuff to synchronize) it
> > tends to hang. It appears to have completed all the files, but I
> > don't get any stats. There are 3 rsync processes sitting idle with
> > no files open in the source or target trees.
> >
> > At last count there were 368827 files and 8083 symlinks in 21749
> > directories.
> >
> > df shows:
> > /dev/hda4  42188460  38303916  3884544  91%  /home
> > /dev/hdb4  42188460  38301972  3886488  91%  /mnt/hdb/home
> >
> > df -i shows:
> > /dev/hda4  2662400  398419  2263981  15%  /home
> > /dev/hdb4  2662400  398462  2263938  15%  /mnt/hdb/home
> >
> > The df numbers are not exact because change is constantly happening
> > on this active server. Drives hda and hdb are identical and are
> > partitioned alike.
> >
> > The command line is echoed from the script that runs it:
> >
> > rsync -axv --stats --delete /home/. /mnt/hdb/home/. 1>'/home/root/backup-hda-to-hdb/home.log' 2>&1
>
> Use the -W option to disable the rsync algorithm. We really ought to make
> that the default when both the source and destination are local.

I don't want to copy everything every time. That's why I am using rsync to do this in the first place. I don't understand why this would be what's hanging.

> > A deadly embrace? It seems possible.
>
> No, the receiving side of an rsync transaction splits itself into two
> processes for the sake of pipelining: one to generate checksums and one to
> accept updates. When you're sending and receiving to the same machine then
> you've got one sender and 2 receivers.

Right.
But what I was suggesting was a deadly embrace in that the process killed was waiting for something, and the parent was waiting for something.

I'm not using the "c" option, so why would checksums be generated?

> > I'm also curious why 26704 has no fd 1.
>
> I don't know. When I tried it all 3 processes had an fd 1.

Were you looking at it after it hung? Or is it not hanging for you?

I am curious if the lack of fd 1 is related to the hang. It is being started with 1> and 2> redirected to a log file _and_ the whole thing is being run via the "script" command for a "big picture" logfile. It was set up this way with the intent to run it from cron, although I haven't actually added it to crontab, yet, due to the problems.

> > 3 =
> > @ERROR: max connections (16) reached - try again later
> >
> > This occurs after just one connection is active. It behaves as if
> > I had specified "max connections = 1". On another server I set it
> > to 40, and it showed:
> >
> > @ERROR: max connections (40) reached - try again later
> >
> > so it obviously is parsing and keeping the value I configure, but it
> > isn't using it correctly.
> >
> > Also, if I ^C the client, then I get this error every time until I
> > restart the daemon (running in standalone daemon mode, not inetd).
> > So it seems like it counts clients wrong. But I can't get more
> > than 1 right after restarting the server, so it's a little more
> > than that somewhere.
>
> I don't know, I never used max connections. Could indeed be a bug.
> The code looks pretty tricky. It's trying to lock pieces of the file
> /var/run/rsyncd.lock in order for independent processes to coordinate.
> Are you running as root (the lsof above suggests you are)? If not, you
> probably need to specify another file that your daemon has access to in the
> "lock file" option. Otherwise it would probably help for you to run some
> straces.
I would have presumed since there was a daemon process running (as opposed to running from inetd) that the daemon itself could simply track the connection count.

One possibility here is that I do have /var/run symlinked to /ram/run which is on a ramdisk. So the lock file is there. The file is there but it is empty. Should it have data in it? BTW, it was in ramdisk in 2.4.4 and this max connections problem did not exist, so if there is a ramdisk sensitivity, it's new since 2.4.4.

--
| Phil Howard - KA9WGN | Dallas     | http://linuxhomepage.com/ |
| [EMAIL PROTECTED]    | Texas, USA | http://phil.ipal.org/     |
Re: problems encountered in 2.4.6
On Fri, May 25, 2001 at 12:14:17PM -0500, Phil Howard wrote:
> I switched to 2.4.6 a while back, but have only been making heavy
> use of rsync the past couple of months, and have been running into
> a few problems that may be bugs. I looked at the bug tracker, but
> it was too cumbersome to use effectively. I don't know if these
> are real bugs or just configuration mistakes. Maybe you can tell
> me.
>
> The host OS is Linux 2.4.X (X is 0 on some and 2 on others) and
> Slackware 7.1 in all cases.
>
> Here are the things I'm running into that I did not have in 2.4.4:
>
> 1 =
> Write failed: Cannot allocate memory
> unexpected EOF in read_timeout
> unexpected EOF in read_timeout
>
> I've seen this happen when there was over 256 meg available space
> between ram and swap, so it should not have failed as a result of
> not being able to get a reasonable amount from the system. This
> also occurs randomly; if I wipe the files back out and run it all
> again, it often does not happen the next time, or if it does, it
> is not at the same point. Also, the number of files being copied
> is smallish (not more than 100), and it happens even when no file
> is larger than about 4 meg and the total transfer is no larger than
> 20 meg (so even if it pre-loaded every file into ram, there should
> be enough space). This happens even if the target directory starts
> empty.
>
> This is done through ssh and I do not recall it happening when using
> an rsync daemon.

This must be coming from ssh, not rsync. There is no string "Write failed" in the rsync source code, but there is in both ssh 1.2.27 and openssh 2.9p1.

> 2 =
> When synchronizing a very large number of files, all files in a large
> partition, rsync frequently hangs. It's about 50% of the time, but
> seems to be a function of how much work there was to be done. That
> is, if I run it soon after it just ran, it tends to not hang, but if
> I run it after quite some time (and lots of stuff to synchronize) it
> tends to hang.
It appears to have completed all the files, but I > don't get any stats. There are 3 rsync processes sitting idle with > no files open in the source or target trees. > > At last count there were 368827 files and 8083 symlinks in 21749 > directories. > > df shows: > /dev/hda4 42188460 38303916 3884544 91% /home > /dev/hdb4 42188460 38301972 3886488 91% /mnt/hdb/home > > df -i shows: > /dev/hda42662400 398419 2263981 15% /home > /dev/hdb42662400 398462 2263938 15% /mnt/hdb/home > > The df numbers are not exact because change is constantly happening > on this active server. Drives hda and hdb are identical and are > partitioned alike. > > The command line is echoed from the script that runs it: > > rsync -axv --stats --delete /home/. /mnt/hdb/home/. >1>'/home/root/backup-hda-to-hdb/home.log' 2>&1 Use the -W option to disable the rsync algorithm. We really ought to make that the default when both the source and destination are local. > The log file shows a file list gone all the way to the last file > and lsof done after the hang shows: > > rsync 26651root cwdDIR3,2 4096 24 /root > rsync 26651root rtdDIR3,2 4096 2 / > rsync 26651root txtREG 3,10 187443 8758 >/usr/local/bin/rsync > rsync 26651root memREG3,279276 4239 /lib/ld-2.1.3.so > rsync 26651root memREG3,2 1013224 4249 /lib/libc-2.1.3.so > rsync 26651root memREG3,240360 4274 >/lib/libnss_compat-2.1.3.so > rsync 26651root memREG3,275500 4272 >/lib/libnsl-2.1.3.so > rsync 26651root0u CHR 136,14 16 /dev/pts/14 > rsync 26651root1w REG3,4568981778435 >/home/root/backup-hda-to-hdb/home.log > rsync 26651root2w REG3,4568981778435 >/home/root/backup-hda-to-hdb/home.log > rsync 26651root4u unix 0xcb0f4040 135813770 socket > rsync 26651root5u unix 0xcb0f4cc0 135813771 socket > rsync 26652root cwdDIR 3,68 4096 2 /mnt/hdb/home > rsync 26652root rtdDIR3,2 4096 2 / > rsync 26652root txtREG 3,10 187443 8758 >/usr/local/bin/rsync > rsync 26652root memREG3,279276 4239 /lib/ld-2.1.3.so > rsync 26652root memREG3,2 1013224 4249 
/lib/libc-2.1.3.so > rsync 26652root memREG3,240360 4274 >/lib/libnss_compat-2.1.3.so > rsync 26652root memREG3,275500 4272 >/lib/libnsl-2.1.3.so > rsync 26652root1u unix
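Applied to the poster's own command line, the -W (--whole-file) suggestion above would look something like the sketch below. This is only an illustration under the assumption that both trees really are local mounts; the paths are the poster's, and -W simply copies whole files rather than computing block checksums, which saves CPU when there is no network bandwidth to economize on.

```shell
# Local disk-to-disk backup: -W disables the delta-transfer algorithm,
# which only pays off over a slow link. Paths are the original poster's.
rsync -axvW --stats --delete /home/. /mnt/hdb/home/. \
    1>'/home/root/backup-hda-to-hdb/home.log' 2>&1
```

Whether this also avoids the hang is a separate question, but it removes one of the three cooperating rsync processes' heaviest workloads.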
problems encountered in 2.4.6
I switched to 2.4.6 a while back, but have only been making heavy use of rsync the past couple of months, and have been running into a few problems that may be bugs. I looked at the bug tracker, but it was too cumbersome to use effectively. I don't know if these are real bugs or just configuration mistakes. Maybe you can tell me.

The host OS is Linux 2.4.X (X is 0 on some and 2 on others) and Slackware 7.1 in all cases.

Here are the things I'm running into that I did not have in 2.4.4:

1 =

Write failed: Cannot allocate memory
unexpected EOF in read_timeout
unexpected EOF in read_timeout

I've seen this happen when there was over 256 meg of space available between RAM and swap, so it should not have failed as a result of not being able to get a reasonable amount from the system. It also occurs randomly; if I wipe the files back out and run it all again, it often does not happen the next time, or if it does, it is not at the same point. Also, the number of files being copied is smallish (not more than 100), and it happens even when no file is larger than about 4 meg and the total transfer is no larger than 20 meg (so even if it pre-loaded every file into RAM, there should be enough space). This happens even if the target directory starts empty.

This is done through ssh and I do not recall it happening when using an rsync daemon.

2 =

When synchronizing a very large number of files, all files in a large partition, rsync frequently hangs. It happens about 50% of the time, but seems to be a function of how much work there was to be done. That is, if I run it soon after it just ran, it tends not to hang, but if I run it after quite some time (with lots of stuff to synchronize) it tends to hang. It appears to have completed all the files, but I don't get any stats. There are 3 rsync processes sitting idle with no files open in the source or target trees.

At last count there were 368827 files and 8083 symlinks in 21749 directories.
df shows:

/dev/hda4  42188460  38303916  3884544  91%  /home
/dev/hdb4  42188460  38301972  3886488  91%  /mnt/hdb/home

df -i shows:

/dev/hda4  2662400  398419  2263981  15%  /home
/dev/hdb4  2662400  398462  2263938  15%  /mnt/hdb/home

The df numbers are not exact because change is constantly happening on this active server. Drives hda and hdb are identical and are partitioned alike.

The command line is echoed from the script that runs it:

rsync -axv --stats --delete /home/. /mnt/hdb/home/. 1>'/home/root/backup-hda-to-hdb/home.log' 2>&1

The log file shows a file list gone all the way to the last file and lsof done after the hang shows:

rsync  26651  root  cwd  DIR  3,2    4096     24    /root
rsync  26651  root  rtd  DIR  3,2    4096     2     /
rsync  26651  root  txt  REG  3,10   187443   8758  /usr/local/bin/rsync
rsync  26651  root  mem  REG  3,2    79276    4239  /lib/ld-2.1.3.so
rsync  26651  root  mem  REG  3,2    1013224  4249  /lib/libc-2.1.3.so
rsync  26651  root  mem  REG  3,2    40360    4274  /lib/libnss_compat-2.1.3.so
rsync  26651  root  mem  REG  3,2    75500    4272  /lib/libnsl-2.1.3.so
rsync  26651  root  0u   CHR  136,14  16  /dev/pts/14
rsync  26651  root  1w   REG  3,4  568981778435  /home/root/backup-hda-to-hdb/home.log
rsync  26651  root  2w   REG  3,4  568981778435  /home/root/backup-hda-to-hdb/home.log
rsync  26651  root  4u   unix  0xcb0f4040  135813770  socket
rsync  26651  root  5u   unix  0xcb0f4cc0  135813771  socket
rsync  26652  root  cwd  DIR  3,68   4096     2     /mnt/hdb/home
rsync  26652  root  rtd  DIR  3,2    4096     2     /
rsync  26652  root  txt  REG  3,10   187443   8758  /usr/local/bin/rsync
rsync  26652  root  mem  REG  3,2    79276    4239  /lib/ld-2.1.3.so
rsync  26652  root  mem  REG  3,2    1013224  4249  /lib/libc-2.1.3.so
rsync  26652  root  mem  REG  3,2    40360    4274  /lib/libnss_compat-2.1.3.so
rsync  26652  root  mem  REG  3,2    75500    4272  /lib/libnsl-2.1.3.so
rsync  26652  root  1u   unix  0xcb93b9a0  135813772  socket
rsync  26652  root  2w   REG  3,4  568981778435  /home/root/backup-hda-to-hdb/home.log
rsync  26652  root  3u   unix  0xca8edcc0  135814161  socket
rsync  26652  root  5u   unix  0xcc9969a0  135814163  socket
rsync  26704  root  cwd  DIR  3,68   4096     2     /mnt/hdb/home
rsync  26704  root  rtd  DIR  3,2    4096     2     /
rsync  26704  root  txt  REG  3,10   187443   8758  /usr/local/bin/rsync
rsync
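The reply above attributes the "Write failed" message to ssh rather than rsync on the grounds that the string does not appear in the rsync sources. That kind of attribution is easy to re-check by grepping the unpacked source trees. The sketch below uses stand-in directories and file contents purely to demonstrate the technique; a real check would grep the actual rsync-2.4.6 and openssh-2.9p1 source directories.

```shell
# Build two stand-in "source trees" (hypothetical contents, for
# demonstration only -- not the real rsync or openssh sources).
demo=$(mktemp -d)
mkdir -p "$demo/rsync-2.4.6" "$demo/openssh-2.9p1"
printf 'error("Write failed: %%.100s", strerror(errno));\n' \
    > "$demo/openssh-2.9p1/clientloop.c"
printf '/* no such message here */\n' > "$demo/rsync-2.4.6/main.c"

# -r: recurse into directories, -l: print only the names of matching
# files. Only the openssh tree should match.
grep -rl 'Write failed' "$demo"
```

Whichever tree the file names come back from is the program printing the error; here only the openssh stand-in matches.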