I am having a problem with rsync on one box and I can't figure out how to fix it. The box is subscribed to the Red Hat AS channel (using RPMs I compiled from the SRPMs), and the system is otherwise "virgin" Red Hat RPMs. It is completely up2date, except that it is running an older kernel (it has been up for about 200 days). I have many other boxes that are identical to this problem system, and none of them are having any problems.

When logged into BADBOX I type
"rsync -e ssh -r SMALL-DIR [EMAIL PROTECTED]:/tmp"
and it works fine as long as the SMALL-DIR directory has fewer than two dozen files and each file is smaller than 32KB.


As soon as I do "rsync -e ssh -r BIG-DIR [EMAIL PROTECTED]:/tmp"
I get the following output:
erroring writing 32768 bytes - exiting
On "otherbox", /tmp then contains a BIG-DIR with almost all of the files in it; only the big files are missing.


I ran "strace rsync ..." and everything goes fine for all the small files. Then it gets to the big file "TEST/linux.tar.gz" and the first 32768-byte write fails with ENOBUFS, as shown here:


...

write(1, "sender finished TEST/licence.o\n", 31sender finished TEST/licence.o
) = 31
select(6, [5], NULL, NULL, {60, 0}) = 1 (in [5], left {60, 0})
read(5, "%\0\0\t", 4) = 4
select(6, [5], NULL, NULL, {60, 0}) = 1 (in [5], left {60, 0})
read(5, "recv_generator(TEST/linux.tar.gz"..., 37) = 37
write(1, "recv_generator(TEST/linux.tar.gz"..., 37recv_generator(TEST/linux.tar.gz,19)
) = 37
select(6, [5], NULL, NULL, {60, 0}) = 1 (in [5], left {60, 0})
read(5, "\20\0\0\7", 4) = 4
select(6, [5], NULL, NULL, {60, 0}) = 1 (in [5], left {60, 0})
read(5, "\23\0\0\0", 4) = 4
write(1, "send_files(19,TEST/linux.tar.gz)"..., 33send_files(19,TEST/linux.tar.gz)
) = 33
select(6, [5], NULL, NULL, {60, 0}) = 1 (in [5], left {60, 0})
read(5, "\0\0\0\0", 4) = 4
select(6, [5], NULL, NULL, {60, 0}) = 1 (in [5], left {60, 0})
read(5, "\274\2\0\0", 4) = 4
select(6, [5], NULL, NULL, {60, 0}) = 1 (in [5], left {60, 0})
read(5, "\0\0\0\0", 4) = 4
open("TEST/linux.tar.gz", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0640, st_size=1602359, ...}) = 0
write(1, "send_files mapped TEST/linux.tar"..., 52send_files mapped TEST/linux.tar.gz of size 1602359
) = 52
select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
write(4, "\23\0\0\0", 4) = 4
select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
write(4, "\0\0\0\0", 4) = 4
select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
write(4, "\274\2\0\0", 4) = 4
select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
write(4, "\0\0\0\0", 4) = 4
write(1, "calling match_sums TEST/linux.ta"..., 37calling match_sums TEST/linux.tar.gz
) = 37
write(1, "TEST/linux.tar.gz\n", 18TEST/linux.tar.gz
) = 18
select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
write(4, "\0\200\0\0", 4) = 4
mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x401a0000
read(3, "\37\213\10\10\317`\177=\0\3bp_inst_linux.tar\0\354=Yp"..., 262144) = 262144
select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
write(4, "\37\213\10\10\317`\177=\0\3bp_inst_linux.tar\0\354=Yp"..., 32768) = -1 ENOBUFS (No buffer space available)
write(2, "erroring writing 32768 bytes - e"..., 39erroring writing 32768 bytes - exiting
) = 39
rt_sigaction(SIGUSR1, {SIG_IGN}, {0x8050420, [USR1], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGUSR2, {SIG_IGN}, {0x8050440, [USR2], SA_RESTART|0x4000000}, 8) = 0
getpid() = 20601
kill(20602, SIGUSR1) = 0
--- SIGCHLD (Child exited) ---
munmap(0x40018000, 4096) = 0
_exit(12) = ?
...
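
In case anybody wants to look for the same thing in a longer trace, here is a minimal sketch of how the failing call can be pulled out of a saved log. It assumes the trace was written to a file with "strace -o"; the function name failed_calls and the file name rsync.trace are just examples of mine, not anything from rsync or strace.

```shell
# List every system call in a saved strace log that returned -1,
# prefixed with its line number in the log.
# Usage (file name is only an example): failed_calls rsync.trace
failed_calls() {
    grep -nE '= -1 E[A-Z0-9]+' "$1"
}
```

On the trace above this should report only the single write() that returns ENOBUFS.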



I read about "ENOBUFS", which led me to try adjusting the TCP socket buffer sizes. I tried the following settings (this box has 2GB of RAM):


/proc/sys/fs/file-nr:  3050    1406    40960
/proc/sys/net/core/rmem_max:  128388607
/proc/sys/net/core/wmem_max:  128388607
/proc/sys/net/ipv4/tcp_adv_win_scale:  2
/proc/sys/net/ipv4/tcp_window_scaling: 1
/proc/sys/net/ipv4/tcp_rmem: 8192 128000 128388607
/proc/sys/net/ipv4/tcp_wmem: 8192 112000 127388607
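
To compare these settings with one of the working boxes, something like the following can dump them in the same "path:  value" form on each machine. This is only a sketch; show_tunables is a helper name I made up, and it silently skips any file that does not exist on a given kernel.

```shell
# Print each readable tunable as "path:  value".
# Tabs inside a value are turned into spaces; missing files are skipped.
show_tunables() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s:  %s\n' "$f" "$(tr '\t' ' ' < "$f")"
        fi
    done
}

show_tunables \
    /proc/sys/fs/file-nr \
    /proc/sys/net/core/rmem_max \
    /proc/sys/net/core/wmem_max \
    /proc/sys/net/ipv4/tcp_adv_win_scale \
    /proc/sys/net/ipv4/tcp_window_scaling \
    /proc/sys/net/ipv4/tcp_rmem \
    /proc/sys/net/ipv4/tcp_wmem
```

Running it on both boxes and feeding the two outputs to diff should show any setting that differs.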


This box does have 7250 open file descriptors reported by "lsof", but it isn't that busy; it has a load of only about 0.5.
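
As a rough cross-check on file handles, the first and third fields of file-nr are the allocated count and the system-wide maximum, so the usage can be worked out as sketched below. handle_usage is a name I made up for illustration. Note that lsof also lists entries that are not descriptors (current directories, mapped libraries), so its count is not directly comparable.

```shell
# Report allocated versus maximum file handles from a file-nr style file.
# Field 1 is the allocated count, field 3 is the system-wide maximum.
# With no argument it reads /proc/sys/fs/file-nr.
handle_usage() {
    awk '{ printf "%d of %d file handles allocated (%.1f%%)\n", $1, $3, 100 * $1 / $3 }' \
        "${1:-/proc/sys/fs/file-nr}"
}

if [ -r /proc/sys/fs/file-nr ]; then
    handle_usage
fi
```

With the values shown above (3050 allocated, 40960 maximum) that comes to about 7.4%, so the handle limit itself does not look exhausted.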


cat /proc/meminfo
        total:     used:      free:    shared:  buffers:  cached:
Mem:  2108256256 1899868160 208388096  2043904 190640128 1422811136
Swap: 6440263680  185999360 6254264320
MemTotal:      2058844 kB
MemFree:        203504 kB
MemShared:        1996 kB
Buffers:        186172 kB
Cached:        1298792 kB
SwapCached:      90672 kB
Active:        1071248 kB
Inact_dirty:    214688 kB
Inact_clean:    291696 kB
Inact_target:   514692 kB
HighTotal:     1179584 kB
HighFree:         2036 kB
LowTotal:       879260 kB
LowFree:        201468 kB
SwapTotal:     6289320 kB
SwapFree:      6107680 kB
BigPagesFree:        0 kB

vmstat 2 3
   procs                       memory    swap          io     system         cpu
 r  b  w   swpd   free   buff   cache  si  so    bi    bo   in    cs  us  sy  id
 2  0  0 181640 203504 186200 1299104   0   0     1     1    0     0   0   1   0
 0  0  0 181640 203504 186200 1299112   0   0     0     0  142   228  24   0  75
 2  0  0 181640 203504 186204 1299120   0   0     0    54  184   342   2   0  98




Does anybody have any ideas?

Thanks in advance,
-Ben.



--
redhat-list mailing list
unsubscribe mailto:[EMAIL PROTECTED]
https://www.redhat.com/mailman/listinfo/redhat-list
