** Description changed: [Impact] What the user suffering from this bug experiences is that the big amount of informative messages related to the copy process with the three spawned processes(sender, receiver and generator) exhaust the I/O buffer and the sync gets stuck, either because there are too many files to synchronise and/or because too many detail messages (levels of verbose mode) have been requested in the output. The fix, that comes from upstream and is applied there since version 3.2.0., increments the size of the receiver's I/O buffer. - [Test Plan] This test plan is for Focal, but it's the same for Bionic. 0.Preparing the test environment: #Preparing the container lxc launch images:ubuntu/focal rsync-iobuffer-focal lxc shell rsync-iobuffer-focal apt update -y apt upgrade -y #Installing necessary tools - apt install rsync + apt install rsync apt install wget - #Get test cases from comments #16 and #19 on this LP bug + #Get test cases from comments #16 and #19 on this LP bug: As test case + #16 covers both aspects (a lot of files and upper verbosity) and test + #19 uses a huge tarball (120 Mb), I'm removing from this SRU the + bad/good case for the #19 (but, please feel to check it if you consider + it necessary). cd /tmp/ - #19 - wget https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1528921/+attachment/5211950/+files/html2.tgz - tar -xvzf /tmp/html2.tgz - mkdir /tmp/rsynctest - - #16 + #16 Paste the contents of https://pastebin.com/raw/ctzJJGwt: #!/bin/bash mkdir source_dir pushd source_dir dd if=/dev/zero of=source bs=600K count=1 for i in `seq 1 11500`; do - cp -v source file_$i; + cp -v source file_$i; done rm source for i in `seq 1 10`; do - dd if=/dev/zero of=file_large_$i bs=200M count=1 + dd if=/dev/zero of=file_large_$i bs=200M count=1 done popd echo "Created 11500 files with size 600K and 10 files with size 200M, try the following command:" echo "rsync -avvvz --delete source_dir target_dir" in a new file script_comment16.sh - - chmod +x script_comment16.sh + + chmod +x script_comment16.sh ./script_comment16.sh + #19 (Skip this if you can't download the tarball - Examples of good/bad were removed). + wget https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1528921/+attachment/5211950/+files/html2.tgz + tar -xvzf /tmp/html2.tgz + mkdir /tmp/rsynctest + 1. Bad cases (without and with using strace): - - # Scenario from comment 19 - $ rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ - (Client) Protocol versions: remote=31, negotiated=31 - sending incremental file list - [sender] change_dir(/tmp) - send_files starting - server_recv(2) starting pid=49029 - get_local_name count=7070 /tmp/rsynctest/ - created directory /tmp/rsynctest - [Receiver] change_dir(/tmp/rsynctest) - generator starting pid=49029 - delta-transmission disabled for local transfer or --whole-file - recv_generator(html2,1) - recv_generator(html2,2) - set uid of html2 from 0 to 1000 - set gid of html2 from 0 to 1000 - set modtime of html2 to (1447205118) Wed Nov 11 01:25:18 2015 - [...] - sender finished /tmp/html2/annotator__raw_8h__incl.md5 - send_files(282, /tmp/html2/annotator__raw_8h__incl.png) - html2/annotator__raw_8h__incl.png - - It hangs here, where using strace we can see: - - $ strace rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ - [...] - read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\v\4\0\0\2\233\10\6\0\0\0\361\177\254"..., 262144) = 262144 - select(6, [5], [4], [5], {tv_sec=60, tv_usec=0}) = 2 (in [5], out [4], left {tv_sec=59, tv_usec=999998}) - read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 95) = 95 - write(4, "K\374\0\7\177\377\207\343\335\345+{W\335{K\371y\211w`Ysl\336B{\312\340}\320\301"..., 64591) = 64591 - select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 1 (out [4], left {tv_sec=59, tv_usec=999998}) - write(4, "\336\322\0\7\177\377\255\371\367\215v\321-\224\323+\363\261\243\7\211Do\230\256\257O\372\367:\357O"..., 53986) = 53986 - select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) - select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) - select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) - select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) - select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) - select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) - select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) - select(5, [], [4], [], {tv_sec=60, tv_usec=0} - # Scenario from comment 16 $ rsync -avvvz --delete source_dir target_dir sending incremental file list [sender] make_file(source_dir,*,0) send_file_list done [sender] pushing local filters for /root/source_dir/ [sender] make_file(source_dir/file_3048,*,2) [sender] make_file(source_dir/file_11358,*,2) [sender] make_file(source_dir/file_5914,*,2) [sender] make_file(source_dir/file_5880,*,2) [sender] make_file(source_dir/file_9318,*,2) [sender] make_file(source_dir/file_5539,*,2) [...] sending file_sum false_alarms=0 hash_hits=0 matches=0 sender finished source_dir/file_10807 send_files(903, source_dir/file_10808) send_files mapped source_dir/file_10808 of size 614400 calling match_sums source_dir/file_10808 source_dir/file_10808 It hangs here, where using strace we can see: - $ strace rsync -avvvz --delete source_dir target_dir + $ strace rsync -avvvz --delete source_dir target_dir source_dir/file_11280 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 90112) = 90112 select(6, [5], [4], [5], {tv_sec=60, tv_usec=0}) = 1 (in [5], left {tv_sec=59, tv_usec=999996}) read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 1900) = 1900 select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) select(5, [], [4], [], {tv_sec=60, tv_usec=0} - 1. Good cases: - - # Scenario from comment 19 - - $ rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ - (Client) Protocol versions: remote=31, negotiated=31 - sending incremental file list - [sender] change_dir(/tmp) - send_files starting - server_recv(2) starting pid=25249 - get_local_name count=7070 /tmp/rsynctest/ - created directory /tmp/rsynctest - [Receiver] change_dir(/tmp/rsynctest) - generator starting pid=25249 - delta-transmission disabled for local transfer or --whole-file - recv_generator(html2,1) - recv_generator(html2,2) - set uid of html2 from 0 to 1000 - set gid of html2 from 0 to 1000 - set modtime of html2 to (1447205118) Wed Nov 11 01:25:18 2015 - recv_generator(html2/ConstellationDisplayPlot_8h.html,3) - recv_generator(html2/ConstellationDisplayPlot_8h__dep__incl.map,4) - recv_generator(html2/ConstellationDisplayPlot_8h__dep__incl.md5,5) - [...] - generate_files finished - - sent 140,246,278 bytes received 4,053,586 bytes 15,189,459.37 bytes/sec - total size is 252,298,375 speedup is 1.75 - [sender] _exit_cleanup(code=0, file=main.c, line=1207): about to call exit(0) - # Scenario from comment 16 $ rsync -avvvz --delete source_dir target_dir sending incremental file list [sender] make_file(source_dir,*,0) send_file_list done [sender] pushing local filters for /tmp/source_dir/ [sender] make_file(source_dir/file_3052,*,2) [sender] make_file(source_dir/file_1766,*,2) [sender] make_file(source_dir/file_10466,*,2) [sender] make_file(source_dir/file_9375,*,2) [sender] make_file(source_dir/file_7260,*,2) [sender] make_file(source_dir/file_5554,*,2) [sender] make_file(source_dir/file_5523,*,2) [sender] make_file(source_dir/file_1685,*,2) [sender] make_file(source_dir/file_7217,*,2) [sender] make_file(source_dir/file_10411,*,2) [...] generate_files finished sent 9,555,678 bytes received 3,599,560 bytes 124,694.20 bytes/sec total size is 9,162,752,000 speedup is 696.51 - [Where problems could occur] Perhaps the buffer size may not be sufficient for an operation involving a very huge amount of data, as reported upstream here ( https://bugzilla.samba.org/show_bug.cgi?id=11166#c9 ): the reporter asked for a parameter to manually change the size of the receiver's buffer I/O for messaging for those cases, but it may lend itself to a memory overflow more easily even than this solution which also has this risk. [Other Info] - - The fix has been picked from upstream here https://github.com/WayneD/rsync/commit/01b9bbb0f94e4ddd4b1e3ad678b220ca09bc0ad0 . I've checked on bugzilla that there aren't bugs related to "out of memory" issues after 3.2.0 release. - + + The fix has been picked from upstream here + https://github.com/WayneD/rsync/commit/01b9bbb0f94e4ddd4b1e3ad678b220ca09bc0ad0 + . I've checked on bugzilla that there aren't bugs related to "out of + memory" issues after 3.2.0 release. [Original Report] ------------------------------------------------------------------- In the last few months my home directory backup stopped completing. I've been able to reproduce the problem on a single subdirectory although I had to add the --debug=all flag to reproduce it on that smaller directory. Specifically, this command never completes: rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ The html2 directory is a copy of gnuradio-3.7.8.1/build/docs/doxygen/html . When I strace the command, I see this: write(1, "sender finished /tmp/html2/atsc_"..., 58sender finished /tmp/html2/atsc__interleaver_8h__incl.md5 ) = 58 write(1, "send_files(338, /tmp/html2/atsc_"..., 59send_files(338, /tmp/html2/atsc__interleaver_8h__incl.png) ) = 59 open("html2/atsc__interleaver_8h__incl.png", O_RDONLY|O_LARGEFILE) = 3 fstat64(3, {st_mode=S_IFREG|0664, st_size=264657, ...}) = 0 write(1, "html2/atsc__interleaver_8h__incl"..., 37html2/atsc__interleaver_8h__incl.png ) = 37 read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\n\253\0\0\2\233\10\6\0\0\0h\242\""..., 262144) = 262144 select(6, [5], [4], [5], {60, 0}) = 2 (in [5], out [4], left {59, 999996}) read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 95) = 95 write(4, "r\311\0\7\177\377\232\237\264\272e\300\300\240\316\264&\314\301\252*\37\256y\225g\373^\315j\370\350"..., 51574) = 51574 select(5, [], [4], [], {60, 0}) = 1 (out [4], left {59, 999997}) write(4, "\7\320\0\7\177\377\234|\7X\223Y\273\255c\27\25f\306\212\202\214#E\272\212t\1\225A\fU"..., 53259) = 53259 select(5, [], [4], [], {60, 0} The select command times out over and over. I get the same behavior when trying to back up my entire home directory but I don't need the --debug=all flag in that case. lsb_release -rd Description: Ubuntu 14.04.3 LTS Release: 14.04 apt-cache policy rsync rsync: Installed: 3.1.0-2ubuntu0.1 Candidate: 3.1.0-2ubuntu0.1 Version table: *** 3.1.0-2ubuntu0.1 0 500 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main i386 Packages 500 http://security.ubuntu.com/ubuntu/ trusty-security/main i386 Packages 100 /var/lib/dpkg/status 3.1.0-2 0 500 http://us.archive.ubuntu.com/ubuntu/ trusty/main i386 Packages ProblemType: Bug DistroRelease: Ubuntu 14.04 Package: rsync 3.1.0-2ubuntu0.1 ProcVersionSignature: Ubuntu 3.13.0-74.118-generic 3.13.11-ckt30 Uname: Linux 3.13.0-74-generic i686 NonfreeKernelModules: nvidia ApportVersion: 2.14.1-0ubuntu3.19 Architecture: i386 CurrentDesktop: KDE Date: Wed Dec 23 09:44:17 2015 EcryptfsInUse: Yes InstallationDate: Installed on 2010-09-18 (1922 days ago) InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Beta i386 (20100901.1) SourcePackage: rsync UpgradeStatus: Upgraded to trusty on 2014-12-27 (361 days ago)
** Description changed: [Impact] What the user suffering from this bug experiences is that the big amount of informative messages related to the copy process with the three spawned processes(sender, receiver and generator) exhaust the I/O buffer and the sync gets stuck, either because there are too many files to synchronise and/or because too many detail messages (levels of verbose mode) have been requested in the output. The fix, that comes from upstream and is applied there since version 3.2.0., increments the size of the receiver's I/O buffer. [Test Plan] This test plan is for Focal, but it's the same for Bionic. 0.Preparing the test environment: #Preparing the container lxc launch images:ubuntu/focal rsync-iobuffer-focal lxc shell rsync-iobuffer-focal apt update -y apt upgrade -y #Installing necessary tools apt install rsync - apt install wget #Get test cases from comments #16 and #19 on this LP bug: As test case #16 covers both aspects (a lot of files and upper verbosity) and test - #19 uses a huge tarball (120 Mb), I'm removing from this SRU the - bad/good case for the #19 (but, please feel to check it if you consider - it necessary). + #19 uses a huge tarball (120 Mb), I'm removing from this SRU the #19 + scenario (but, please feel to reach me it if you consider it necessary + and I'll provide the steps and bad/good scenarios). cd /tmp/ #16 Paste the contents of https://pastebin.com/raw/ctzJJGwt: #!/bin/bash mkdir source_dir pushd source_dir dd if=/dev/zero of=source bs=600K count=1 for i in `seq 1 11500`; do cp -v source file_$i; done rm source for i in `seq 1 10`; do dd if=/dev/zero of=file_large_$i bs=200M count=1 done popd echo "Created 11500 files with size 600K and 10 files with size 200M, try the following command:" echo "rsync -avvvz --delete source_dir target_dir" in a new file script_comment16.sh chmod +x script_comment16.sh ./script_comment16.sh - - #19 (Skip this if you can't download the tarball - Examples of good/bad were removed). - wget https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1528921/+attachment/5211950/+files/html2.tgz - tar -xvzf /tmp/html2.tgz - mkdir /tmp/rsynctest 1. Bad cases (without and with using strace): # Scenario from comment 16 $ rsync -avvvz --delete source_dir target_dir sending incremental file list [sender] make_file(source_dir,*,0) send_file_list done [sender] pushing local filters for /root/source_dir/ [sender] make_file(source_dir/file_3048,*,2) [sender] make_file(source_dir/file_11358,*,2) [sender] make_file(source_dir/file_5914,*,2) [sender] make_file(source_dir/file_5880,*,2) [sender] make_file(source_dir/file_9318,*,2) [sender] make_file(source_dir/file_5539,*,2) [...] sending file_sum false_alarms=0 hash_hits=0 matches=0 sender finished source_dir/file_10807 send_files(903, source_dir/file_10808) send_files mapped source_dir/file_10808 of size 614400 calling match_sums source_dir/file_10808 source_dir/file_10808 It hangs here, where using strace we can see: $ strace rsync -avvvz --delete source_dir target_dir source_dir/file_11280 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 90112) = 90112 select(6, [5], [4], [5], {tv_sec=60, tv_usec=0}) = 1 (in [5], left {tv_sec=59, tv_usec=999996}) read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 1900) = 1900 select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) select(5, [], [4], [], {tv_sec=60, tv_usec=0} 1. Good cases: # Scenario from comment 16 $ rsync -avvvz --delete source_dir target_dir sending incremental file list [sender] make_file(source_dir,*,0) send_file_list done [sender] pushing local filters for /tmp/source_dir/ [sender] make_file(source_dir/file_3052,*,2) [sender] make_file(source_dir/file_1766,*,2) [sender] make_file(source_dir/file_10466,*,2) [sender] make_file(source_dir/file_9375,*,2) [sender] make_file(source_dir/file_7260,*,2) [sender] make_file(source_dir/file_5554,*,2) [sender] make_file(source_dir/file_5523,*,2) [sender] make_file(source_dir/file_1685,*,2) [sender] make_file(source_dir/file_7217,*,2) [sender] make_file(source_dir/file_10411,*,2) [...] generate_files finished sent 9,555,678 bytes received 3,599,560 bytes 124,694.20 bytes/sec total size is 9,162,752,000 speedup is 696.51 [Where problems could occur] Perhaps the buffer size may not be sufficient for an operation involving a very huge amount of data, as reported upstream here ( https://bugzilla.samba.org/show_bug.cgi?id=11166#c9 ): the reporter asked for a parameter to manually change the size of the receiver's buffer I/O for messaging for those cases, but it may lend itself to a memory overflow more easily even than this solution which also has this risk. [Other Info] The fix has been picked from upstream here https://github.com/WayneD/rsync/commit/01b9bbb0f94e4ddd4b1e3ad678b220ca09bc0ad0 . I've checked on bugzilla that there aren't bugs related to "out of memory" issues after 3.2.0 release. [Original Report] ------------------------------------------------------------------- In the last few months my home directory backup stopped completing. I've been able to reproduce the problem on a single subdirectory although I had to add the --debug=all flag to reproduce it on that smaller directory. Specifically, this command never completes: rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ The html2 directory is a copy of gnuradio-3.7.8.1/build/docs/doxygen/html . When I strace the command, I see this: write(1, "sender finished /tmp/html2/atsc_"..., 58sender finished /tmp/html2/atsc__interleaver_8h__incl.md5 ) = 58 write(1, "send_files(338, /tmp/html2/atsc_"..., 59send_files(338, /tmp/html2/atsc__interleaver_8h__incl.png) ) = 59 open("html2/atsc__interleaver_8h__incl.png", O_RDONLY|O_LARGEFILE) = 3 fstat64(3, {st_mode=S_IFREG|0664, st_size=264657, ...}) = 0 write(1, "html2/atsc__interleaver_8h__incl"..., 37html2/atsc__interleaver_8h__incl.png ) = 37 read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\n\253\0\0\2\233\10\6\0\0\0h\242\""..., 262144) = 262144 select(6, [5], [4], [5], {60, 0}) = 2 (in [5], out [4], left {59, 999996}) read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 95) = 95 write(4, "r\311\0\7\177\377\232\237\264\272e\300\300\240\316\264&\314\301\252*\37\256y\225g\373^\315j\370\350"..., 51574) = 51574 select(5, [], [4], [], {60, 0}) = 1 (out [4], left {59, 999997}) write(4, "\7\320\0\7\177\377\234|\7X\223Y\273\255c\27\25f\306\212\202\214#E\272\212t\1\225A\fU"..., 53259) = 53259 select(5, [], [4], [], {60, 0} The select command times out over and over. I get the same behavior when trying to back up my entire home directory but I don't need the --debug=all flag in that case. lsb_release -rd Description: Ubuntu 14.04.3 LTS Release: 14.04 apt-cache policy rsync rsync: Installed: 3.1.0-2ubuntu0.1 Candidate: 3.1.0-2ubuntu0.1 Version table: *** 3.1.0-2ubuntu0.1 0 500 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main i386 Packages 500 http://security.ubuntu.com/ubuntu/ trusty-security/main i386 Packages 100 /var/lib/dpkg/status 3.1.0-2 0 500 http://us.archive.ubuntu.com/ubuntu/ trusty/main i386 Packages ProblemType: Bug DistroRelease: Ubuntu 14.04 Package: rsync 3.1.0-2ubuntu0.1 ProcVersionSignature: Ubuntu 3.13.0-74.118-generic 3.13.11-ckt30 Uname: Linux 3.13.0-74-generic i686 NonfreeKernelModules: nvidia ApportVersion: 2.14.1-0ubuntu3.19 Architecture: i386 CurrentDesktop: KDE Date: Wed Dec 23 09:44:17 2015 EcryptfsInUse: Yes InstallationDate: Installed on 2010-09-18 (1922 days ago) InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Beta i386 (20100901.1) SourcePackage: rsync UpgradeStatus: Upgraded to trusty on 2014-12-27 (361 days ago) ** Description changed: [Impact] What the user suffering from this bug experiences is that the big amount of informative messages related to the copy process with the three spawned processes(sender, receiver and generator) exhaust the I/O buffer and the sync gets stuck, either because there are too many files to synchronise and/or because too many detail messages (levels of verbose mode) have been requested in the output. The fix, that comes from upstream and is applied there since version 3.2.0., increments the size of the receiver's I/O buffer. [Test Plan] This test plan is for Focal, but it's the same for Bionic. 0.Preparing the test environment: #Preparing the container lxc launch images:ubuntu/focal rsync-iobuffer-focal lxc shell rsync-iobuffer-focal apt update -y apt upgrade -y #Installing necessary tools apt install rsync #Get test cases from comments #16 and #19 on this LP bug: As test case #16 covers both aspects (a lot of files and upper verbosity) and test #19 uses a huge tarball (120 Mb), I'm removing from this SRU the #19 - scenario (but, please feel to reach me it if you consider it necessary + scenario (but, please, feel to reach me it if you consider it necessary and I'll provide the steps and bad/good scenarios). cd /tmp/ #16 Paste the contents of https://pastebin.com/raw/ctzJJGwt: #!/bin/bash mkdir source_dir pushd source_dir dd if=/dev/zero of=source bs=600K count=1 for i in `seq 1 11500`; do cp -v source file_$i; done rm source for i in `seq 1 10`; do dd if=/dev/zero of=file_large_$i bs=200M count=1 done popd echo "Created 11500 files with size 600K and 10 files with size 200M, try the following command:" echo "rsync -avvvz --delete source_dir target_dir" in a new file script_comment16.sh chmod +x script_comment16.sh ./script_comment16.sh - 1. Bad cases (without and with using strace): # Scenario from comment 16 $ rsync -avvvz --delete source_dir target_dir sending incremental file list [sender] make_file(source_dir,*,0) send_file_list done [sender] pushing local filters for /root/source_dir/ [sender] make_file(source_dir/file_3048,*,2) [sender] make_file(source_dir/file_11358,*,2) [sender] make_file(source_dir/file_5914,*,2) [sender] make_file(source_dir/file_5880,*,2) [sender] make_file(source_dir/file_9318,*,2) [sender] make_file(source_dir/file_5539,*,2) [...] sending file_sum false_alarms=0 hash_hits=0 matches=0 sender finished source_dir/file_10807 send_files(903, source_dir/file_10808) send_files mapped source_dir/file_10808 of size 614400 calling match_sums source_dir/file_10808 source_dir/file_10808 It hangs here, where using strace we can see: $ strace rsync -avvvz --delete source_dir target_dir source_dir/file_11280 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 90112) = 90112 select(6, [5], [4], [5], {tv_sec=60, tv_usec=0}) = 1 (in [5], left {tv_sec=59, tv_usec=999996}) read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 1900) = 1900 select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) select(5, [], [4], [], {tv_sec=60, tv_usec=0}) = 0 (Timeout) select(5, [], [4], [], {tv_sec=60, tv_usec=0} 1. Good cases: # Scenario from comment 16 $ rsync -avvvz --delete source_dir target_dir sending incremental file list [sender] make_file(source_dir,*,0) send_file_list done [sender] pushing local filters for /tmp/source_dir/ [sender] make_file(source_dir/file_3052,*,2) [sender] make_file(source_dir/file_1766,*,2) [sender] make_file(source_dir/file_10466,*,2) [sender] make_file(source_dir/file_9375,*,2) [sender] make_file(source_dir/file_7260,*,2) [sender] make_file(source_dir/file_5554,*,2) [sender] make_file(source_dir/file_5523,*,2) [sender] make_file(source_dir/file_1685,*,2) [sender] make_file(source_dir/file_7217,*,2) [sender] make_file(source_dir/file_10411,*,2) [...] generate_files finished sent 9,555,678 bytes received 3,599,560 bytes 124,694.20 bytes/sec total size is 9,162,752,000 speedup is 696.51 [Where problems could occur] Perhaps the buffer size may not be sufficient for an operation involving a very huge amount of data, as reported upstream here ( https://bugzilla.samba.org/show_bug.cgi?id=11166#c9 ): the reporter asked for a parameter to manually change the size of the receiver's buffer I/O for messaging for those cases, but it may lend itself to a memory overflow more easily even than this solution which also has this risk. [Other Info] The fix has been picked from upstream here https://github.com/WayneD/rsync/commit/01b9bbb0f94e4ddd4b1e3ad678b220ca09bc0ad0 . I've checked on bugzilla that there aren't bugs related to "out of memory" issues after 3.2.0 release. [Original Report] ------------------------------------------------------------------- In the last few months my home directory backup stopped completing. I've been able to reproduce the problem on a single subdirectory although I had to add the --debug=all flag to reproduce it on that smaller directory. Specifically, this command never completes: rsync --debug=all -avz /tmp/html2 /tmp/rsynctest/ The html2 directory is a copy of gnuradio-3.7.8.1/build/docs/doxygen/html . When I strace the command, I see this: write(1, "sender finished /tmp/html2/atsc_"..., 58sender finished /tmp/html2/atsc__interleaver_8h__incl.md5 ) = 58 write(1, "send_files(338, /tmp/html2/atsc_"..., 59send_files(338, /tmp/html2/atsc__interleaver_8h__incl.png) ) = 59 open("html2/atsc__interleaver_8h__incl.png", O_RDONLY|O_LARGEFILE) = 3 fstat64(3, {st_mode=S_IFREG|0664, st_size=264657, ...}) = 0 write(1, "html2/atsc__interleaver_8h__incl"..., 37html2/atsc__interleaver_8h__incl.png ) = 37 read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\n\253\0\0\2\233\10\6\0\0\0h\242\""..., 262144) = 262144 select(6, [5], [4], [5], {60, 0}) = 2 (in [5], out [4], left {59, 999996}) read(5, "\0\0\0\0\0\0\0\1\0\240\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\240\0\0\0"..., 95) = 95 write(4, "r\311\0\7\177\377\232\237\264\272e\300\300\240\316\264&\314\301\252*\37\256y\225g\373^\315j\370\350"..., 51574) = 51574 select(5, [], [4], [], {60, 0}) = 1 (out [4], left {59, 999997}) write(4, "\7\320\0\7\177\377\234|\7X\223Y\273\255c\27\25f\306\212\202\214#E\272\212t\1\225A\fU"..., 53259) = 53259 select(5, [], [4], [], {60, 0} The select command times out over and over. I get the same behavior when trying to back up my entire home directory but I don't need the --debug=all flag in that case. lsb_release -rd Description: Ubuntu 14.04.3 LTS Release: 14.04 apt-cache policy rsync rsync: Installed: 3.1.0-2ubuntu0.1 Candidate: 3.1.0-2ubuntu0.1 Version table: *** 3.1.0-2ubuntu0.1 0 500 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main i386 Packages 500 http://security.ubuntu.com/ubuntu/ trusty-security/main i386 Packages 100 /var/lib/dpkg/status 3.1.0-2 0 500 http://us.archive.ubuntu.com/ubuntu/ trusty/main i386 Packages ProblemType: Bug DistroRelease: Ubuntu 14.04 Package: rsync 3.1.0-2ubuntu0.1 ProcVersionSignature: Ubuntu 3.13.0-74.118-generic 3.13.11-ckt30 Uname: Linux 3.13.0-74-generic i686 NonfreeKernelModules: nvidia ApportVersion: 2.14.1-0ubuntu3.19 Architecture: i386 CurrentDesktop: KDE Date: Wed Dec 23 09:44:17 2015 EcryptfsInUse: Yes InstallationDate: Installed on 2010-09-18 (1922 days ago) InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Beta i386 (20100901.1) SourcePackage: rsync UpgradeStatus: Upgraded to trusty on 2014-12-27 (361 days ago) -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1528921 Title: rsync hangs on select(5, [], [4], [], {60, 0} To manage notifications about this bug go to: https://bugs.launchpad.net/rsync/+bug/1528921/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs