On 17/7/19 4:22 am, David Koski wrote:
Regards,
David Koski
dko...@sutinen.com
On 7/8/19 6:16 PM, Adam Goryachev wrote:
On 9/7/19 10:23 am, David Koski wrote:
I am trying to back up about 24TB of data that has millions of
files. It takes a day or two before it starts backing up and then
stops with an error. I did a CLI dump and trapped the output and
can see the error message:
Can't write 32780 bytes to socket
Read EOF: Connection reset by peer
Tried again: got 0 bytes
finish: removing in-process file
Shares/Archives/<path-removed>/COR_2630.png
Child is aborting
Done: 589666 files, 1667429241846 bytes
Got fatal error during xfer (aborted by signal=PIPE)
Backup aborted by user signal
Not saving this as a partial backup since it has fewer files than
the prior one (got 589666 and 589666 files versus 4225016)
dump failed: aborted by signal=PIPE
This backup is doing rsync over ssh. I enabled SSH keepalive but it
does not appear to be due to an idle network. It does not appear to
be a random network interruption because the time it takes to fail
is pretty consistent, about three days. I'm stumped.
Did you check:
$Conf{ClientTimeout} = 72000;
Also, what version of rsync on the client, what version of BackupPC
on the server, etc?
I think BPC v4 handles this scenario significantly better. In fact, a
server I used to have trouble with on BPC 3.x all the time has since
been combined with 4 other servers (so 4 x the number of files and
total size of data), and BPC4 handles it easily.
Thank you all for your input. More information:
rsync version on client: 3.0.8 (Windows)
rsync version on server: 3.1.2 (Debian)
BackupPC version: 3.3.1
$Conf{ClientTimeout} = 604800;
I just compared the output of two verbose BackupPC_dump runs and it
looks like the files are reported to be backed up even though they are
not. For example, this appears in logs of both backup runs:
create 644 4616/545 1085243184 <path-removed>/<name-removed>3412.zip
I checked, and the file's timestamp is from 2018. The log files are
full of these. I checked the real time clock on both systems and they
are correct. There are also files that have been backed up that are
not in the logs.
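
One way to make the comparison of two runs less manual is sketched below. It assumes each relevant log line has the five-field shape of the sample above (action, mode, uid/gid, size, path); the function names `parse_log_paths` and `compare_runs` are made up for this illustration and are not part of BackupPC.

```python
def parse_log_paths(lines, action="create"):
    """Collect paths from lines shaped like:
    create 644 4616/545 1085243184 some/path
    (field layout assumed from the sample line in this thread)."""
    paths = set()
    for line in lines:
        # Split into at most 5 fields so paths containing spaces stay intact.
        fields = line.strip().split(None, 4)
        if len(fields) == 5 and fields[0] == action:
            paths.add(fields[4])
    return paths


def compare_runs(lines_a, lines_b, action="create"):
    """Return the paths reported by both runs and by only one of them."""
    a = parse_log_paths(lines_a, action)
    b = parse_log_paths(lines_b, action)
    return {"both": a & b, "only_first": a - b, "only_second": b - a}
```

With the output of each BackupPC_dump run captured to a file, passing the two open file objects to `compare_runs` would show which paths are reported as created in both runs.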
I suspect there are over ten million files but I don't have a good way
of telling now. Oddly, there are about 500,000 files backed up according
to the log captured from BackupPC_dump and almost the same number
actually backed up and found in pc/<host>/0, but they are different
subsets of files. I have been tracking memory and swap usage on the
server and see no issues.
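
For counting what actually landed under pc/<host>/0, a minimal sketch follows. It walks the tree iteratively so that millions of entries are never held in memory at once. The name `count_files` and the `skip_names` parameter are hypothetical; `skip_names` is there on the assumption that per-directory metadata files (such as "attrib") should be left out of the count.

```python
import os


def count_files(root, skip_names=()):
    """Count non-directory entries under root, without following symlinks."""
    total = 0
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Descend later; only the directory path is kept.
                    stack.append(entry.path)
                elif entry.name not in skip_names:
                    total += 1
    return total
```

Running it once on the backup tree and comparing the result with the "Done: ... files" figure from the dump log would show whether the two counts really are close.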
Is this a possible bug in BackupPC 3.3.1?
Please don't top-post if you can avoid it, at least not on mailing lists.
I just realised:
Read EOF: Connection reset by peer
This is a networking issue, not a BackupPC one. In other words, something
has broken the network connection (in the middle of transferring a file, so
I would presume it isn't due to an idle timeout, a dropped NAT entry,
etc.). BackupPC has been told by the operating system that the connection
is no longer valid, and so it has "cleaned up" by removing the
partial, in-progress file.
It takes a day to start (presumably reading ALL the files on the client
takes this long; you could improve disk performance or increase RAM on
the client to speed this up).
"and then stops with an error" - is that on the first file, or are some
files successfully transferred? Is that the first large file? Does it
always fail on the same file? (It seems not, since it previously got many
more.)
I'm thinking you need to check and/or improve network reliability, and
make sure neither client nor server is running out of RAM, etc. (mainly
the BackupPC client; the OOM killer might kill the rsync process). Check
your system logs on both client and server, and/or watch top output on
both systems during the backup.
Try backing up other systems, try backing up a smaller subset (exclude
some large directories, and then add them back in if you complete a
backup successfully).
Overall, I would advise upgrading to BPC v4.x; it handles backups of
systems with a huge number of files much better.
This doesn't look like a BPC bug, maybe a network driver, kernel, or
something else, but not BPC (IMHO).
Regards,
Adam
--
Adam Goryachev Website Managers www.websitemanagers.com.au
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/