Re: [BackupPC-users] Issue with remote backup of server(s) over VPN after failover
Thanks for sharing, Stefan. Unfortunately, I don't think either of those is causing the issue here. Both the remote and local sites have T1 connections, each with a static IP.

I recently ran an fsck on the filesystem of the backup client in question. I ran it on the master client; any changes made by fsck would be replicated to the slave client via drbd, and only one client node (the master in the pacemaker cluster) is allowed to have the filesystem mounted at a time. Should I also run an fsck on the backup server?

The odd thing is that 5-10 other clients continue to back up without errors, while this one in particular exhibits what I described previously. It never finishes the first remote backup, where "first" can mean either the original backup after the server was installed or the first backup taken after backups have been paused for a week or two and then resumed. It isn't even stopped by the clientTimeout. However, if I physically move the remote backup server to the local site and take a full backup of the client in question, subsequent remote backups succeed after I move the backup server back to the remote location.

Please share any thoughts you might have on how to determine the cause and find a solution.

TIA

Scott

On 2/11/2011 1:15 AM, Stefan Peter wrote:
> Hi Scott
>
> On 11.02.2011 01:29, Scott Saunders wrote:
>> I let the most recent backup 'finish' on its own. It becomes a partial
>> backup in the host backup summary page with the following error:
>>
>> Read EOF:
>> Tried again: got 0 bytes
>> finish: removing in-process file path/to/filename.ext
>> Can't write 4 bytes to socket
>> Child is aborting
>> Done: 229002 files, 82767774899 bytes
>> Got fatal error during xfer (aborted by signal=PIPE)
>> Backup aborted by user signal
>
> I have had this problem several times already. In one case, it was caused by
> the remote ADSL line changing its IP address during the backup.
> Switching this line to a fixed IP fixed this for me. In all other cases,
> an fsck of the file system in question fixed the issue.
>
> Regards
>
> Stefan Peter
>
> --
> In theory there is no difference between theory and practice. In
> practice there is.

___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
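The single-mount constraint mentioned in this message (only the current master has the filesystem mounted) is something a script could check before a backup is attempted. Below is a minimal sketch in Python, assuming Linux-style /proc/mounts text. The mount point /srv/data, the sample lines, and the helper name are hypothetical and do not come from the thread.

```python
def is_mounted(mount_point: str, mounts_text: str) -> bool:
    """Return True if mount_point is a mount target in /proc/mounts-style text."""
    # Normalise a trailing slash so "/srv/data/" and "/srv/data" compare equal.
    target = mount_point.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field order: device, mount point, fs type, options, dump, pass.
        # Note: the kernel escapes spaces in paths as \040; not handled here.
        if len(fields) >= 2 and fields[1] == target:
            return True
    return False


# Hypothetical samples: the DRBD-backed filesystem is present on one node only.
SAMPLE_MASTER = "/dev/drbd0 /srv/data ext3 rw,relatime 0 0\nproc /proc proc rw 0 0\n"
SAMPLE_SLAVE = "proc /proc proc rw 0 0\n"

print(is_mounted("/srv/data", SAMPLE_MASTER))
print(is_mounted("/srv/data/", SAMPLE_SLAVE))
```

On a real node the text would come from reading /proc/mounts. A check like this could possibly be called from a pre-dump hook such as $Conf{DumpPreUserCmd}, but whether a failing hook actually skips the dump depends on the BackupPC version and configuration, so treat that part as an assumption.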
Re: [BackupPC-users] Issue with remote backup of server(s) over VPN after failover
Hi Scott

On 11.02.2011 01:29, Scott Saunders wrote:
> I let the most recent backup 'finish' on its own. It becomes a partial
> backup in the host backup summary page with the following error:
>
> Read EOF:
> Tried again: got 0 bytes
> finish: removing in-process file path/to/filename.ext
> Can't write 4 bytes to socket
> Child is aborting
> Done: 229002 files, 82767774899 bytes
> Got fatal error during xfer (aborted by signal=PIPE)
> Backup aborted by user signal

I have had this problem several times already. In one case, it was caused by the remote ADSL line changing its IP address during the backup. Switching this line to a fixed IP fixed this for me. In all other cases, an fsck of the file system in question fixed the issue.

Regards

Stefan Peter

--
In theory there is no difference between theory and practice. In
practice there is.
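One way to test the address-change explanation is to record the address the remote end presents at intervals during a backup and look for the points where it changes. The sketch below assumes such (timestamp, address) samples have already been collected by some other means. The function name and the sample data are invented for illustration, and the addresses are taken from the 203.0.113.0/24 documentation range.

```python
from typing import List, Tuple

Sample = Tuple[str, str]        # (timestamp, observed address)
Change = Tuple[str, str, str]   # (timestamp, old address, new address)


def address_changes(samples: List[Sample]) -> List[Change]:
    """Return every point at which the observed address differs from the previous sample."""
    changes: List[Change] = []
    previous = None
    for timestamp, address in samples:
        if previous is not None and address != previous:
            changes.append((timestamp, previous, address))
        previous = address
    return changes


# Invented samples: the address changes once between 02:00 and 03:00.
SAMPLES = [
    ("01:00", "203.0.113.10"),
    ("02:00", "203.0.113.10"),
    ("03:00", "203.0.113.27"),
    ("04:00", "203.0.113.27"),
]

for when, old, new in address_changes(SAMPLES):
    print(f"{when}: {old} -> {new}")
```

An empty result over the whole duration of a failed backup would argue against a changing address as the cause.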
Re: [BackupPC-users] Issue with remote backup of server(s) over VPN after failover
I let the most recent backup 'finish' on its own. It becomes a partial backup in the host backup summary page with the following error:

Read EOF:
Tried again: got 0 bytes
finish: removing in-process file path/to/filename.ext
Can't write 4 bytes to socket
Child is aborting
Done: 229002 files, 82767774899 bytes
Got fatal error during xfer (aborted by signal=PIPE)
Backup aborted by user signal

Note that this is well past the default clientTimeout of 72000 seconds (20 hours), and the in-process file it reports removing is only 114MB, so I don't think it was hanging on a large file.

Type     Filled  Level  Start Date    Duration/mins  Age/days
...
full     yes     0      12/22 20:00   205.4          49.9
full     yes     0      12/29 21:00   136.2          42.8
full     yes     0      1/5 21:00     336.4          35.8
incr     no      1      1/10 21:00    0.1            30.8
incr     no      1      1/11 22:01    0.1            29.8
partial  yes     0      1/28 02:00    17136.1        13.6

Looking a little further back, the results of the other node's partial backup are slightly different:

Remote[1]: rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(543) [sender=3.0.7]
Can't write 32780 bytes to socket
Read EOF: Connection reset by peer
Tried again: got 0 bytes
finish: removing in-process file path/to/filename.ext
Child is aborting
Done: 32547 files, 30060082211 bytes
Got fatal error during xfer (aborted by signal=PIPE)
Backup aborted by user signal

The file it choked on was only 25MB.

Has nobody else had one of their servers' remote backups never finish? The odd thing to me is that if I bring the remote backup server local to take a full backup of the server, subsequent remote backups of that server succeed. (Note that other servers run remote backups without these issues.)

Any help here is appreciated. Maybe I'm just overlooking something simple, but I haven't made any progress on this issue for some time, and I've searched the mailing list without finding a solution.

Could this be an issue with an older version? (We're running BackupPC version 3.0.0.)

Could this be related to TCP segmentation offload (set to 'on' for both backup client and backup server)?

Could it be a compatibility issue between rsync versions? The backup servers are running 2.6.9 (protocol version 29) and both of the clients are running 3.0.7 (protocol version 30). AFAIK the newer version would be backwards compatible, no?

Is this setup confusing? Have I explained the issue well enough?

Scott

On 2/7/2011 2:46 PM, Scott Saunders wrote:
> I've got a couple of servers running in a 2 node master/slave cluster
> using pacemaker(corosync)/drbd. Like other servers, I've got them
> configured to back up to a local BackupPC server as well as a remote
> (VPN over T1) BackupPC server (rsync over ssh for both). However, with
> the cluster, only the master node has the partition mounted that is to
> be backed up, so the backups for the slave node will always fail. This
> is ok, but maybe there is a better way to do this?
>
> Anyway, to get the backups started I brought the remote backup server
> local to take a full backup (because it is ~300GB). After a failover of
> the master node to the slave node, the slave becomes the new master,
> gets the partition mounted and thus has something to back up. The local
> backups work without a problem on the new master. The remote backups
> act like they are working on the new master, but never actually finish.
> I've let them go more than a week, which is well past the default
> client timeout, which has actually never taken effect with these two
> boxes. This erroneous behavior persists when failing back over to the
> original master.
>
> The only way I get the remote backups going again is to bring the
> remote server local for a full backup. Any subsequent remote backups
> work after this until a failover of the cluster occurs. Remote backups
> for other servers in the past have been performed without these issues.
>
> Any ideas as to why there are issues with the remote backup in this
> setup? And what might I try to get the backups running again on the
> master node after a failover without having to bring the remote server
> local every time?
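To put the numbers from the backup summary in this message side by side, the sketch below converts the quoted 72000-second clientTimeout to minutes, flags any listed backup that ran longer, and works out the average rate implied by the partial backup's "Done:" line. The helper names are made up for illustration. Two assumptions apply: the "Done:" byte count is taken as reported, and it may not equal the bytes actually sent over the link; and, as far as I understand the BackupPC documentation, ClientTimeout is measured against periods with no output from the transfer rather than total run time, which would be consistent with it never firing here.

```python
CLIENT_TIMEOUT_SECS = 72000  # default value quoted in the message

# (type, start date, duration in minutes), copied from the backup summary
BACKUPS = [
    ("full", "12/22 20:00", 205.4),
    ("full", "12/29 21:00", 136.2),
    ("full", "1/5 21:00", 336.4),
    ("incr", "1/10 21:00", 0.1),
    ("incr", "1/11 22:01", 0.1),
    ("partial", "1/28 02:00", 17136.1),
]


def exceeds_timeout(duration_mins: float, timeout_secs: int = CLIENT_TIMEOUT_SECS) -> bool:
    """True if the total run time is longer than the timeout value."""
    return duration_mins * 60 > timeout_secs


def average_rate(total_bytes: int, duration_mins: float) -> float:
    """Average bytes per second over the whole run."""
    return total_bytes / (duration_mins * 60)


over = [row for row in BACKUPS if exceeds_timeout(row[2])]
for kind, start, mins in over:
    print(f"{kind} backup started {start} ran {mins / 60 / 24:.1f} days")

# Byte count from the "Done:" line of the partial backup's log.
rate = average_rate(82767774899, 17136.1)
print(f"average rate: {rate / 1024:.1f} KiB/s")
```

Only the partial backup exceeds the timeout value, and its average rate gives a baseline to compare against what the VPN link can carry.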
[BackupPC-users] Issue with remote backup of server(s) over VPN after failover
I've got a couple of servers running in a 2 node master/slave cluster using pacemaker(corosync)/drbd. Like other servers, I've got them configured to back up to a local BackupPC server as well as a remote (VPN over T1) BackupPC server (rsync over ssh for both). However, with the cluster, only the master node has the partition mounted that is to be backed up, so the backups for the slave node will always fail. This is ok, but maybe there is a better way to do this?

Anyway, to get the backups started I brought the remote backup server local to take a full backup (because it is ~300GB). After a failover of the master node to the slave node, the slave becomes the new master, gets the partition mounted and thus has something to back up. The local backups work without a problem on the new master. The remote backups act like they are working on the new master, but never actually finish. I've let them go more than a week, which is well past the default client timeout, which has actually never taken effect with these two boxes. This erroneous behavior persists when failing back over to the original master.

The only way I get the remote backups going again is to bring the remote server local for a full backup. Any subsequent remote backups work after this until a failover of the cluster occurs. Remote backups for other servers in the past have been performed without these issues.

Any ideas as to why there are issues with the remote backup in this setup? And what might I try to get the backups running again on the master node after a failover without having to bring the remote server local every time?
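For a sense of scale, here is a back-of-envelope estimate of how long ~300GB would take over a T1 if every byte had to cross the link. It is only a sketch under stated assumptions: the nominal T1 line rate of 1.544 Mbit/s, "~300GB" read as decimal gigabytes, and no savings from rsync's delta transfer or compression. The function name and the efficiency parameter are illustrative, and nothing here is a measurement from the setup described.

```python
T1_BITS_PER_SEC = 1_544_000  # nominal T1 line rate; usable payload is somewhat lower


def transfer_days(size_bytes: float, link_bits_per_sec: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_bytes over a link running at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_sec * efficiency)
    return seconds / 86400


FULL_SIZE_BYTES = 300 * 10**9  # "~300GB" from the message, as decimal gigabytes

best_case = transfer_days(FULL_SIZE_BYTES, T1_BITS_PER_SEC)
with_overhead = transfer_days(FULL_SIZE_BYTES, T1_BITS_PER_SEC, efficiency=0.8)

print(f"{best_case:.1f} days at full line rate")
print(f"{with_overhead:.1f} days at 80% efficiency (assumed overhead)")
```

Under these assumptions a complete transfer takes longer than the week or so the remote backups were left running, so it may be worth checking how much data a remote backup actually has to send after a failover.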