Re: [BackupPC-users] Issue with remote backup of server(s) over VPN after failover

2011-02-14 Thread Scott Saunders
Thanks for sharing Stefan.

Unfortunately, I don't think either of those is the cause here. Both the
remote and local sites have T1 connections, each with a static IP. I
recently ran an fsck on the backup client's filesystem in question. I ran
it on the master client; any changes fsck made are replicated to the
slave client via drbd, because only one client node (the master in the
pacemaker cluster) is allowed to have the filesystem mounted at a time.
Should I also run an fsck on the backup server?

The odd thing is that 5-10 other clients continue to back up without
errors, while this one in particular behaves as I described earlier. It
never finishes its first remote backup, where "first" means either the
very first backup after the server was installed or the first backup
taken after backups have lapsed for a week or two and then resumed, and
it isn't even stopped by the clientTimeout. However, if I physically
move the remote backup server to the local site and take a full backup
of the client in question, subsequent remote backups succeed once the
remote backup server is back at the remote location.

Please share any thoughts on how to determine the cause and find a
solution. TIA

Scott

On 2/11/2011 1:15 AM, Stefan Peter wrote:
> Hi Scott
>
> Am 11.02.2011 01:29, schrieb Scott Saunders:
>> I let the most recent backup 'finish' on its own. It becomes a partial
>> backup in the host backup summary page with the following error:
>>
>> Read EOF:
>> Tried again: got 0 bytes
>> finish: removing in-process file path/to/filename.ext
>> Can't write 4 bytes to socket
>> Child is aborting
>> Done: 229002 files, 82767774899 bytes
>> Got fatal error during xfer (aborted by signal=PIPE)
>> Backup aborted by user signal
>>
> I have had this problem several times already. In one case, it was
> caused by the remote ADSL line changing its IP address during the
> backup. Switching this line to a fixed IP fixed this for me. In all
> other cases, an fsck of the file system in question fixed the issue.
>
> Regards
>
> Stefan Peter
>
> --
> In theory there is no difference between theory and practice. In
> practice there is.

--
The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
Pinpoint memory and threading errors before they happen.
Find and fix more than 250 security defects in the development cycle.
Locate bottlenecks in serial and parallel code that limit performance.
http://p.sf.net/sfu/intel-dev2devfeb
___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


Re: [BackupPC-users] Issue with remote backup of server(s) over VPN after failover

2011-02-11 Thread Stefan Peter
Hi Scott

Am 11.02.2011 01:29, schrieb Scott Saunders:
> I let the most recent backup 'finish' on its own. It becomes a partial
> backup in the host backup summary page with the following error:
> 
> Read EOF: 
> Tried again: got 0 bytes
> finish: removing in-process file path/to/filename.ext
> Can't write 4 bytes to socket
> Child is aborting
> Done: 229002 files, 82767774899 bytes
> Got fatal error during xfer (aborted by signal=PIPE)
> Backup aborted by user signal
> 

I have had this problem several times already. In one case, it was
caused by the remote ADSL line changing its IP address during the
backup. Switching this line to a fixed IP fixed this for me. In all
other cases, an fsck of the file system in question fixed the issue.

Regards

Stefan Peter

--
In theory there is no difference between theory and practice. In
practice there is.


Re: [BackupPC-users] Issue with remote backup of server(s) over VPN after failover

2011-02-10 Thread Scott Saunders
I let the most recent backup 'finish' on its own. It becomes a partial 
backup in the host backup summary page with the following error:


Read EOF:
Tried again: got 0 bytes
finish: removing in-process file path/to/filename.ext
Can't write 4 bytes to socket
Child is aborting
Done: 229002 files, 82767774899 bytes
Got fatal error during xfer (aborted by signal=PIPE)
Backup aborted by user signal

Note that this is well past the default clientTimeout of 72000 seconds
(20 hours), and the in-process file it reported removing is only 114MB,
so I don't think it was hanging on a large file.


Type     Filled  Level  Start Date   Duration/mins  Age/days
...
full     yes     0      12/22 20:00  205.4          49.9
full     yes     0      12/29 21:00  136.2          42.8
full     yes     0      1/5 21:00    336.4          35.8
incr     no      1      1/10 21:00   0.1            30.8
incr     no      1      1/11 22:01   0.1            29.8
partial  yes     0      1/28 02:00   17136.1        13.6
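
To put the durations above next to the timeout, here is a minimal Python
sketch. It only compares total runtime with the 72000-second default
mentioned earlier. Whether BackupPC applies the timeout to total runtime
or only to periods with no output from the transfer is an assumption to
check against the BackupPC documentation; if it is the latter, a slow
but live transfer would never trip it. The names below are illustrative.

```python
# Default clientTimeout quoted in the message above, in seconds.
CLIENT_TIMEOUT_SECS = 72000

# (type, start date, duration in minutes), copied from the summary above.
BACKUPS = [
    ("full", "12/22 20:00", 205.4),
    ("full", "12/29 21:00", 136.2),
    ("full", "1/5 21:00", 336.4),
    ("incr", "1/10 21:00", 0.1),
    ("incr", "1/11 22:01", 0.1),
    ("partial", "1/28 02:00", 17136.1),
]


def overruns(backups, timeout_secs):
    """Return (type, start, ratio) for each backup that outran the timeout."""
    timeout_mins = timeout_secs / 60
    return [
        (kind, start, duration / timeout_mins)
        for kind, start, duration in backups
        if duration > timeout_mins
    ]


for kind, start, ratio in overruns(BACKUPS, CLIENT_TIMEOUT_SECS):
    print(f"{kind} backup started {start} ran {ratio:.1f}x the timeout")
```

Only the partial backup is flagged; every completed full finished well
inside the 1200-minute window.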


Looking a little further in the past, the results of the other node's 
partial backup are a little bit different:


Remote[1]: rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(543) [sender=3.0.7]
Can't write 32780 bytes to socket
Read EOF: Connection reset by peer
Tried again: got 0 bytes
finish: removing in-process file path/to/filename.ext
Child is aborting
Done: 32547 files, 30060082211 bytes
Got fatal error during xfer (aborted by signal=PIPE)
Backup aborted by user signal

The file it choked on was only 25MB.
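
As a rough sanity check on the two "Done:" lines, the Python sketch
below estimates the minimum time needed to push that many bytes over a
T1 at its nominal 1.544 Mbit/s. It assumes the byte count reflects data
that actually crossed the link, which may not hold when rsync finds
matching files on the server, and it ignores VPN and ssh overhead as
well as compression. Function names are illustrative.

```python
import re

# Nominal T1 line rate in bits per second; usable throughput will be lower.
T1_BITS_PER_SEC = 1_544_000

DONE_RE = re.compile(r"Done: (\d+) files, (\d+) bytes")


def parse_done(line):
    """Extract (file count, byte count) from a 'Done:' log line."""
    match = DONE_RE.search(line)
    if match is None:
        raise ValueError(f"not a Done line: {line!r}")
    return int(match.group(1)), int(match.group(2))


def min_days_over_link(num_bytes, bits_per_sec=T1_BITS_PER_SEC):
    """Lower bound on transfer time in days if every byte crosses the link."""
    return num_bytes * 8 / bits_per_sec / 86400


for line in (
    "Done: 229002 files, 82767774899 bytes",
    "Done: 32547 files, 30060082211 bytes",
):
    files, size = parse_done(line)
    days = min_days_over_link(size)
    print(f"{files} files, {size / 1e9:.1f} GB: at least {days:.1f} days")
```

If the estimate for a full transfer is in the range of days, it may be
worth comparing it with how long the aborted backups ran before reading
too much into the individual file that was in flight.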

Has nobody else had one of their servers' remote backups never finish?
The odd thing to me is that if I bring the remote backup server local
and take a full backup of the server, subsequent remote backups of that
server succeed. (Other servers run remote backups without these
issues.) Any help is appreciated. Maybe I'm just overlooking something
simple, but I haven't made any progress on this for some time, and
searching the mailing list hasn't turned up a solution.


Could this be an issue with an older version (we're running BackupPC
3.0.0)? Could it be related to TCP segmentation offload (set to 'on' on
both the backup client and the backup server)? Could it be a
compatibility issue between rsync versions? The backup servers are
running rsync 2.6.9 (protocol version 29) and both clients are running
3.0.7 (protocol version 30). AFAIK the newer version should be backwards
compatible, no? Is this setup confusing? Have I explained the issue well
enough?
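
For the segmentation offload question, one way to record the setting on
each box is to parse the text printed by `ethtool -k <interface>`. The
Python sketch below is a hedged example: the sample text is illustrative
and was not captured from the machines in this thread, the interface
name eth0 is an assumption, and the parser accepts both the spaced and
the hyphenated feature names that different ethtool versions print.

```python
def offload_settings(ethtool_output):
    """Parse `ethtool -k` style text into {feature name: True/False}."""
    settings = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        state = words[0] if words else ""
        if sep and state in ("on", "off"):
            # Normalise "tcp segmentation offload" to "tcp-segmentation-offload".
            key = name.strip().lower().replace(" ", "-")
            settings[key] = state == "on"
    return settings


# Illustrative sample only; real output varies by ethtool version and driver.
SAMPLE = """\
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
"""

tso = offload_settings(SAMPLE).get("tcp-segmentation-offload")
print("TSO enabled" if tso else "TSO disabled or not reported")
```

If the flag is on for both ends, temporarily turning it off with
`ethtool -K eth0 tso off` and retrying a remote backup would be one way
to rule it in or out.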


Scott



[BackupPC-users] Issue with remote backup of server(s) over VPN after failover

2011-02-07 Thread Scott Saunders
I've got a couple of servers running in a 2-node master/slave cluster
using pacemaker (corosync) and drbd. Like our other servers, they are
configured to back up to a local BackupPC server as well as a remote
BackupPC server (VPN over T1), using rsync over ssh for both. However,
with the cluster, only the master node has the partition to be backed up
mounted, so backups of the slave node will always fail. This is ok, but
maybe there is a better way to do this?

Anyway, to get the backups started I brought the remote backup server
local to take a full backup (because there is ~300GB to transfer). After
a failover from the master node to the slave node, the slave becomes the
new master, gets the partition mounted, and thus has something to back
up. The local backups work without a problem on the new master. The
remote backups act like they are working on the new master, but never
actually finish. I've let them run for more than a week, which is well
past the default client timeout; that timeout has never actually taken
effect with these two boxes. This behavior persists after failing back
over to the original master.

The only way I can get the remote backups going again is to bring the
remote server local for a full backup. Subsequent remote backups then
work until the next failover of the cluster. Remote backups of other
servers have been performed in the past without these issues. Any ideas
as to why the remote backup has problems in this setup? And what might I
try to get the backups running again on the master node after a failover
without having to bring the remote server local every time?
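
On the question of a better way to handle the node that does not have
the partition mounted, one option might be a small check that reports
whether the data partition is mounted on a node before a backup is
attempted. The Python sketch below parses text in /proc/mounts format.
The device name and the /srv/data mount point are hypothetical, and
wiring the check into BackupPC (for example through a pre-dump command
such as DumpPreUserCmd) is an assumption that would need verifying on
3.0.0.

```python
def is_mounted(mount_point, mounts_text):
    """Return True if mount_point is a mount target in /proc/mounts text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False


# Illustrative /proc/mounts content; the device and paths are hypothetical.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw,relatime 0 0
/dev/drbd0 /srv/data ext3 rw,relatime 0 0
"""

if is_mounted("/srv/data", SAMPLE_MOUNTS):
    print("active node: partition is mounted")
else:
    print("passive node: nothing to back up")
```

On a real node the text would come from reading /proc/mounts, and the
script could exit non-zero on the passive node so that the failed check
is distinguishable from a genuine backup error.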
