[BackupPC-users] BackupPC 3.1.0 failing

2008-03-10 Thread Steen Eugen Poulsen
I've been running 3.1.0 since it was released. I finished setting 
things up a while ago and it ran rock solid for months, but then one day 
backups of some machines stopped working.


One Gentoo vserver host accounts for four of the machines: the Gentoo 
host OS plus Ubuntu, Debian and Gentoo vservers. BackupPC manages to 
back up the Ubuntu vserver, but not the host OS or the other two vservers.


It also fails on a remote internet server running Gentoo, but not on the 
other server, which runs Ubuntu.


The Debian vserver log file looks like this, and the error seems to be 
the same for all of the failing machines:


full backup started for directory /; updating partial #117
Running: /usr/bin/ssh -q -x -l root liferaft vserver debian exec 
/usr/bin/rsync --server --sender --numeric-ids --perms --owner --group 
-D --links --hard-links --times --block-size=2048 --recursive 
--checksum-seed=32761 --ignore-times . /

Xfer PIDs are now 23937
Got remote protocol 30
Negotiated protocol version 28
Checksum caching enabled (checksumSeed = 32761)
Sent exclude: /dev
Sent exclude: /exports
Sent exclude: /home
Sent exclude: /media
Sent exclude: /mnt
Sent exclude: /proc
Sent exclude: /pub
Sent exclude: /srv
Sent exclude: /sys
Sent exclude: /tmp
Sent exclude: /usr/portage
Sent exclude: /var/lock
Sent exclude: /var/run
Sent exclude: /var/tmp
Xfer PIDs are now 23937,23949
  create d 755   0/0    4096 .
  create d 755   0/0    4096 bin
  pool 755   0/0  688492 bin/bash
  pool 755   0/0   25216 bin/bunzip2
  pool 755   0/0   25216 bin/bzcat - bin/bunzip2
  pool   l 777   0/0       6 bin/bzcmp - bzdiff
  pool 755   0/0    2128 bin/bzdiff
  pool   l 777   0/0       6 bin/bzegrep - bzgrep
  pool 755   0/0    4874 bin/bzexe
  pool   l 777   0/0       6 bin/bzfgrep - bzgrep
  pool 755   0/0    3642 bin/bzgrep
  pool 755   0/0   25216 bin/bzip2 - bin/bunzip2
  pool 755   0/0    8064 bin/bzip2recover
  pool   l 777   0/0       6 bin/bzless - bzmore
  pool 755   0/0    1297 bin/bzmore
  pool 755   0/0   26860 bin/cat
  pool 755   0/0   45344 bin/chgrp
  pool 755   0/0   42744 bin/chmod
  pool 755   0/0   47356 bin/chown
  pool 755   0/0   69284 bin/cp
  pool 755   0/0   55052 bin/date
  pool 755   0/0   47852 bin/dd
  pool 755   0/0   45016 bin/df
  pool 755   0/0   92312 bin/dir
  pool 755   0/0    4428 bin/dmesg
  pool 755   0/0    8592 bin/dnsdomainname
  pool 755   0/0   24228 bin/echo
  pool 755   0/0   92436 bin/egrep
  pool 755   0/0   22120 bin/false
  pool 755   0/0   52880 bin/fgrep
  pool 755   0/0  100468 bin/grep
  same 755   0/0      61 bin/gunzip
  same 755   0/0    5864 bin/gzexe
  pool 755   0/0   53420 bin/gzip
  pool 755   0/0    8592 bin/hostname
  pool 755   0/0   12348 bin/kill
Read EOF:
Tried again: got 0 bytes
finish: removing in-process file bin/ln
Child is aborting
Parent read EOF from child: fatal error!
Done: 34 files, 1591016 bytes
Got fatal error during xfer (Child exited prematurely)
Backup aborted (Child exited prematurely)



This one died quickly, but the point at which it fails is completely 
random on all the machines. As "protocol 30" shows, I've upgraded rsync 
to 3.0.0 to see if I had a bad rsync 2.6.x (all the distros had upgraded 
to the same one) that for some reason only Ubuntu had fixed, but it 
still fails.


It sometimes aborts on a PIPE signal on some of the machines, and some 
logs have a "Can't write to socket" error.


The total randomness of what works and what's broken has me completely 
puzzled.
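
To compare hosts, it may help to tally which failure signature each log contains. The following is only a rough sketch, assuming each XferLOG has been saved as plain text; the function name `classify_log` and the signature labels are made up for illustration, and the patterns are copied from the log excerpts in this thread rather than from any BackupPC documentation.

```python
import re
from collections import Counter

# Failure signatures copied from the log excerpts in this thread.
# The label names on the left are hypothetical, chosen for this sketch.
SIGNATURES = {
    "child_exited": re.compile(r"Child exited prematurely"),
    "sigpipe": re.compile(r"aborted by signal=PIPE"),
    "socket_write": re.compile(r"Can't write \d+ bytes to socket"),
    "read_eof": re.compile(r"^Read EOF"),
}


def classify_log(text):
    """Count how many lines of one log match each failure signature."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python classify_log.py <saved-log-1> <saved-log-2> ...
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            print(path, dict(classify_log(handle.read())))
```

Running it over one saved log per host would show at a glance which hosts hit the "Child exited prematurely" case and which hit the signal=PIPE case.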


BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


Re: [BackupPC-users] BackupPC 3.1.0 failing

2008-03-10 Thread Steen Eugen Poulsen
incr backup started back to 2008-03-01 10:11:36 (backup #158) for 
directory /
Running: /usr/bin/ssh -q -x -l root dragonslair /usr/bin/rsync --server 
--sender --numeric-ids --perms --owner --group -D --links --hard-links 
--times --block-size=2048 --recursive --checksum-seed=32761 . /

Xfer PIDs are now 24098
Got remote protocol 29
Negotiated protocol version 28
Checksum caching enabled (checksumSeed = 32761)
Sent exclude: /dev
Sent exclude: /media
Sent exclude: /mnt
Sent exclude: /proc
Sent exclude: /pub
Sent exclude: /srv
Sent exclude: /sys
Sent exclude: /tmp
Sent exclude: /usr/portage
Sent exclude: /var/run
Sent exclude: /var/lock
Sent exclude: /var/tmp
Xfer PIDs are now 24098,24099
[ skipped 1865 lines ]
Unexpected call BackupPC::Xfer::RsyncFileIO->unlink(usr/lib/libslang.a)
[ skipped 1270 lines ]
Can't write 32772 bytes to socket
[ skipped 10 lines ]
Done: 0 files, 0 bytes
Got fatal error during xfer (aborted by signal=PIPE)
Backup aborted by user signal

I get the signal=PIPE failure on two machines.


The host OS and the remote Gentoo server both abort with signal=PIPE.

The Gentoo and Debian vservers both give the error from the first 
message. (But the Ubuntu vserver backs up fine.)



