Re: y didn't amanda report this as an error?
Jon LaBadie <[EMAIL PROTECTED]> writes:

> On Wed, Sep 24, 2003 at 01:54:49PM -0500, Deb Baddorf wrote:
> > From a client machine, the admin sent me this:
> >
> > Sep 24 02:45:32 daesrv /kernel: pid 7638 (gzip), uid 2: exited on signal 11 (core dumped)
> >
> > The above message shows gzip crashed on daesrv last night. It crashed
> > because there is a hardware problem on that machine, but since it was
> > probably part of an amanda backup that did not work as expected, I wanted
> > to be sure amanda had reported something about it to you. -client admin
> >
> > Amanda herself had reported a strange error in her mail report:
> >
> > daesrv.fna /usr lev 0 STRANGE
> > .
> > | DUMP: 33.76% done, finished in 1:20
> > ? sendbackup: index tee cannot write [Broken pipe]
>
> Note the problem was in making the index, not the backup.

But this is not related to a gzip. The gzip for the index always runs on the server. Instead this sounds like /tmp full, probably on the client.

> > | DUMP: Broken pipe
> > | DUMP: The ENTIRE dump is aborted.
> > ? index returned 1
> > ??error [/sbin/dump returned 3, compress got signal 11]? dumper: strange [missing size line from sendbackup]
> > ? dumper: strange [missing end line from sendbackup]
> > \

And this is a result of the failed gzip, as already mentioned.

Sven
Re: y didn't amanda report this as an error?
Phil Homewood wrote:

> Now that you mention it, I have had this, a couple of times in the
> last week. Am still trying to debug it, but:
>
> ??error [/bin/tar got signal 13, compress got signal 11]? dumper: strange [missing size line from sendbackup]

Turns out this also appears to be bad hardware, in case anyone's collecting responses.

> hammer /home 0 0 1568 --0:08 184.4 0:04 442.5

[left in to show that amanda still considers this a successful dump]

--
Phil Homewood, Systems Janitor, http://www.SnapGear.com
[EMAIL PROTECTED] Ph: +61 7 3435 2810 Fx: +61 7 3891 3630
SnapGear - Custom Embedded Solutions and Security Appliances
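The summary line quoted in this message can be checked mechanically. What follows is only a rough sketch, not anything Amanda ships. It assumes the leading columns of a dump-summary row are hostname, disk, level, original KB and output KB, and that an original size of 0 next to a non-zero output size is the visible trace of the "missing size line from sendbackup" condition. Both are assumptions drawn from the rows posted in this thread; the function names are made up for illustration.

```python
def parse_summary_line(line):
    """Split a dump-summary row into named fields.

    Assumed column order (not confirmed in this thread):
    hostname, disk, level, original KB, output KB, then timing columns.
    """
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("not a dump-summary row: %r" % line)
    host, disk, level, orig_kb, out_kb = fields[:5]
    return {
        "host": host,
        "disk": disk,
        "level": int(level),
        "orig_kb": int(orig_kb),
        "out_kb": int(out_kb),
    }


def looks_truncated(entry):
    """Heuristic: data reached the server but no original size was recorded."""
    return entry["orig_kb"] == 0 and entry["out_kb"] > 0


if __name__ == "__main__":
    # Rows copied from the reports quoted in this thread.
    rows = [
        "daesrv.fnal.gov /usr 0 0 3605024 -- 47:40 1260.7 12:35 4773.9",
        "hammer /home 0 0 1568 --0:08 184.4 0:04 442.5",
    ]
    for row in rows:
        entry = parse_summary_line(row)
        if looks_truncated(entry):
            print("suspect dump:", entry["host"], entry["disk"])
```

Both rows posted so far would be flagged. A row flagged this way is only a hint to go and look at the dump details, not proof of a bad backup.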
Re: y didn't amanda report this as an error?
At 09:39 AM 9/25/2003 -0400, Jean-Louis Martineau wrote:

> Hi Deb,
>
> Which release of amanda are you using?

server is running Amanda-2.4.3 on FreeBSD 4.7-RELEASE-p3
client is running Amanda-2.4.3b4 on FreeBSD 4.8-RELEASE i386

> amanda-2.4.4p1 will report a failed dump for this kind of error and
> reschedule a level 0 for the next day.

amadmin CONFIG due NODE DISK indicated the next level 0 wasn't scheduled for 7 days yet (I forced one)

> That was fixed on 2003-04-26.
>
> Jean-Louis
>
> On Wed, Sep 24, 2003 at 01:54:49PM -0500, Deb Baddorf wrote:
> > From a client machine, the admin sent me this:
> >
> > Sep 24 02:45:32 daesrv /kernel: pid 7638 (gzip), uid 2: exited on signal 11 (core dumped)
> >
> > The above message shows gzip crashed on daesrv last night. It crashed
> > because there is a hardware problem on that machine, but since it was
> > probably part of an amanda backup that did not work as expected, I wanted
> > to be sure amanda had reported something about it to you. -client admin
> >
> > Amanda herself had reported a strange error in her mail report:
> >
> > daesrv.fna /usr lev 0 STRANGE
> > .
> > | DUMP: 33.76% done, finished in 1:20
> > ? sendbackup: index tee cannot write [Broken pipe]
> > | DUMP: Broken pipe
> > | DUMP: The ENTIRE dump is aborted.
> > ? index returned 1
> > ??error [/sbin/dump returned 3, compress got signal 11]? dumper: strange [missing size line from sendbackup]
> > ? dumper: strange [missing end line from sendbackup]
> > \
> >
> > But it appears that she went ahead and stored the partial data on tape
> > anyway, and considered this a good level 0 backup. (admin config due
> > shows the next level 0 is 7 days away)
> >
> > daesrv.fnal.gov /usr 0 0 3605024 -- 47:40 1260.7 12:35 4773.9
> >
> > Why doesn't amanda recognize this as a failure?
> > Am I missing something that I should have noticed?
> > Or am I reading it wrong (the fact that "due" implies a level 0 was done)?
> >
> > Deb Baddorf
> > ---
> > Deb Baddorf [EMAIL PROTECTED] 840-2289
> > "Nobody told me that living happily ever after would be such hard work ..."
> > S. White
>
> --
> Jean-Louis Martineau email: [EMAIL PROTECTED]
> Departement IRO, Universite de Montreal
> C.P. 6128, Succ. CENTRE-VILLE Tel: (514) 343-6111 ext. 3529
> Montreal, Canada, H3C 3J7 Fax: (514) 343-5834

---
Deb Baddorf [EMAIL PROTECTED] 840-2289
"Nobody told me that living happily ever after would be such hard work ..."
S. White
Re: y didn't amanda report this as an error?
Hi Deb,

Which release of amanda are you using?

amanda-2.4.4p1 will report a failed dump for this kind of error and reschedule a level 0 for the next day.

That was fixed on 2003-04-26.

Jean-Louis

On Wed, Sep 24, 2003 at 01:54:49PM -0500, Deb Baddorf wrote:
> From a client machine, the admin sent me this:
>
> Sep 24 02:45:32 daesrv /kernel: pid 7638 (gzip), uid 2: exited on signal 11 (core dumped)
>
> The above message shows gzip crashed on daesrv last night. It crashed
> because there is a hardware problem on that machine, but since it was
> probably part of an amanda backup that did not work as expected, I wanted
> to be sure amanda had reported something about it to you. -client admin
>
> Amanda herself had reported a strange error in her mail report:
>
> daesrv.fna /usr lev 0 STRANGE
> .
> | DUMP: 33.76% done, finished in 1:20
> ? sendbackup: index tee cannot write [Broken pipe]
> | DUMP: Broken pipe
> | DUMP: The ENTIRE dump is aborted.
> ? index returned 1
> ??error [/sbin/dump returned 3, compress got signal 11]? dumper: strange [missing size line from sendbackup]
> ? dumper: strange [missing end line from sendbackup]
> \
>
> But it appears that she went ahead and stored the partial data on tape
> anyway, and considered this a good level 0 backup. (admin config due
> shows the next level 0 is 7 days away)
>
> daesrv.fnal.gov /usr 0 0 3605024 -- 47:40 1260.7 12:35 4773.9
>
> Why doesn't amanda recognize this as a failure?
> Am I missing something that I should have noticed?
> Or am I reading it wrong (the fact that "due" implies a level 0 was done)?
>
> Deb Baddorf
> ---
> Deb Baddorf [EMAIL PROTECTED] 840-2289
> "Nobody told me that living happily ever after would be such hard work ..."
> S. White

--
Jean-Louis Martineau email: [EMAIL PROTECTED]
Departement IRO, Universite de Montreal
C.P. 6128, Succ. CENTRE-VILLE Tel: (514) 343-6111 ext. 3529
Montreal, Canada, H3C 3J7 Fax: (514) 343-5834
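For anyone checking a batch of hosts against the release named in this reply, a comparison along the following lines may help. It is a sketch under stated assumptions: that 2.4.4p1 is the first release carrying the fix, that a "bN" suffix marks a beta that sorts before the plain release, and that a "pN" suffix marks a patch that sorts after it. Distribution suffixes such as "-2" are ignored, so a packaged build with a backported fix would be misjudged. The function names are made up for illustration.

```python
import re

# Release named in the reply above; treating it as the first fixed
# release is an assumption.
FIXED_IN = "2.4.4p1"

_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:([bp])(\d+))?")


def version_key(version):
    """Turn a string such as 'Amanda-2.4.3b4' into a sortable tuple."""
    match = _VERSION_RE.search(version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major, minor, micro, tag, num = match.groups()
    # Assumed ordering: beta < plain release < patch release.
    rank = {"b": -1, None: 0, "p": 1}[tag]
    return (int(major), int(minor), int(micro), rank, int(num or 0))


def has_fix(version):
    """True if the version sorts at or after the assumed fixed release."""
    return version_key(version) >= version_key(FIXED_IN)


if __name__ == "__main__":
    # Version strings as they were reported in this thread.
    for reported in ("Amanda-2.4.3", "Amanda-2.4.3b4", "2.4.4-2", "2.4.4p1"):
        print(reported, "->", "has fix" if has_fix(reported) else "predates fix")
```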
Re: y didn't amanda report this as an error?
Jon LaBadie wrote:

> There have been several reports showing failed pipes in
> the index stream for various reasons. I wonder if those also
> were reported as valid dumps.

Now that you mention it, I have had this, a couple of times in the last week. Am still trying to debug it, but:

sendbackup: start [hammer:/home level 0]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? sendbackup: index tee cannot write [Broken pipe]
? index returned 1
??error [/bin/tar got signal 13, compress got signal 11]? dumper: strange [missing size line from sendbackup]
? dumper: strange [missing end line from sendbackup]
[...]
hammer /home 0 0 1568 --0:08 184.4 0:04 442.5

The "listed incremental dir" shows:

-rw---1 backup backup 0 Sep 23 23:20 hammer_home_0.new

The filesystem in question is some 13Gb. Apparently the compress process is SEGVing, but I'm not seeing a core anywhere.

Amanda version is 2.4.4-2 (Debian package), server and client are the same machine. Not seeing this on any other boxes, and I have another Debian box with a very similar configuration working fine.

--
Phil Homewood, Systems Janitor, http://www.SnapGear.com
[EMAIL PROTECTED] Ph: +61 7 3435 2810 Fx: +61 7 3891 3630
SnapGear - Custom Embedded Solutions and Security Appliances
Re: y didn't amanda report this as an error?
On Wed, Sep 24, 2003 at 05:30:59PM -0500, Deb Baddorf wrote:
> At 03:36 PM 9/24/2003 -0400, Jon LaBadie wrote:
> >On Wed, Sep 24, 2003 at 01:54:49PM -0500, Deb Baddorf wrote:
> >> From a client machine, the admin sent me this:
> >>
> >> Sep 24 02:45:32 daesrv /kernel: pid 7638 (gzip), uid 2: exited on signal 11 (core dumped)
> >>
> >> The above message shows gzip crashed on daesrv last night. It crashed
> >> because there is a hardware problem on that machine, but since it was
> >> probably part of an amanda backup that did not work as expected, I wanted
> >> to be sure amanda had reported something about it to you. -client admin
> >>
> >> Amanda herself had reported a strange error in her mail report:
> >>
> >> daesrv.fna /usr lev 0 STRANGE
> >> .
> >> | DUMP: 33.76% done, finished in 1:20
> >> ? sendbackup: index tee cannot write [Broken pipe]
> >
> >Note the problem was in making the index, not the backup.
>
> Well, but the client was doing its own compressing. So when the
> gzipper failed, the whole backup failed. At only 33% finished.
> I just did a test amrestore (true, amrecover wouldn't touch it).
> Got about 1/3 the amount of data that ought to be on that disk.
> So I think it really did fail, but registered it as a successful level 0
> backup. :-(

Certainly sounds like a situation that amanda should not have recorded as a valid level 0.

Has anyone else noted this? There have been several reports showing failed pipes in the index stream for various reasons. I wonder if those also were reported as valid dumps.

--
Jon H. LaBadie [EMAIL PROTECTED]
JG Computing
4455 Province Line Road (609) 252-0159
Princeton, NJ 08540-4322 (609) 683-7220 (fax)
Re: y didn't amanda report this as an error?
At 03:36 PM 9/24/2003 -0400, Jon LaBadie wrote:

> On Wed, Sep 24, 2003 at 01:54:49PM -0500, Deb Baddorf wrote:
> > From a client machine, the admin sent me this:
> >
> > Sep 24 02:45:32 daesrv /kernel: pid 7638 (gzip), uid 2: exited on signal 11 (core dumped)
> >
> > The above message shows gzip crashed on daesrv last night. It crashed
> > because there is a hardware problem on that machine, but since it was
> > probably part of an amanda backup that did not work as expected, I wanted
> > to be sure amanda had reported something about it to you. -client admin
> >
> > Amanda herself had reported a strange error in her mail report:
> >
> > daesrv.fna /usr lev 0 STRANGE
> > .
> > | DUMP: 33.76% done, finished in 1:20
> > ? sendbackup: index tee cannot write [Broken pipe]
>
> Note the problem was in making the index, not the backup.

Well, but the client was doing its own compressing. So when the gzipper failed, the whole backup failed. At only 33% finished. I just did a test amrestore (true, amrecover wouldn't touch it). Got about 1/3 the amount of data that ought to be on that disk. So I think it really did fail, but registered it as a successful level 0 backup. :-(

> > | DUMP: Broken pipe
> > | DUMP: The ENTIRE dump is aborted.
> > ? index returned 1
> > ??error [/sbin/dump returned 3, compress got signal 11]? dumper: strange [missing size line from sendbackup]
> > ? dumper: strange [missing end line from sendbackup]
> > \
> >
> > But it appears that she went ahead and stored the partial data on tape
> > anyway, and considered this a good level 0 backup. (admin config due
> > shows the next level 0 is 7 days away)
> >
> > daesrv.fnal.gov /usr 0 0 3605024 -- 47:40 1260.7 12:35 4773.9
> >
> > Why doesn't amanda recognize this as a failure?
> > Am I missing something that I should have noticed?
> > Or am I reading it wrong (the fact that "due" implies a level 0 was done)?
>
> Did your report show it was "taped"? If so I suspect the backup is ok,
> but using amrecover with the index will be suspect/problematical.
>
> --
> Jon H. LaBadie [EMAIL PROTECTED]
> JG Computing
> 4455 Province Line Road (609) 252-0159
> Princeton, NJ 08540-4322 (609) 683-7220 (fax)
Re: y didn't amanda report this as an error?
On Wed, Sep 24, 2003 at 01:54:49PM -0500, Deb Baddorf wrote:
> From a client machine, the admin sent me this:
>
> Sep 24 02:45:32 daesrv /kernel: pid 7638 (gzip), uid 2: exited on signal 11 (core dumped)
>
> The above message shows gzip crashed on daesrv last night. It crashed
> because there is a hardware problem on that machine, but since it was
> probably part of an amanda backup that did not work as expected, I wanted
> to be sure amanda had reported something about it to you. -client admin
>
> Amanda herself had reported a strange error in her mail report:
>
> daesrv.fna /usr lev 0 STRANGE
> .
> | DUMP: 33.76% done, finished in 1:20
> ? sendbackup: index tee cannot write [Broken pipe]

Note the problem was in making the index, not the backup.

> | DUMP: Broken pipe
> | DUMP: The ENTIRE dump is aborted.
> ? index returned 1
> ??error [/sbin/dump returned 3, compress got signal 11]? dumper: strange [missing size line from sendbackup]
> ? dumper: strange [missing end line from sendbackup]
> \
>
> But it appears that she went ahead and stored the partial data on tape
> anyway, and considered this a good level 0 backup. (admin config due
> shows the next level 0 is 7 days away)
>
> daesrv.fnal.gov /usr 0 0 3605024 -- 47:40 1260.7 12:35 4773.9
>
> Why doesn't amanda recognize this as a failure?
> Am I missing something that I should have noticed?
> Or am I reading it wrong (the fact that "due" implies a level 0 was done)?

Did your report show it was "taped"? If so I suspect the backup is ok, but using amrecover with the index will be suspect/problematical.

--
Jon H. LaBadie [EMAIL PROTECTED]
JG Computing
4455 Province Line Road (609) 252-0159
Princeton, NJ 08540-4322 (609) 683-7220 (fax)
y didn't amanda report this as an error?
From a client machine, the admin sent me this:

Sep 24 02:45:32 daesrv /kernel: pid 7638 (gzip), uid 2: exited on signal 11 (core dumped)

The above message shows gzip crashed on daesrv last night. It crashed because there is a hardware problem on that machine, but since it was probably part of an amanda backup that did not work as expected, I wanted to be sure amanda had reported something about it to you. -client admin

Amanda herself had reported a strange error in her mail report:

daesrv.fna /usr lev 0 STRANGE
.
| DUMP: 33.76% done, finished in 1:20
? sendbackup: index tee cannot write [Broken pipe]
| DUMP: Broken pipe
| DUMP: The ENTIRE dump is aborted.
? index returned 1
??error [/sbin/dump returned 3, compress got signal 11]? dumper: strange [missing size line from sendbackup]
? dumper: strange [missing end line from sendbackup]
\

But it appears that she went ahead and stored the partial data on tape anyway, and considered this a good level 0 backup. (admin config due shows the next level 0 is 7 days away)

daesrv.fnal.gov /usr 0 0 3605024 -- 47:40 1260.7 12:35 4773.9

Why doesn't amanda recognize this as a failure?
Am I missing something that I should have noticed?
Or am I reading it wrong (the fact that "due" implies a level 0 was done)?

Deb Baddorf
---
Deb Baddorf [EMAIL PROTECTED] 840-2289
"Nobody told me that living happily ever after would be such hard work ..."
S. White
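Until the server itself treats a dump like this as failed, the mail report can be scanned for the phrases that appeared in the STRANGE section above. The following is a minimal sketch and not part of Amanda. The pattern list is an assumption based only on the report quoted in this message, and `suspicious_lines` is a made-up helper name.

```python
import re

# Phrases copied from the report quoted in this thread. Which phrases
# deserve an alert is an assumption, not an official Amanda list.
FAILURE_PATTERNS = [
    re.compile(r"got signal \d+"),
    re.compile(r"The ENTIRE dump is aborted"),
    re.compile(r"missing (size|end) line from sendbackup"),
    re.compile(r"returned [1-9]\d*"),
]


def suspicious_lines(report_text):
    """Return the report lines that match any failure pattern."""
    hits = []
    for line in report_text.splitlines():
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS):
            hits.append(line.strip())
    return hits


# Dump details as posted above (the trailing backslash line is left out).
SAMPLE = """\
| DUMP: 33.76% done, finished in 1:20
? sendbackup: index tee cannot write [Broken pipe]
| DUMP: Broken pipe
| DUMP: The ENTIRE dump is aborted.
? index returned 1
??error [/sbin/dump returned 3, compress got signal 11]? dumper: strange [missing size line from sendbackup]
? dumper: strange [missing end line from sendbackup]
"""

if __name__ == "__main__":
    for hit in suspicious_lines(SAMPLE):
        print("CHECK:", hit)
```

Any hit is a reason to test-restore that dump or force a new level 0 by hand, as was done here.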