Re: dumper5 pid 10813 is messed up, ignoring it

2005-06-17 Thread Paul Bijnens

Rebecca Pakish Crum wrote:

> > on 16.06.2005, 19:47 you wrote to amanda-users@amanda.org:
> >
> > > Here's what I have: A server running amanda-2.4.4p4 running on RHE3
> > > intel, and has been for a couple of years. A client running
> > > amanda-2.4.4p1 on Sol9, sparc.
>
> Yeah, I thought about that, but if that were the problem, wouldn't it be
> happening on my other 2.4.4p1 clients? I'm looking for a little bit more
> information on what the problem could possibly be. What causes the
> dumper to tank all of a sudden? If there's something else failing on
> this box, I'd kind of like to know.


It seems you get this message when a dumper sends garbage to the
driver (instead of one of a limited set of commands), or quits
suddenly.  The latter is the most probable.  If it dies, there could
be many reasons, some hardware related (e.g. bad RAM), some software
related (e.g. out of open file descriptors, out of swap space, etc.).
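
As a rough illustration of how two of the software causes above could be
checked on a Linux Amanda server, here is a minimal Python sketch.  It is
not part of Amanda; the function names are hypothetical, and it assumes
/proc/meminfo is available.  The limits it reports are those of the process
running the script, so it would have to be run as the Amanda user to say
anything about the dumper's environment.

```python
import resource


def parse_swap_kb(meminfo_text):
    """Return (SwapTotal, SwapFree) in kB from /proc/meminfo-style text.

    Either value is None if the corresponding line is absent.
    """
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            values[key] = int(rest.split()[0])
    return values.get("SwapTotal"), values.get("SwapFree")


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"open file limit: soft={soft} hard={hard}")
    try:
        with open("/proc/meminfo") as fh:
            total, free = parse_swap_kb(fh.read())
        print(f"swap: total={total} kB, free={free} kB")
    except OSError:
        # Assumption: only Linux exposes /proc/meminfo in this form.
        print("could not read /proc/meminfo")
```

A low soft limit or nearly exhausted swap would only be a hint, not proof,
that the dumper was killed for that reason.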

The fact that it only happened twice on the same host may have more to
do with the fact that it was trying to do a level 0 dump, which takes much
longer and so has a greater chance of being killed.

Note that it is the dumper on the Amanda server that fouled up, not the
client sending the dumps.


--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/  email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out  *
***




RE: dumper5 pid 10813 is messed up, ignoring it

2005-06-16 Thread Rebecca Pakish Crum
> Hello, Rebecca,
> 
> on 16.06.2005, 19:47 you wrote to amanda-users@amanda.org:
> 
> > Here's what I have: A server running amanda-2.4.4p4 running on RHE3 
> > intel, and has been for a couple of years. A client running 
> > amanda-2.4.4p1 on Sol9, sparc.
> 
> > Any suggestions?
> 
> Is there any reasonable chance of getting some more recent 
> version of AMANDA running on that client?
> 
> 2.4.4p1 is pretty old now, there have been loads of changes 
> and fixes since then ...
> 
> Rolling your new and shiny AMANDA-client shouldn't take you 
> longer than debugging the old one with gdb ...
> 
> Best regards,
> Stefan G. Weichinger.


Yeah, I thought about that, but if that were the problem, wouldn't it be
happening on my other 2.4.4p1 clients? I'm looking for a little bit more
information on what the problem could possibly be. What causes the
dumper to tank all of a sudden? If there's something else failing on
this box, I'd kind of like to know.



Re: dumper5 pid 10813 is messed up, ignoring it

2005-06-16 Thread sgw

Hello, Rebecca,

on 16.06.2005, 19:47 you wrote to amanda-users@amanda.org:

> Here's what I have: A server running amanda-2.4.4p4 running on RHE3
> intel, and has been for a couple of years. A client running
> amanda-2.4.4p1 on Sol9, sparc.

> Any suggestions?

Is there any reasonable chance of getting some more recent version of
AMANDA running on that client?

2.4.4p1 is pretty old now, there have been loads of changes and fixes
since then ...

Rolling your new and shiny AMANDA-client shouldn't take you longer
than debugging the old one with gdb ...

Best regards,
Stefan G. Weichinger.

mailto://[EMAIL PROTECTED]







dumper5 pid 10813 is messed up, ignoring it

2005-06-16 Thread Rebecca Pakish Crum






Hi all -


I've searched the archives and Googled the above error, and I found an old SourceForge archive post from JJ that offered some suggestions, but I still can't get a handle on what's going on.

Here's what I have: A server running amanda-2.4.4p4 running on RHE3 intel, and has been for a couple of years. A client running amanda-2.4.4p1 on Sol9, sparc.

Here's what's happening: I've isolated the problem to this client because it's the only one where all of the results are missing (well, almost all). I changed my disklist to include only this client, since all the others are backing up just fine. Basically the job just hangs and hangs and hangs, so I come in and manually kill dumper0-dumper5, and then the taper PIDs go away, too…then I finally get the email from amanda:

*** THE DUMPS DID NOT FINISH PROPERLY!


Ignore this - I didn't have a tape in the drive at the time I was trying this test…same with taper stats below:

*** A TAPE ERROR OCCURRED: [rewinding tape: Input/output error].
Some dumps may have been left in the holding disk.
Run amflush to flush them to tape.
The next tape Amanda expects to use is: uadaily02.

FAILURE AND STRANGE DUMP SUMMARY:
  smores.unt /usr RESULTS MISSING
  smores.unt /opt RESULTS MISSING
  smores.unt /export/home RESULTS MISSING
  smores.unt /var RESULTS MISSING



STATISTICS:
                              Total       Full      Daily
                           --------   --------   --------
Estimate Time (hrs:min)        0:01
Run Time (hrs:min)             0:01
Dump Time (hrs:min)            0:01       0:01       0:00
Output Size (meg)               2.3        2.3        0.0
Original Size (meg)             2.3        2.3        0.0
Avg Compressed Size (%)          --         --         --
Filesystems Dumped                1          1          0
Avg Dump Rate (k/s)            45.0       45.0         --

Tape Time (hrs:min)            0:00       0:00       0:00
Tape Size (meg)                 0.0        0.0        0.0
Tape Used (%)                   0.0        0.0        0.0
Filesystems Taped                 0          0          0
Avg Tp Write Rate (k/s)          --         --         --



NOTES:
  driver: dumper5 pid 10813 is messed up, ignoring it.
  driver: dumper0 pid 10808 is messed up, ignoring it.
  driver: dumper0 died while dumping smores.unterlaw.com:/export/home lev 0.



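DUMP SUMMARY:
                                   DUMPER STATS              TAPER STATS
HOSTNAME DISK         L ORIG-KB  OUT-KB COMP% MMM:SS   KB/s MMM:SS   KB/s
----------------------- ----------------------------------- -------------
smores.u /etc         0    2360    2360   --    0:52   45.0    N/A    N/A
smores.u /export/home   MISSING -----------------------------------------
smores.u /opt           MISSING -----------------------------------------
smores.u /usr           MISSING -----------------------------------------
smores.u /var           MISSING -----------------------------------------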


(brought to you by Amanda version 2.4.4p4)
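
For anyone who wants to pull the driver complaints out of a report like the one above automatically, here is a minimal Python sketch. It is not part of Amanda; the function name and the regular expressions are assumptions based only on the three NOTES lines quoted in this message, so other Amanda versions may word these lines differently.

```python
import re

# Patterns modelled on the NOTES lines quoted in this report (assumed format).
MESSED_UP = re.compile(r"driver: (dumper\d+) pid (\d+) is messed up")
DIED = re.compile(r"driver: (dumper\d+) died while dumping (\S+):(\S+) lev (\d+)")


def scan_notes(report_text):
    """Return (messed_up, died) lists extracted from an Amanda report.

    messed_up: list of (dumper_name, pid)
    died:      list of (dumper_name, host, disk, level)
    """
    messed_up = [(m.group(1), int(m.group(2)))
                 for m in MESSED_UP.finditer(report_text)]
    died = [(m.group(1), m.group(2), m.group(3), int(m.group(4)))
            for m in DIED.finditer(report_text)]
    return messed_up, died


if __name__ == "__main__":
    import sys

    # Usage: pipe the report email into the script on standard input.
    flagged, dead = scan_notes(sys.stdin.read())
    for name, pid in flagged:
        print(f"{name} (pid {pid}) was flagged by the driver")
    for name, host, disk, level in dead:
        print(f"{name} died dumping {host}:{disk} at level {level}")
```

Matching the flagged dumper names against the "died while dumping" lines shows which disk each one was working on when it went away.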


According to the email from JJ a couple of years ago, I tried to run amdump on just this client while running gdb in another window on the dumper pid. I've never used gdb before, but basically it just gave me a bunch of library connection stuff and then sat there after I typed "cont" at the (gdb) prompt.

Any suggestions?