Re: Multi-tape span failure

2007-10-31 Thread Tom Hansen

Jon LaBadie wrote:

> On Tue, Oct 30, 2007 at 11:31:53PM -0500, Tom Hansen wrote:
>> BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu Linux 6.10,
>> configured to back up several large (300 GB+) filesystems spanning
>> several tapes.  I have a robot changer, LTO1 tapes (100 GB capacity) and
>> I used:
>>
>>    tape_splitsize 3Gb
>>    fallback_splitsize 256m
>>
>> [ stuff deleted ]
>> MY QUESTION:  Is there any way to configure Amanda so that such a tape
>> error would simply go to the next tape, instead of the worst possible
>> action, which is to abort the whole job?
>>
>> Short of that, is there any way Amanda could start up from where it left
>> off?





> Short answer: no.  If the backups are in a holding disk they can
> still be flushed to tape, but a backup cannot be resumed.
>
> Something in your report is amiss.  If Amanda had successfully
> used 6 tapes, it would have completed backing up and taping
> one or more of your 300 GB DLEs.  There is no reason a failed
> tape after that would invalidate those backups.  And your
> report (emailed or available with amreport) would show that.
  


Following is the report.  It clearly says "FAILED" for all 4 filesystems 
under "FAILURE AND STRANGE DUMP SUMMARY" and sure enough, I could not 
see any files using "amrecover". (I have done a test using one small 
filesystem, and amrecover did work in that case, so I'm pretty confident 
that my setup is good.)


I did just notice that, at the very bottom, it does not indicate failure 
for the two filesystems that were complete.  I'm not sure what to make 
of that.


Thanks for your comments.  (Oh, and BTW, I was totally wrong about the
dump time; it was more like 20 hours.)


-Tom





Hostname: waterbase
Org : GLWI
Config  : fullback
Date: October 29, 2007

These dumps were to tapes GLWIBACK-001, GLWIBACK-002, GLWIBACK-003, 
GLWIBACK-004, GLWIBACK-005, GLWIBACK-006.

*** A TAPE ERROR OCCURRED: [No more writable valid tape found].
Some dumps may have been left in the holding disk.
Run amflush to flush them to tape.
The next 9 tapes Amanda expects to use are: 9 new tapes.

FAILURE AND STRANGE DUMP SUMMARY:
 waterbase.uwm.edu  /media/raid2  lev 0  FAILED [out of tape]
 waterbase.uwm.edu  /media/raid2  lev 0  FAILED [data write: Broken pipe]
 waterbase.uwm.edu  /             lev 0  FAILED [can't switch to incremental dump]
 waterbase.uwm.edu  /media/raid2  lev 0  FAILED [dump to tape failed]


STATISTICS:
                              Total       Full      Incr.
Estimate Time (hrs:min)        1:00
Run Time (hrs:min)            20:06
Dump Time (hrs:min)           16:25      16:25       0:00
Output Size (meg)          690435.5   690435.5        0.0
Original Size (meg)        690351.3   690351.3        0.0
Avg Compressed Size (%)         --         --         --
Filesystems Dumped                2          2          0
Avg Dump Rate (k/s)         11966.4    11966.4         --

Tape Time (hrs:min)           16:14      16:14       0:00
Tape Size (meg)            690435.5   690435.5        0.0
Tape Used (%)                 665.3      665.3        0.0
Filesystems Taped                 2          2          0

Chunks Taped                   3121       3121          0
Avg Tp Write Rate (k/s)     12093.4    12093.4         --

USAGE BY TAPE:
 Label           Time        Size      %   Nb    Nc
 GLWIBACK-001    3:01  130531776K  122.8    0   498
 GLWIBACK-002    3:10  135774016K  127.7    0   518
 GLWIBACK-003    3:01  123874432K  116.5    1   473
 GLWIBACK-004    3:05  143113152K  134.6    0   546
 GLWIBACK-005    2:56  124765312K  117.4    0   476
 GLWIBACK-006    3:38  159734400K  150.3    1   610


FAILED AND STRANGE DUMP DETAILS:

/--  waterbase.uwm.edu /media/raid2 lev 0 FAILED [data write: Broken pipe]
sendbackup: start [waterbase.uwm.edu:/media/raid2 level 0]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/tar -xpGf - ...
sendbackup: info end
| gtar: ./mysql_trans/mysql.sock: socket ignored
\


NOTES:
 planner: Adding new disk waterbase.uwm.edu:/.
 planner: Adding new disk waterbase.uwm.edu:/media/raid0.
 planner: Adding new disk waterbase.uwm.edu:/media/raid1.
 planner: Adding new disk waterbase.uwm.edu:/media/raid2.
 taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid1.0 in-memory
 taper: tape GLWIBACK-001 kb 130547712 fm 499 writing file: short write
 taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 130547712kb mark: [writing file: short write]
 taper: tape GLWIBACK-002 kb 135895488 fm 519 writing file: short write
 taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 266338304kb mark: [writing file: short write]
 taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid0.0 in-memory
 taper: tape GLWIBACK-003 kb 124064672 fm 474 writing file: short write
 taper: continuing
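
As a cross-check on the report above, the USAGE BY TAPE rows can be tallied against the STATISTICS block and the taper notes. The Python below is only a sketch: the numbers are copied from the report, and the function names are made up for illustration. It checks three things: whether the per-tape chunk counts (Nc) add up to "Chunks Taped", how the summed per-tape sizes compare with "Tape Size", and whether the "continuing ... from 130547712kb mark" note falls on a whole number of 262144 KB fallback chunks. Any interpretation of a mismatch (for example, that the extra data belongs to the /media/raid2 dump that failed partway) is a guess, not something the report states.

```python
# Rows copied from the USAGE BY TAPE table:
# (label, size in KB, percent of tape used, chunk count Nc)
USAGE_BY_TAPE = [
    ("GLWIBACK-001", 130531776, 122.8, 498),
    ("GLWIBACK-002", 135774016, 127.7, 518),
    ("GLWIBACK-003", 123874432, 116.5, 473),
    ("GLWIBACK-004", 143113152, 134.6, 546),
    ("GLWIBACK-005", 124765312, 117.4, 476),
    ("GLWIBACK-006", 159734400, 150.3, 610),
]

# Values copied from the STATISTICS block and the taper notes.
CHUNKS_TAPED = 3121
TAPE_SIZE_MEG = 690435.5
FALLBACK_SPLIT_KB = 262144      # "using fallback split size of 262144kb"
FIRST_TAPE_MARK_KB = 130547712  # "on new tape from 130547712kb mark"


def total_chunks(rows):
    """Sum of the Nc column."""
    return sum(nc for _label, _kb, _pct, nc in rows)


def total_size_meg(rows):
    """Sum of the Size column, converted from KB to MB."""
    return sum(kb for _label, kb, _pct, _nc in rows) / 1024


def implied_tape_length_meg(rows):
    """Tape length (MB) implied by each row: size divided by usage fraction."""
    return [kb / (pct / 100) / 1024 for _label, kb, pct, _nc in rows]


def whole_chunks_at_mark(mark_kb, split_kb):
    """Return (whole chunks, leftover KB) for a 'continuing from' mark."""
    return divmod(mark_kb, split_kb)


if __name__ == "__main__":
    print("chunks on tapes:", total_chunks(USAGE_BY_TAPE), "reported:", CHUNKS_TAPED)
    print("MB on tapes:", total_size_meg(USAGE_BY_TAPE), "reported:", TAPE_SIZE_MEG)
    print("implied tape lengths (MB):",
          [round(x) for x in implied_tape_length_meg(USAGE_BY_TAPE)])
    print("chunks before first tape change:",
          whole_chunks_at_mark(FIRST_TAPE_MARK_KB, FALLBACK_SPLIT_KB))
```

The chunk counts agree with the statistics, while the summed per-tape sizes come out larger than the reported "Tape Size", so the two parts of the report are evidently not counting the same set of data.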

Re: Multi-tape span failure

2007-10-31 Thread Tom Hansen





Ian Turner wrote:

> Tom,
>
> What is runtapes set to?
>
> --Ian

  


The runtapes parameter is set to 25.

-Tom






> On Wednesday 31 October 2007 00:31:53 Tom Hansen wrote:
>> BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu Linux 6.10,
>> configured to back up several large (300 GB+) filesystems spanning
>> several tapes.  I have a robot changer, LTO1 tapes (100 GB capacity) and
>> I used:
>>
>> tape_splitsize 3Gb
>> fallback_splitsize 256m
>>
>> (An unrelated issue: I couldn't get split_diskbuffer to have any
>> effect, so the chunks were all 256 MB.  No big deal, it was not a
>> bottleneck.)
>>
>> After much time configuring, everything seemed to be working properly,
>> and on my first big run, it successfully spanned six tapes and was
>> nearly finished.  Then it grabbed tape 7, which I had inadvertently left
>> in "write protect" mode.  Unfortunately, at this point Amanda completely
>> aborted the entire 800+ GB backup and left nothing in the index, thus
>> completely wasting 7+ hours of backup time.
>>
>> This behavior is unexpected and bad.  What if a tape simply goes bad
>> during a run? If I'm running 7 or 8 tapes each backup, I don't want to
>> lose the whole thing if there's an error on the last tape!
>>
>> I _thought_ that Amanda was programmed to simply go to the next tape
>> when a tape error occurs.  In this case, if Amanda _had_ gone to the
>> next tape, it could have completed the job, since tape 8 was a good tape.
>>
>> MY QUESTION:  Is there any way to configure Amanda so that such a tape
>> error would simply go to the next tape, instead of the worst possible
>> action, which is to abort the whole job?
>>
>> Short of that, is there any way Amanda could start up from where it left
>> off?
>>
>> Thanks.
>>
>> --
>> Tom Hansen
>> Senior Information Processing Consultant
>> Great Lakes WATER Institute
>> tomh -at- uwm.edu
>> www.glwi.uwm.edu




--
Tom Hansen
Senior Information Processing Consultant
UWM Great Lakes WATER Institute
www.glwi.uwm.edu




Multi-tape span failure

2007-10-30 Thread Tom Hansen


BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu Linux 6.10,
configured to back up several large (300 GB+) filesystems spanning
several tapes.  I have a robot changer, LTO1 tapes (100 GB capacity) and
I used:


   tape_splitsize 3Gb
   fallback_splitsize 256m

(An unrelated issue: I couldn't get split_diskbuffer to have any
effect, so the chunks were all 256 MB.  No big deal, it was not a
bottleneck.)
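
To put the two split sizes in perspective, here is a rough Python sketch of the arithmetic. It assumes (this is not stated in the message) that a chunk cut off by end-of-tape is rewritten from its beginning on the next tape, so that at most one chunk's worth of tape is wasted per tape change. The helper names are hypothetical, and the 300 GB and 100 GB figures are simply the round numbers quoted above.

```python
import math

MB_PER_GB = 1024


def chunk_count(dump_mb, split_mb):
    """Number of chunks needed to write a dump of dump_mb with the given split size."""
    return math.ceil(dump_mb / split_mb)


def worst_case_waste_fraction(split_mb, tape_mb):
    """Largest share of one tape that a single abandoned chunk could occupy.

    Assumption: a chunk interrupted by end-of-tape is restarted on the next tape.
    """
    return split_mb / tape_mb


if __name__ == "__main__":
    dump_mb = 300 * MB_PER_GB   # one "300 GB+" filesystem, rounded down
    tape_mb = 100 * MB_PER_GB   # nominal LTO1 capacity
    for name, split_mb in (
        ("fallback_splitsize 256m", 256),
        ("tape_splitsize 3Gb", 3 * MB_PER_GB),
    ):
        print(
            name,
            "chunks:", chunk_count(dump_mb, split_mb),
            "worst-case waste per tape:",
            "{:.2%}".format(worst_case_waste_fraction(split_mb, tape_mb)),
        )
```

Under that assumption, the smaller fallback size costs more chunks per dump but wastes less tape at each tape change, which fits the remark that it was not a bottleneck.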


After much time configuring, everything seemed to be working properly,
and on my first big run, it successfully spanned six tapes and was
nearly finished.  Then it grabbed tape 7, which I had inadvertently left
in "write protect" mode.  Unfortunately, at this point Amanda completely
aborted the entire 800+ GB backup and left nothing in the index, thus
completely wasting 7+ hours of backup time.


This behavior is unexpected and bad.  What if a tape simply goes bad 
during a run? If I'm running 7 or 8 tapes each backup, I don't want to 
lose the whole thing if there's an error on the last tape!


I _thought_ that Amanda was programmed to simply go to the next tape 
when a tape error occurs.  In this case, if Amanda _had_ gone to the 
next tape, it could have completed the job, since tape 8 was a good tape.


MY QUESTION:  Is there any way to configure Amanda so that such a tape
error would simply go to the next tape, instead of the worst possible
action, which is to abort the whole job?


Short of that, is there any way Amanda could start up from where it left 
off?


Thanks.

--
Tom Hansen
Senior Information Processing Consultant
Great Lakes WATER Institute
tomh -at- uwm.edu
www.glwi.uwm.edu