A strange thing happened to me over the past few days. I started a remote
dump of about 66GB, which normally takes about 30-35 hours. However, I
noticed on Sunday that it was crawling at less than 200kbps, apparently
due to some temporary network issues. I killed the amdump process and ran
amcleanup, which exited with some errors that I don't recall exactly. I
then received the Amanda email report I expected, showing the 3 small
file systems that made it through the dump and the expected missing
results for the huge (60GB+) one that didn't.
I waited until the network issues cleared up and started a new dump on a
new virtual tape. It appears that amcleanup had deleted that tape from the
tapelist, so I manually re-entered it with the presumed date that it was
last used.
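
For reference, a minimal sketch of what a hand-written tapelist line might look like. It assumes the usual three-field layout described in Amanda's tapelist documentation (datestamp, label, reuse flag); the helper name `tapelist_entry` is hypothetical, and the FULL3 / 20100221 values are only illustrative, borrowed from the taper notes in the report below. Check the layout against an existing line in your own tapelist before relying on it.

```python
def tapelist_entry(datestamp: str, label: str, reusable: bool = True) -> str:
    """Build one tapelist line in the assumed '<datestamp> <label> <flag>' layout."""
    if not datestamp.isdigit():
        raise ValueError("datestamp should contain digits only, e.g. 20100221")
    if not label or " " in label:
        raise ValueError("label should be a single word, e.g. FULL3")
    flag = "reuse" if reusable else "no-reuse"
    return f"{datestamp} {label} {flag}"


# Illustrative values only; the real date is whatever the tape was last used.
print(tapelist_entry("20100221", "FULL3"))
```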
I soon got an AMANDA MAIL REPORT like this:
*** A TAPE ERROR OCCURRED: [No writable valid tape found].
Some dumps may have been left in the holding disk.
Run amflush to flush them to tape.
The next tape Amanda expects to use is: a new tape.
FAILURE AND STRANGE DUMP SUMMARY:
host.com /dev/mirror/gm0s1g lev 0 FAILED [can't switch to incremental dump]
host.com /dev/mirror/gm0s1e lev 1 FAILED [service /usr/local/libexec/amanda/sendbackup failed: pid 46130 exited with code 1]
host.com /dev/mirror/gm0s1e lev 1 FAILED [cannot read header: got 0 instead of 32768]
host.com /dev/mirror/gm0s1a lev 1 FAILED [cannot read header: got 0 instead of 32768]
host.com /dev/mirror/gm0s1e lev 1 FAILED [cannot read header: got 0 instead of 32768]
host.com /dev/mirror/gm0s1e lev 1 FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
host.com /dev/mirror/gm0s1a lev 1 was successfully retried
STATISTICS:
                          Total       Full      Incr.
                        --------   --------   --------
Estimate Time (hrs:min)    0:02
Run Time (hrs:min)         0:09
Dump Time (hrs:min)        0:04       0:00       0:04
Output Size (meg)          77.7        0.0       77.7
Original Size (meg)       251.5        0.0      251.5
Avg Compressed Size (%)    30.9        --        30.9   (level:#disks ...)
Filesystems Dumped            2          0          2   (1:2)
Avg Dump Rate (k/s)       352.1        --       352.1
Tape Time (hrs:min)        0:00       0:00       0:00
Tape Size (meg)             0.0        0.0        0.0
Tape Used (%)               0.0        0.0        0.0
Filesystems Taped             0          0          0
Chunks Taped                  0          0          0
Avg Tp Write Rate (k/s)     --         --         --
FAILED AND STRANGE DUMP DETAILS:
/-- host.com /dev/mirror/gm0s1e lev 1 FAILED [service
/usr/local/libexec/amanda/sendbackup failed: pid 46130 exited with code 1]
\--------
NOTES:
planner: tapecycle (4) <= runspercycle (28)
driver: WARNING: This is not the first amdump run today. Enable the
usetimestamps option in the configuration file if you want to run amdump
more than once per calendar day.
planner: Last full dump of host.com:/dev/mirror/gm0s1a on tape
overwritten in 1 run.
planner: Last full dump of host.com:/dev/mirror/gm0s1e on tape
overwritten in 1 run.
planner: Last full dump of host.com:/dev/mirror/gm0s1f on tape
overwritten in 1 run.
planner: Last full dump of host.com:/dev/mirror/gm0s1g on tape FULL2
overwritten in 3 runs.
planner: Preventing bump of host.com:/dev/mirror/gm0s1g as directed.
taper: slot 2: read label `FULL1', date `20100206'
taper: cannot overwrite active tape FULL1
taper: slot 3: read label `FULL2', date `20100213'
taper: cannot overwrite active tape FULL2
taper: slot 4: read label `FULL3', date `20100221'
taper: label FULL3 match labelstr but it not listed in the tapelist
file.
taper: slot 1: read label `FULL0', date `20100116'
taper: cannot overwrite active tape FULL0
taper: changer problem: 1 file:/home/amanda/dumps/slots
big estimate: host.com /dev/mirror/gm0s1a 1
est: 384k out 37k
big estimate: host.com /dev/mirror/gm0s1f 1
est: 128608k out 79530k
DUMP SUMMARY:
                                      DUMPER STATS            TAPER STATS
HOSTNAME  DISK         L ORIG-kB OUT-kB COMP% MMM:SS  KB/s  MMM:SS  KB/s
-------------------------- ------------------------------------- -------------
host.com  -ror/gm0s1a  1     518     37   7.1   0:23   1.6     N/A   N/A
host.com  -ror/gm0s1e  1  FAILED --------------------------------------------
host.com  -ror/gm0s1f  1  257029  79530  30.9   3:23 391.4     N/A   N/A
host.com  -ror/gm0s1g  0  FAILED --------------------------------------------
(brought to you by Amanda version 2.5.1p3)
---
The next day, I ran amcheck again; it looked OK, so I started a new
amdump. It backed up the 3 smaller file systems, but not gm0s1g, the huge
one:
These dumps were to tape FULL3.
The next tape Amanda expects to use is: FULL0.
FAILURE AND STRANGE DUMP SUMMARY:
host.com /dev/mirror/gm0s1g lev 0 FAILED [cannot read header: got 0 instead of 32768]
host.com /dev/mirror/gm0s1g lev 0 FAILED [cannot read header: got 0 instead of 32768]
host.com /dev/mirror/gm0s1g lev 0 FAILED [too many dumper retry: "[request failed: timeout waiting for REP]"]
-----SNIP-----
I then looked at the holding disk directory for the previous backup and
saw why (I think). Amanda was still dumping the huge file system
/dev/mirror/gm0s1g from the previous dump! I was sort of glad and let it
run until it finished, since starting this dump over yet again takes so
long. I then got this report this morning:
Subject: Company AMANDA MAIL REPORT FOR BogusMonth 0, 0
The next tape Amanda expects to use is: FULL1.
FAILURE AND STRANGE DUMP SUMMARY:
host.com /dev/mirror/gm0s1a RESULTS MISSING
host.com /dev/mirror/gm0s1e RESULTS MISSING
host.com /dev/mirror/gm0s1f RESULTS MISSING
STATISTICS:
                          Total       Full      Incr.
                        --------   --------   --------
Estimate Time (hrs:min)    0:00
Run Time (hrs:min)        41:57
Dump Time (hrs:min)       40:53      40:53       0:00
Output Size (meg)       66331.5    66331.5        0.0
Original Size (meg)     98000.6    98000.6        0.0
Avg Compressed Size (%)    67.7       67.7        --
Filesystems Dumped            1          1          0
Avg Dump Rate (k/s)       461.5      461.5        --
Tape Time (hrs:min)        0:00       0:00       0:00
Tape Size (meg)             0.0        0.0        0.0
Tape Used (%)               0.0        0.0        0.0
Filesystems Taped             0          0          0
Chunks Taped                  0          0          0
Avg Tp Write Rate (k/s)     --         --         --
-----
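
As a sanity check that this report describes one complete dump of the big file system, the figures above can be cross-checked against each other. This is only a sketch: it assumes "meg" means 1024 kB and that the average rate is output size divided by dump time, which is a guess about how the report computes it. The function names are made up for illustration.

```python
def hhmm_to_seconds(hhmm: str) -> int:
    """Convert an 'hrs:min' string such as '40:53' to seconds."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 3600 + int(minutes) * 60


def dump_rate_kps(output_meg: float, dump_time: str) -> float:
    """Average rate in kB/s, assuming 1 meg = 1024 kB."""
    return output_meg * 1024 / hhmm_to_seconds(dump_time)


def compressed_pct(output_meg: float, original_meg: float) -> float:
    """Compressed size as a percentage of the original size."""
    return 100 * output_meg / original_meg


# Figures copied from the STATISTICS block of the last report.
print(round(dump_rate_kps(66331.5, "40:53"), 1))   # 461.5
print(round(compressed_pct(66331.5, 98000.6), 1))  # 67.7
```

Both values agree with the "Avg Dump Rate" and "Avg Compressed Size" lines, so the 66331.5 meg output is at least internally consistent with a full 40:53 dump.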
I would expect the results for the three small file systems to be missing,
since they had already "completed". What I'd like to do is amflush the big
file system's holding disk files, preferably to the same virtual tape as
the others. I know there's a date stamp issue, but even amflush -D with
the date from the directory name didn't work (ignore the /tmp debug
permission errors):
[ama...@amanda ~/dumps/20100221082556]$ amflush -D 20100221082556 weekly
chown(/tmp/amanda/server/amflush.20100223142504.debug, 2, 5) failed. <Operation not permitted>
Scanning /home/amanda/dumps...
slots: skipping cruft directory, perhaps you should delete it.
chg-disk-status-access: skipping cruft file, perhaps you should delete it.
chg-disk-status-clean: skipping cruft file, perhaps you should delete it.
chg-disk-status-slot: skipping cruft file, perhaps you should delete it.
20100221082556: found Amanda directory.
Could not find any valid dump image, check directory.
Is this data salvageable?
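
One way to see what is actually sitting in the holding directory is to list each file with its size and the first line of its header. The sketch below rests on assumptions: that every holding file starts with a 32768-byte header (the size implied by the "got 0 instead of 32768" errors above) and that the first header line is plain text. The function name `summarize_holding_dir` is hypothetical, and whether a leftover `.tmp` suffix is what makes amflush reject a file is a guess to verify against the Amanda 2.5.1 documentation, not something the reports confirm.

```python
import os

HEADER_BYTES = 32768  # assumed header size, taken from the error messages


def summarize_holding_dir(path):
    """Return (name, size_in_bytes, first_header_line) for each regular file."""
    rows = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue  # skip subdirectories such as a changer 'slots' dir
        with open(full, "rb") as fh:
            header = fh.read(HEADER_BYTES)
        # Keep only the first text line; drop any NUL padding around it.
        first_line = header.split(b"\n", 1)[0].decode("ascii", "replace")
        rows.append((name, os.path.getsize(full), first_line.strip("\x00")))
    return rows


def print_summary(path):
    """Print one line per file, flagging names that still end in .tmp."""
    for name, size, first_line in summarize_holding_dir(path):
        note = "  (still .tmp?)" if name.endswith(".tmp") else ""
        print(f"{name}  {size} bytes  {first_line!r}{note}")
```

Calling `print_summary("/home/amanda/dumps/20100221082556")` would then show whether the big dump's chunks are present and what their headers say.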
James Smallacombe PlantageNet, Inc. CEO and Janitor
u...@3.am http://3.am