A strange thing happened to me over the past few days. I started a remote dump of about 66GB, which normally takes about 30-35 hours. However, I noticed on Sunday that it was crawling at less than 200kbps, apparently due to some temporary network issues. I killed the amdump process and ran amcleanup, which exited with some errors that I don't recall exactly. I then received the Amanda email report I expected, showing the 3 small file systems that made it through the dump and the expected missing results for the huge (60GB+) one that didn't.

I waited until the network issues cleared up and started a new dump on a new virtual tape. It appears amcleanup had deleted that tape from the tapelist, so I manually re-entered it with the date I presume it was last used.

I soon got an AMANDA MAIL REPORT like this:

*** A TAPE ERROR OCCURRED: [No writable valid tape found].
Some dumps may have been left in the holding disk.
Run amflush to flush them to tape.
The next tape Amanda expects to use is: a new tape.

FAILURE AND STRANGE DUMP SUMMARY:
  host.com  /dev/mirror/gm0s1g  lev 0  FAILED [can't switch to incremental dump]
  host.com  /dev/mirror/gm0s1e  lev 1  FAILED [service /usr/local/libexec/amanda/sendbackup failed: pid 46130 exited with code 1]
  host.com  /dev/mirror/gm0s1e  lev 1  FAILED [cannot read header: got 0 instead of 32768]
  host.com  /dev/mirror/gm0s1a  lev 1  FAILED [cannot read header: got 0 instead of 32768]
  host.com  /dev/mirror/gm0s1e  lev 1  FAILED [cannot read header: got 0 instead of 32768]
  host.com  /dev/mirror/gm0s1e  lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  host.com  /dev/mirror/gm0s1a  lev 1  was successfully retried

STATISTICS:
                          Total       Full      Incr.
                        --------   --------   --------
Estimate Time (hrs:min)    0:02
Run Time (hrs:min)         0:09
Dump Time (hrs:min)        0:04       0:00       0:04
Output Size (meg)          77.7        0.0       77.7
Original Size (meg)       251.5        0.0      251.5
Avg Compressed Size (%)    30.9        --        30.9   (level:#disks ...)
Filesystems Dumped            2          0          2   (1:2)
Avg Dump Rate (k/s)       352.1        --       352.1

Tape Time (hrs:min)        0:00       0:00       0:00
Tape Size (meg)             0.0        0.0        0.0
Tape Used (%)               0.0        0.0        0.0
Filesystems Taped             0          0          0

Chunks Taped                  0          0          0
Avg Tp Write Rate (k/s)     --         --         --

FAILED AND STRANGE DUMP DETAILS:

/--  host.com /dev/mirror/gm0s1e lev 1 FAILED [service /usr/local/libexec/amanda/sendbackup failed: pid 46130 exited with code 1]
\--------

NOTES:
  planner: tapecycle (4) <= runspercycle (28)
  driver: WARNING: This is not the first amdump run today. Enable the usetimestamps option in the configuration file if you want to run amdump more than once per calendar day.
  planner: Last full dump of host.com:/dev/mirror/gm0s1a on tape overwritten in 1 run.
  planner: Last full dump of host.com:/dev/mirror/gm0s1e on tape overwritten in 1 run.
  planner: Last full dump of host.com:/dev/mirror/gm0s1f on tape overwritten in 1 run.
  planner: Last full dump of host.com:/dev/mirror/gm0s1g on tape FULL2 overwritten in 3 runs.
  planner: Preventing bump of host.com:/dev/mirror/gm0s1g as directed.
  taper: slot 2: read label `FULL1', date `20100206'
  taper: cannot overwrite active tape FULL1
  taper: slot 3: read label `FULL2', date `20100213'
  taper: cannot overwrite active tape FULL2
  taper: slot 4: read label `FULL3', date `20100221'
  taper: label FULL3 match labelstr but it not listed in the tapelist file.
  taper: slot 1: read label `FULL0', date `20100116'
  taper: cannot overwrite active tape FULL0
  taper: changer problem: 1 file:/home/amanda/dumps/slots
  big estimate: host.com /dev/mirror/gm0s1a 1
                est: 384k    out 37k
  big estimate: host.com /dev/mirror/gm0s1f 1
                est: 128608k    out 79530k


DUMP SUMMARY:
                                       DUMPER STATS               TAPER STATS
HOSTNAME     DISK        L ORIG-kB  OUT-kB  COMP%  MMM:SS   KB/s MMM:SS   KB/s
-------------------------- ------------------------------------- -------------
host.com     -ror/gm0s1a 1     518      37    7.1    0:23    1.6    N/A    N/A
host.com     -ror/gm0s1e 1 FAILED --------------------------------------------
host.com     -ror/gm0s1f 1  257029   79530   30.9    3:23  391.4    N/A    N/A
host.com     -ror/gm0s1g 0 FAILED --------------------------------------------

(brought to you by Amanda version 2.5.1p3)

---

The next day, I ran amcheck again. It looked OK, so I started a new amdump. It backed up the 3 smaller file systems, but not gm0s1g, the huge one:

These dumps were to tape FULL3.
The next tape Amanda expects to use is: FULL0.

FAILURE AND STRANGE DUMP SUMMARY:
  host.com  /dev/mirror/gm0s1g  lev 0  FAILED [cannot read header: got 0 instead of 32768]
  host.com  /dev/mirror/gm0s1g  lev 0  FAILED [cannot read header: got 0 instead of 32768]
  host.com  /dev/mirror/gm0s1g  lev 0  FAILED [too many dumper retry: "[request failed: timeout waiting for REP]"]

-----SNIP-----

I then looked at the holding disk directory for the previous backup and saw why (I think): Amanda was still dumping the huge file system /dev/mirror/gm0s1g from the previous run! I was sort of glad and let it go until it finished, since this thing takes so long to start over yet again. I then got this report this morning:

Subject: Company AMANDA MAIL REPORT FOR BogusMonth 0, 0

The next tape Amanda expects to use is: FULL1.

FAILURE AND STRANGE DUMP SUMMARY:
  host.com  /dev/mirror/gm0s1a  RESULTS MISSING
  host.com  /dev/mirror/gm0s1e  RESULTS MISSING
  host.com  /dev/mirror/gm0s1f  RESULTS MISSING


STATISTICS:
                          Total       Full      Incr.
                        --------   --------   --------
Estimate Time (hrs:min)    0:00
Run Time (hrs:min)        41:57
Dump Time (hrs:min)       40:53      40:53       0:00
Output Size (meg)       66331.5    66331.5        0.0
Original Size (meg)     98000.6    98000.6        0.0
Avg Compressed Size (%)    67.7       67.7        --
Filesystems Dumped            1          1          0
Avg Dump Rate (k/s)       461.5      461.5        --

Tape Time (hrs:min)        0:00       0:00       0:00
Tape Size (meg)             0.0        0.0        0.0
Tape Used (%)               0.0        0.0        0.0
Filesystems Taped             0          0          0

Chunks Taped                  0          0          0
Avg Tp Write Rate (k/s)     --         --         --
-----

I would expect those results to be missing, since those file systems had already "completed" in the later run. What I'd like to do is amflush the big file system's holding disk files, preferably to the same virtual tape as the others. I know there's a date stamp issue, but even using amflush -D (with the date on the holding directory) didn't work (ignore the /tmp debug perms errors):

[ama...@amanda ~/dumps/20100221082556]$ amflush -D 20100221082556 weekly
chown(/tmp/amanda/server/amflush.20100223142504.debug, 2, 5) failed. <Operation not permitted>
Scanning /home/amanda/dumps...
  slots: skipping cruft directory, perhaps you should delete it.
  chg-disk-status-access: skipping cruft file, perhaps you should delete it.
  chg-disk-status-clean: skipping cruft file, perhaps you should delete it.
  chg-disk-status-slot: skipping cruft file, perhaps you should delete it.
  20100221082556: found Amanda directory.
Could not find any valid dump image, check directory.
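
To narrow down what amflush is objecting to, the files in that holding directory can be inspected directly. The following is only a rough sketch, not anything Amanda ships: it reads the first 32768 bytes of each file (the header size mentioned in the "cannot read header" errors above) and shows the first line. It assumes the header is plain text beginning with "AMANDA:", which I have not verified against 2.5.1p3; the function names are made up for illustration.

```python
import os

# Header block size, taken from "cannot read header: got 0 instead of 32768".
HEADER_SIZE = 32768


def describe_holding_file(path):
    """Return (size_in_bytes, first_header_line) for one holding-disk file."""
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        block = fh.read(HEADER_SIZE)
    # Assumption: the header is text and its first line starts with "AMANDA:".
    first_line = block.split(b"\n", 1)[0].decode("ascii", errors="replace")
    return size, first_line


def scan_holding_dir(directory):
    """List each regular file with its size, a validity guess, and header line."""
    results = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            size, line = describe_holding_file(path)
            # A file shorter than one header block, or one without the
            # assumed "AMANDA:" prefix, is flagged as suspect.
            looks_valid = size >= HEADER_SIZE and line.startswith("AMANDA:")
            results.append((name, size, looks_valid, line))
    return results


# Example call (path taken from the amflush session above):
#   for name, size, ok, line in scan_holding_dir("/home/amanda/dumps/20100221082556"):
#       print(name, size, "ok" if ok else "SUSPECT", line[:70])
```

A zero-length or truncated first file would at least be consistent with the "Could not find any valid dump image" message, though this check says nothing about whether the dump data after the header is complete.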

Is this data salvageable?

James Smallacombe                     PlantageNet, Inc. CEO and Janitor
u...@3.am                                                           http://3.am
=========================================================================
