Re: no backups since a week ago (data write: Connection reset by peer)
Thanks for all the help... I upgraded my amanda-server version to 2.5.2 and that seems to have fixed the problem.

Jean-Louis Martineau wrote:

Steven Settlemyre wrote: From the amdump log: this same flow happens every time it fails. At first I was worried about the "not enough diskspace" message, but that is just because my holding disk filled up; I see the same thing happening in successful runs.

In some cases this error is fatal; this bug is fixed in a newer release.

driver: state time 8459.717 free kps: 37761 space: 55584 taper: idle idle-dumpers: 7 qlen tapeq: 0 runq: 18 roomq: 2 wakeup: 0 driver-idle: no-diskspace
driver: interface-state time 8459.717 if : free 37761
driver: hdisk-state time 8459.717 hdisk 0: free 55584 dumpers 1
driver: result time 8459.727 from chunker3: RQ-MORE-DISK 03-4
find diskspace: not enough diskspace. Left with 508960 K
find diskspace: not enough diskspace. Left with 71872 K
find diskspace: not enough diskspace. Left with 17664 K
find diskspace: not enough diskspace. Left with 508960 K
driver: Don't know how to send ABORT command to chunker
taper: DONE [idle wait: 7182.514 secs]
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: pid 1129 finish time Thu May 10 02:06:02 2007
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.517: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.517: pid 31879 finish time Thu May 10 02:06:02 2007
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.485: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.485: pid 31880 finish time Thu May 10 02:06:02 2007
taper: writing end marker. [VOL14 OK kb 2300352 fm 28]
dumper: kill index command
dumper: kill index command
dumper: kill index command
amdump: end at Thu May 10 02:06:02 EDT 2007
Scanning /holding/amanda...
20070509234502: found Amanda directory.
Re: no backups since a week ago (data write: Connection reset by peer)
From the amdump log: this same flow happens every time it fails. At first I was worried about the "not enough diskspace" message, but that is just because my holding disk filled up; I see the same thing happening in successful runs.

driver: state time 8459.717 free kps: 37761 space: 55584 taper: idle idle-dumpers: 7 qlen tapeq: 0 runq: 18 roomq: 2 wakeup: 0 driver-idle: no-diskspace
driver: interface-state time 8459.717 if : free 37761
driver: hdisk-state time 8459.717 hdisk 0: free 55584 dumpers 1
driver: result time 8459.727 from chunker3: RQ-MORE-DISK 03-4
find diskspace: not enough diskspace. Left with 508960 K
find diskspace: not enough diskspace. Left with 71872 K
find diskspace: not enough diskspace. Left with 17664 K
find diskspace: not enough diskspace. Left with 508960 K
driver: Don't know how to send ABORT command to chunker
taper: DONE [idle wait: 7182.514 secs]
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: pid 1129 finish time Thu May 10 02:06:02 2007
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.517: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.517: pid 31879 finish time Thu May 10 02:06:02 2007
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.485: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.485: pid 31880 finish time Thu May 10 02:06:02 2007
taper: writing end marker. [VOL14 OK kb 2300352 fm 28]
dumper: kill index command
dumper: kill index command
dumper: kill index command
amdump: end at Thu May 10 02:06:02 EDT 2007
Scanning /holding/amanda...
20070509234502: found Amanda directory.

Gene Heskett wrote: On Wednesday 09 May 2007, Steven Settlemyre wrote: Can someone please help me?

Steven Settlemyre wrote:

I haven't changed my configs for months and things were running great until last week. Since last Tues, none of my dailies have finished, and last night a monthly failed. Looking through the logs, I see the problem always seems to start with "data write: Connection reset by peer" and "Don't know how to send ABORT command to chunker". I'm having a hard time interpreting the logs and can't seem to find much in the archives about this. I was wondering if someone could walk me through an explanation of the problem and how to avoid it in the future. My monthlies run tape spanning on 3 40G tapes. Here is the email output generated:

*** THE DUMPS DID NOT FINISH PROPERLY!

These dumps were to tape Monthly21.
The next 3 tapes Amanda expects to use are: Monthly01, Monthly02, Monthly03.
The next 3 new tapes already labelled are: Monthly19, Monthly20, Monthly22.

FAILURE AND STRANGE DUMP SUMMARY:
  wagstaff   /usr/local lev 1 FAILED [data write: Connection reset by peer]
  lollipop   /files1    lev 0 FAILED [data write: Connection reset by peer]
  helios     /files3    lev 1 FAILED [data write: Connection reset by peer]
  helios     /          RESULTS MISSING
  helios     /files2    RESULTS MISSING
  helios     /usr       RESULTS MISSING
  helios     /usr/local RESULTS MISSING
  helios     /var       RESULTS MISSING
  lollipop   /          RESULTS MISSING
  lollipop   /usr       RESULTS MISSING
  lollipop   /usr/local RESULTS MISSING
  wagstaff   /files3    RESULTS MISSING
  wagstaff   /files4    RESULTS MISSING
  wagstaff   /files5    RESULTS MISSING
  wagstaff   /files6/vol/Voiceware RESULTS MISSING
  wizard     /files2    RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/TermLab  RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/bcl      RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/biochem  RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/confocal RESULTS MISSING
  driver: FATAL Don't know how to send ABORT command to chunker
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]

STATISTICS:
                          Total      Full      Incr.
Estimate Time (hrs:min)    0:08
Run Time (hrs:min)         1:01
Dump Time (hrs:min)        1:55      1:40      0:16
Output Size (meg)        8519.7    7729.7     790.1
Original Size (meg)     13146.3   11595.5    1550.8
Avg Compressed Size (%)    64.8      66.7      50.9
(level:#disks ...)
Filesystems Dumped           35        12        23   (1:23)
Avg Dump Rate (k/s)      1261.0    1323.3     863.1
Tape Time (hrs:min)        0:53      0:44      0:09
Tape Size (meg)          8521.6    7730.3     791.3
Tape Used
amanda-client version
How do I find out which version of amanda-client is running on my hosts? I found the Debian ones through the package manager, but I wonder if there's a command-line switch to use? Steve
Please help... no backups since a week ago (data write: Connection reset by peer)
Can someone please help me?

Steven Settlemyre wrote:

I haven't changed my configs for months and things were running great until last week. Since last Tues, none of my dailies have finished, and last night a monthly failed. Looking through the logs, I see the problem always seems to start with "data write: Connection reset by peer" and "Don't know how to send ABORT command to chunker". I'm having a hard time interpreting the logs and can't seem to find much in the archives about this. I was wondering if someone could walk me through an explanation of the problem and how to avoid it in the future. My monthlies run tape spanning on 3 40G tapes. Here is the email output generated:

*** THE DUMPS DID NOT FINISH PROPERLY!

These dumps were to tape Monthly21.
The next 3 tapes Amanda expects to use are: Monthly01, Monthly02, Monthly03.
The next 3 new tapes already labelled are: Monthly19, Monthly20, Monthly22.

FAILURE AND STRANGE DUMP SUMMARY:
  wagstaff   /usr/local lev 1 FAILED [data write: Connection reset by peer]
  lollipop   /files1    lev 0 FAILED [data write: Connection reset by peer]
  helios     /files3    lev 1 FAILED [data write: Connection reset by peer]
  helios     /          RESULTS MISSING
  helios     /files2    RESULTS MISSING
  helios     /usr       RESULTS MISSING
  helios     /usr/local RESULTS MISSING
  helios     /var       RESULTS MISSING
  lollipop   /          RESULTS MISSING
  lollipop   /usr       RESULTS MISSING
  lollipop   /usr/local RESULTS MISSING
  wagstaff   /files3    RESULTS MISSING
  wagstaff   /files4    RESULTS MISSING
  wagstaff   /files5    RESULTS MISSING
  wagstaff   /files6/vol/Voiceware RESULTS MISSING
  wizard     /files2    RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/TermLab  RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/bcl      RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/biochem  RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/confocal RESULTS MISSING
  driver: FATAL Don't know how to send ABORT command to chunker
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]

STATISTICS:
                          Total      Full      Incr.
Estimate Time (hrs:min)    0:08
Run Time (hrs:min)         1:01
Dump Time (hrs:min)        1:55      1:40      0:16
Output Size (meg)        8519.7    7729.7     790.1
Original Size (meg)     13146.3   11595.5    1550.8
Avg Compressed Size (%)    64.8      66.7      50.9
(level:#disks ...)
Filesystems Dumped           35        12        23   (1:23)
Avg Dump Rate (k/s)      1261.0    1323.3     863.1
Tape Time (hrs:min)        0:53      0:44      0:09
Tape Size (meg)          8521.6    7730.3     791.3
Tape Used (%)              21.1      19.0       2.1
(level:#disks ...)
Filesystems Taped            35        12        23   (1:23)
(level:#chunks ...)
Chunks Taped                 35        12        23   (1:23)
Avg Tp Write Rate (k/s)  2724.3    3000.8    1433.6

USAGE BY TAPE:
  Label      Time      Size     %   Nb  Nc
  Monthly21  0:53  8726112k  21.1   35  35

FAILED AND STRANGE DUMP DETAILS:

/-- wagstaff /usr/local lev 1 FAILED [data write: Connection reset by peer]
sendbackup: start [wagstaff level 1]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 1 dump: Tue May 08 01:11:26 2007
| DUMP: Date of last level 0 dump: Mon Apr 30 23:54:14 2007
| DUMP: Dumping /dev/rdsk/c0t0d0s7 (wagstaff:/usr/local) to standard output.
| DUMP: Mapping (Pass I) [regular files]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Estimated 13585968 blocks (6633.77MB) on 0.10 tapes.
| DUMP: Dumping (Pass III) [directories]
| DUMP: Dumping (Pass IV) [regular files]
| DUMP: 16.49% done, finished in 0:50
| DUMP: 28.34% done, finished in 0:57
| DUMP: 38.89% done, finished in 1:12
\

/-- lollipop /files1 lev 0 FAILED [data write: Connection reset by peer]
sendbackup: start [lollipop level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 0 dump: Mon May 07 23:52:56 2007
dumps failing recently (data write: Connection reset by peer)
I haven't changed my configs for months and things were running great until last week. Since last Tues, none of my dailies have finished, and last night a monthly failed. Looking through the logs, I see the problem always seems to start with "data write: Connection reset by peer" and "Don't know how to send ABORT command to chunker". I'm having a hard time interpreting the logs and can't seem to find much in the archives about this. I was wondering if someone could walk me through an explanation of the problem and how to avoid it in the future. My monthlies run tape spanning on 3 40G tapes. Here is the email output generated:

*** THE DUMPS DID NOT FINISH PROPERLY!

These dumps were to tape Monthly21.
The next 3 tapes Amanda expects to use are: Monthly01, Monthly02, Monthly03.
The next 3 new tapes already labelled are: Monthly19, Monthly20, Monthly22.

FAILURE AND STRANGE DUMP SUMMARY:
  wagstaff   /usr/local lev 1 FAILED [data write: Connection reset by peer]
  lollipop   /files1    lev 0 FAILED [data write: Connection reset by peer]
  helios     /files3    lev 1 FAILED [data write: Connection reset by peer]
  helios     /          RESULTS MISSING
  helios     /files2    RESULTS MISSING
  helios     /usr       RESULTS MISSING
  helios     /usr/local RESULTS MISSING
  helios     /var       RESULTS MISSING
  lollipop   /          RESULTS MISSING
  lollipop   /usr       RESULTS MISSING
  lollipop   /usr/local RESULTS MISSING
  wagstaff   /files3    RESULTS MISSING
  wagstaff   /files4    RESULTS MISSING
  wagstaff   /files5    RESULTS MISSING
  wagstaff   /files6/vol/Voiceware RESULTS MISSING
  wizard     /files2    RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/TermLab  RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/bcl      RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/biochem  RESULTS MISSING
  snapserver /hd/vol_mnt0/shares/confocal RESULTS MISSING
  driver: FATAL Don't know how to send ABORT command to chunker
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]

STATISTICS:
                          Total      Full      Incr.
Estimate Time (hrs:min)    0:08
Run Time (hrs:min)         1:01
Dump Time (hrs:min)        1:55      1:40      0:16
Output Size (meg)        8519.7    7729.7     790.1
Original Size (meg)     13146.3   11595.5    1550.8
Avg Compressed Size (%)    64.8      66.7      50.9
(level:#disks ...)
Filesystems Dumped           35        12        23   (1:23)
Avg Dump Rate (k/s)      1261.0    1323.3     863.1
Tape Time (hrs:min)        0:53      0:44      0:09
Tape Size (meg)          8521.6    7730.3     791.3
Tape Used (%)              21.1      19.0       2.1
(level:#disks ...)
Filesystems Taped            35        12        23   (1:23)
(level:#chunks ...)
Chunks Taped                 35        12        23   (1:23)
Avg Tp Write Rate (k/s)  2724.3    3000.8    1433.6

USAGE BY TAPE:
  Label      Time      Size     %   Nb  Nc
  Monthly21  0:53  8726112k  21.1   35  35

FAILED AND STRANGE DUMP DETAILS:

/-- wagstaff /usr/local lev 1 FAILED [data write: Connection reset by peer]
sendbackup: start [wagstaff level 1]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 1 dump: Tue May 08 01:11:26 2007
| DUMP: Date of last level 0 dump: Mon Apr 30 23:54:14 2007
| DUMP: Dumping /dev/rdsk/c0t0d0s7 (wagstaff:/usr/local) to standard output.
| DUMP: Mapping (Pass I) [regular files]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Estimated 13585968 blocks (6633.77MB) on 0.10 tapes.
| DUMP: Dumping (Pass III) [directories]
| DUMP: Dumping (Pass IV) [regular files]
| DUMP: 16.49% done, finished in 0:50
| DUMP: 28.34% done, finished in 0:57
| DUMP: 38.89% done, finished in 1:12
\

/-- lollipop /files1 lev 0 FAILED [data write: Connection reset by peer]
sendbackup: start [lollipop level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 0 dump: Mon May 07 23:52:56 2007
| DUMP: Date of last level 0 dump: the epoch
| DUMP: Dumping
Re: amrestore help
I'd love to, but see my message below. I get an error when I try to run amrecover. Is there a way to get past this error?

Guy Dallaire wrote: amrecover is easier to use

2007/4/27, Steven Settlemyre [EMAIL PROTECTED]:

Actually, I found how to put the desired tape into the active slot of the changer. But when I do "amrestore /dev/nst0 lollipop /files1" I get this:

amrestore: missing file header block
amrestore: 2: skipping zoo._.20070419.1.1
amrestore: missing file header block
amrestore: 6: skipping marlin._var.20070419.1.1
amrestore: 10: reached end of information

What now?

Steven Settlemyre wrote:

I am trying to restore a folder from a tape in a previous cycle (last night had a level 0 of the disk, but the folder was deleted prior). I have a tape changer and read that amrestore doesn't work with tape changers. My dumptypes are gnutar, so I probably can't use amrecover. Is this correct? When I tried to run it, it said "amrecover: Unexpected end of file, check amindexd*debug on server localhost". I've looked all over and can't seem to find what I need (possibly because I've never recovered before and don't know what to look for).

Steve
which report do I believe?
I am looking to restore a disk from one of my machines and am having a little trouble understanding which report to believe. Using "amadmin find", I see there was a level 0 on 4/12 and again on 4/19. When I look at the daily reports, I see on 4/12:

pop:/files1.0 in-memory taper: no split_diskbuffer specified: using fallback split size of 10240kb to buffer

and

HOSTNAME  DISK     L    ORIG-kB   OUT-kB  COMP%  MMM:SS    KB/s  MMM:SS   KB/s
pop       /files1  0   17024192  9961504   58.5  178:38   926.5  178:39  929.4

Then on 4/19 I see:

NOTES: planner: Incremental of pop:/files1 bumped to level 3.

and

HOSTNAME  DISK     L    ORIG-kB   OUT-kB  COMP%  MMM:SS    KB/s  MMM:SS    KB/s
pop       /files1  0   17089120  9965664   58.3  148:35  1117.9   30:39  5418.9

So is 4/19 a level 3 or a level 0? Also, in the "amadmin find" output, I see that 4/12 has 973 parts, whereas 4/19 only has 2 parts. Why the big difference? What could cause such things, and what steps should I take to restore this disk?
Re: which report do I believe?
I did not change any configuration between the 2 runs. I do not have split_diskbuffer set at all.

Jon LaBadie wrote:

Expanding on JLM's accurate comment.

On Mon, Apr 30, 2007 at 11:07:45AM -0400, Steven Settlemyre wrote:

I am looking to restore a disk from one of my machines and am having a little trouble understanding which report to believe. Using "amadmin find", I see there was a level 0 on 4/12 and again on 4/19. When I look at the daily reports, I see on 4/12:

pop:/files1.0 in-memory taper: no split_diskbuffer specified: using fallback split size of 10240kb to buffer

I'm guessing you did specify a split_diskbuffer later on, before 4/19. Since no split_diskbuffer was specified, tiny pieces of 10MB were used, resulting in 973 pieces.

and

HOSTNAME  DISK     L    ORIG-kB   OUT-kB  COMP%  MMM:SS    KB/s  MMM:SS   KB/s
pop       /files1  0   17024192  9961504   58.5  178:38   926.5  178:39  929.4

Then on 4/19 I see:

NOTES: planner: Incremental of pop:/files1 bumped to level 3.

Planning goes through many steps; one is to decide whether to bump to a higher level if an incremental is done. It was planning to do so. But at a later step, other considerations superseded this, because:

and

HOSTNAME  DISK     L    ORIG-kB   OUT-kB  COMP%  MMM:SS    KB/s  MMM:SS    KB/s
pop       /files1  0   17089120  9965664   58.3  148:35  1117.9   30:39  5418.9

So is 4/19 a level 3 or level 0?

As clearly shown, a level 0 was done. Would you really expect your level 3 incremental, not even a level 1 or 2, to be the same size as your 4/12 level 0?

Also, in the amadmin find output, I see that 4/12 has 973 parts, whereas 4/19 only has 2 parts. Why the big difference? What could cause such things? And what steps should I take to restore this disk?

Two pieces or 1000 would be no different.
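Jon's explanation of the 973 parts can be checked with a little arithmetic: the 4/12 report shows a 9961504 kB dump for pop:/files1, and the taper note says a 10240 kB fallback split size was used. A minimal sketch (the numbers are taken straight from the report above):

```python
import math

# Figures from the 4/12 report:
dump_out_kb = 9961504      # OUT-kB for pop:/files1 (compressed dump size)
fallback_split_kb = 10240  # fallback split size when no split_diskbuffer is set

parts = math.ceil(dump_out_kb / fallback_split_kb)
print(parts)  # 973, matching the part count amadmin find reports for 4/12
```

So the 973 parts on 4/12 are exactly what a 10 MB fallback split produces for a ~9.5 GB dump; the 2 parts on 4/19 just reflect a much larger split size being in effect.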
amrestore help
I am trying to restore a folder from a tape in a previous cycle (last night had a level 0 of the disk, but the folder was deleted prior). I have a tape changer and read that amrestore doesn't work with tape changers. My dumptypes are gnutar, so I probably can't use amrecover. Is this correct? When I tried to run it, it said "amrecover: Unexpected end of file, check amindexd*debug on server localhost". I've looked all over and can't seem to find what I need (possibly because I've never recovered before and don't know what to look for). Steve
Re: amrestore help
Actually, I found how to put the desired tape into the active slot of the changer. But when I do "amrestore /dev/nst0 lollipop /files1" I get this:

amrestore: missing file header block
amrestore: 2: skipping zoo._.20070419.1.1
amrestore: missing file header block
amrestore: 6: skipping marlin._var.20070419.1.1
amrestore: 10: reached end of information

What now?

Steven Settlemyre wrote:

I am trying to restore a folder from a tape in a previous cycle (last night had a level 0 of the disk, but the folder was deleted prior). I have a tape changer and read that amrestore doesn't work with tape changers. My dumptypes are gnutar, so I probably can't use amrecover. Is this correct? When I tried to run it, it said "amrecover: Unexpected end of file, check amindexd*debug on server localhost". I've looked all over and can't seem to find what I need (possibly because I've never recovered before and don't know what to look for). Steve
results missing
I finally got past the gnutar problem I was having. I ran amcheck and everything looked good, but amdump gives me a bunch of "results missing" errors (even for hosts that worked fine before). Attached is the output from amdump. Thanks, Steve

*** THE DUMPS DID NOT FINISH PROPERLY!

These dumps were to tape VOL24.
The next tape Amanda expects to use is: VOL01.

FAILURE AND STRANGE DUMP SUMMARY:
  helios.medsci.domain     /var lev 2 STRANGE
  snapserver.medsci.domain /hd/vol_mnt0/shares/biochem lev 1 FAILED [data write: Broken pipe]
  wagstaff.asel.domain     /usr/local lev 1 FAILED [data write: Connection reset by peer]
  helios.medsci.domain     /usr/local RESULTS MISSING
  lollipop.asel.domain     /files1 RESULTS MISSING
  wagstaff.asel.domain     / RESULTS MISSING
  wagstaff.asel.domain     /files1 RESULTS MISSING
  wagstaff.asel.domain     /files3 RESULTS MISSING
  wagstaff.asel.domain     /files4 RESULTS MISSING
  wagstaff.asel.domain     /files5 RESULTS MISSING
  wagstaff.asel.domain     /files6/vol/Synth RESULTS MISSING
  wagstaff.asel.domain     /files6/vol/Voiceware RESULTS MISSING
  wagstaff.asel.domain     /files6/vol/bvd RESULTS MISSING
  wagstaff.asel.domain     /files6/vol/spdata1 RESULTS MISSING
  wagstaff.asel.domain     /files6/vol/spdata2 RESULTS MISSING
  wagstaff.asel.domain     /files6/vol/speech7 RESULTS MISSING
  wagstaff.asel.domain     /usr RESULTS MISSING
  wizard.asel.domain       /var/mail RESULTS MISSING
  wizard.asel.domain       /files2 RESULTS MISSING
  snapserver.medsci.domain /hd/vol_mnt0/shares/bcl RESULTS MISSING
  driver: FATAL Don't know how to send ABORT command to chunker
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
  chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]

STATISTICS:
                          Total      Full      Incr.
Estimate Time (hrs:min)    0:10
Run Time (hrs:min)         0:14
Dump Time (hrs:min)        0:14      0:00      0:14
Output Size (meg)         765.7       0.0     765.7
Original Size (meg)      1948.1       0.0    1948.1
Avg Compressed Size (%)    39.3        --      39.3
(level:#disks ...)
Filesystems Dumped           36         0        36   (1:35 2:1)
Avg Dump Rate (k/s)       966.6        --     966.6
Tape Time (hrs:min)        0:04      0:00      0:04
Tape Size (meg)           767.7       0.0     767.7
Tape Used (%)               2.1       0.0       2.1
(level:#disks ...)
Filesystems Taped            36         0        36   (1:35 2:1)
(level:#chunks ...)
Chunks Taped                 36         0        36   (1:35 2:1)
Avg Tp Write Rate (k/s)  3038.4        --    3038.4

USAGE BY TAPE:
  Label  Time     Size    %   Nb  Nc
  VOL24  0:04  786080k  2.1   36  36

FAILED AND STRANGE DUMP DETAILS:

/-- helios.medsci.domain /var lev 2 STRANGE
sendbackup: start [helios.medsci.domain:/var level 2]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? gtar: ./lib/amanda/asel/index/professor.asel.domain/_usr_local/20061026_1.gz.tmp: Warning: Cannot stat: No such file or directory
? gtar: ./lib/amanda/asel/index/snapserver.medsci.domain/_hd_vol__mnt0_shares_NeuroGen/20061026_1.gz.tmp: Warning: Cannot stat: No such file or directory
| gtar: ./run/postgresql/.s.PGSQL.5432: socket ignored
| Total bytes written: 1115033600 (1.1GiB, 6.5MiB/s)
sendbackup: size 1088900
sendbackup: end
\

/-- snapserver.medsci.domain /hd/vol_mnt0/shares/biochem lev 1 FAILED [data write: Broken pipe]
sendbackup: start [snapserver.medsci.domain:/hd/vol_mnt0/shares/biochem level 1]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
\

/-- wagstaff.asel.domain /usr/local lev 1 FAILED [data write: Connection reset by peer]
sendbackup: start [wagstaff.asel.domain:/usr/local level 1]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 1 dump: Thu Oct 26 23:55:10 2006
| DUMP: Date of last level 0 dump: Mon Oct 23 23:55:43 2006
| DUMP: Dumping /dev/rdsk/c0t0d0s7 (wagstaff:/usr/local) to standard output.
| DUMP: Mapping (Pass I) [regular files]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Estimated 11706368
question about new disks
I recently broke a disk up into subfolders to (hopefully) allow Amanda to do a better job fitting things onto daily single-tape backups. But I keep getting the following errors:

HOSTNAME /files6/vol/speech7 lev 0 FAILED [disk /files6/vol/speech7, all estimate failed]
HOSTNAME /files6/vol/spdata2 lev 0 FAILED [disk /files6/vol/spdata2, all estimate failed]
HOSTNAME /files6/vol/spdata1 lev 0 FAILED [disk /files6/vol/spdata1, all estimate failed]
HOSTNAME /files6/vol/bvd lev 0 FAILED [disk /files6/vol/bvd, all estimate failed]
HOSTNAME /files6/vol/Voiceware lev 0 FAILED [disk /files6/vol/Voiceware, all estimate failed]
HOSTNAME /files6/vol/Synth lev 0 FAILED [disk /files6/vol/Synth, all estimate failed]

Is there something else I had to do besides change the disklist file? BTW, the dumptype is:

define dumptype asel-gnutar {
    comment ASEL backups using gnutar
    options compress-fast
    program GNUTAR
    tape_splitsize 5 Gb
    exclude list /etc/backup-excludes.txt
    index yes
}

Thanks, Steve

Also (unrelated), I have sent a few emails to the list that never seem to make it out. Is this a known issue?
Re: question about new disks
Server: Debian (sarge) Linux; Amanda version 2.5.0p2.
Client: tar (GNU tar) 1.13.19; amanda-client: ? How do I find this?

Paul Bijnens wrote:

On 2006-10-20 15:05, Steven Settlemyre wrote:

I recently broke a disk up into subfolders to (hopefully) allow amanda to do a better job fitting things onto daily single-tape backups. But I keep getting the following errors. HOSTNAME /files6/vol/speech7 lev 0 FAILED [disk /files6/vol/speech7, all estimate failed] [...]

Any useful error message in one of the debug files? Have a look in: amdump.1 (in directory ~amanda/TheConfig) and planner.datetime.debug (only 2.5.1+; probably in /tmp/amanda). What version of Amanda do you have on client/server, and what version of gnutar on the client?
Re: question about new disks
Someone hinted to me that since my client is a Solaris machine, the GNUTAR program specified in my dumptype might not be correct. I know on my Linux machines gnutar is /bin/tar, but on my Solaris machine it is at /usr/local/bin/tar. Could this be the problem? Also, how would I get my dumptype to use this path? Thanks

Jon LaBadie wrote:

On Fri, Oct 20, 2006 at 10:15:04AM -0400, Steven Settlemyre wrote: Server: Debian (sarge) Linux. amanda version: 2.5.0p2 Client: tar: (GNU tar) 1.13.19 amanda-client: ? how do i find this?

On any amanda installation: amadmin - version

For amanda developers, an RFE: amadmin syntax is "amadmin config cmd cmd_args", so for the version cmd you have to supply a config argument; thus my "-" in the above command line. It would make sense (to me) to allow the version cmd as an exception to the general syntax and permit "amadmin version" (i.e. with or without a config named). Config doesn't make much sense for the version cmd. Nor on clients prior to recent releases.
Re: question about new disks
Actually, I couldn't find amadmin anywhere on my client. I did just come across the fact that /bin/tar is where GNUTAR is looking. I now see it by using your command (on the Amanda server), but I found it by doing "strings /usr/lib/amanda/runtar". I guess the easiest way for me to fix this problem is to make a symlink from /usr/local/bin/tar to /bin/tar (of course I'll back up the existing tar first). With a little foresight, the best way to handle this is to configure amanda for gnutar somewhere in the amanda dir and create a symlink on each client machine? Thanks again. Steve

Jon LaBadie wrote:

On Fri, Oct 20, 2006 at 12:07:03PM -0400, Steven Settlemyre wrote:

Someone hinted to me that since my client is a Solaris machine, the GNUTAR program specified in my dumptype might not be correct. I know on my Linux machines gnutar is /bin/tar, but on my Solaris machine it is at /usr/local/bin/tar. Could this be the problem? Also, how would I get my dumptype to use this path? Thanks

Jon LaBadie wrote: On Fri, Oct 20, 2006 at 10:15:04AM -0400, Steven Settlemyre wrote: Server: Debian (sarge) Linux. amanda version: 2.5.0p2 Client: tar: (GNU tar) 1.13.19 amanda-client: ? how do i find this? On any amanda installation, amadmin - version

You thanked me but probably did not run my suggestion. Lots of other things print out with the version cmd, including the path to gnutar that will be used. It is hardcoded at compile time, not a runtime setting. For that reason I always run configure with the option to specify where gnutar lives and say it lives in /usr/local/libexec/amgtar. Then I make amgtar be a copy or a link of whichever tar I wish to use.
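Before pointing a symlink at a tar binary, it is worth confirming the target really is GNU tar (Solaris /usr/sbin/tar is not, and Amanda's GNUTAR program requires GNU tar). A small hedged sketch for checking that; the helper name `is_gnu_tar` is mine, not part of Amanda:

```python
import subprocess

def is_gnu_tar(path):
    """Return True if the binary at `path` identifies itself as GNU tar.

    GNU tar prints a version banner containing "GNU tar" for --version;
    other tars either lack the flag or print something else.
    """
    try:
        out = subprocess.run([path, "--version"],
                             capture_output=True, text=True, timeout=10)
    except (FileNotFoundError, OSError):
        return False
    return "GNU tar" in out.stdout

# e.g. on a Solaris client one might check:
#   is_gnu_tar("/usr/local/bin/tar")   vs.   is_gnu_tar("/usr/sbin/tar")
```

Running this against whatever path the compiled-in GNUTAR setting points to (as found via `amadmin - version` or `strings` on runtar) shows quickly whether the symlink fix is needed on a given client.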
new disks failed
I just added a few new disks to my backup (actually just granulated the /files6 disk), and each of them failed with the following error:

HOSTNAME /files6/vol/speech7 lev 0 FAILED [disk /files6/vol/speech7, all estimate failed]
HOSTNAME /files6/vol/spdata2 lev 0 FAILED [disk /files6/vol/spdata2, all estimate failed]
HOSTNAME /files6/vol/spdata1 lev 0 FAILED [disk /files6/vol/spdata1, all estimate failed]
HOSTNAME /files6/vol/bvd lev 0 FAILED [disk /files6/vol/bvd, all estimate failed]
HOSTNAME /files6/vol/Voiceware lev 0 FAILED [disk /files6/vol/Voiceware, all estimate failed]
HOSTNAME /files6/vol/Synth lev 0 FAILED [disk /files6/vol/Synth, all estimate failed]
HOSTNAME /hd/vol_mnt0/shares/sysadmin lev 1 FAILED [cannot read header: got 0 instead of 32768]

Is there something special I have to do to add new disks?
Looooooooong backup
I have a monthly (full) backup running for about 22 hrs now. Do you think there is a problem, or is it possible it's just taking a long time? about 150G of data. Steve
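One rough way to judge whether 22 hours is plausible is to compute the implied throughput and compare it against the dump rates Amanda reports for normal runs. A back-of-the-envelope sketch (assuming "150G" means roughly 150 × 1024 × 1024 kB):

```python
# Implied average rate if 150 GB really takes 22 hours end to end.
data_kb = 150 * 1024 * 1024   # ~150 GB expressed in kB
hours = 22
rate_kbps = data_kb / (hours * 3600)
print(round(rate_kbps), "kB/s")  # ~1986 kB/s
```

For comparison, an average dump rate in the low thousands of kB/s (the daily reports in this thread show figures like 1261 k/s) would put a 150 GB full somewhere in the 20-35 hour range, so a 22-hour run is slow but not obviously stuck; amstatus is still the right tool to confirm progress.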
Re: Looooooooong backup
Looks OK, but I'm not too experienced reading these. See attached.

Jon LaBadie wrote:

On Wed, Oct 18, 2006 at 01:14:57PM -0400, Steven Settlemyre wrote: I have a monthly (full) backup running for about 22 hrs now. Do you think there is a problem, or is it possible it's just taking a long time? about 150G of data.

Have you tried a few amstatus cmds to see if it is still progressing?

HOSTNAME:/ 1 210k finished (13:43:44)
HOSTNAME:/files2 1 163k finished (13:43:32)
HOSTNAME:/files3 1 2069k finished (13:44:05)
HOSTNAME:/usr 0 633514k finished (13:58:13)
HOSTNAME:/usr/local 1 343k finished (13:43:22)
HOSTNAME:/var 1 607265k finished (13:55:33)
HOSTNAME:/ 0 29748k finished (16:11:36)
HOSTNAME:/files1 0 8356982k finished (16:11:24)
HOSTNAME:/usr 1 2k finished (16:11:39)
HOSTNAME:/usr/local 1 6k finished (16:11:38)
HOSTNAME:/ 1 43k finished (14:01:35)
HOSTNAME:/files1 1 388k finished (14:01:30)
HOSTNAME:/usr 1 8231k finished (14:01:26)
HOSTNAME:/var 0 902479k finished (13:52:48)
HOSTNAME:/ 1 130k finished (13:43:34)
HOSTNAME:/usr 1 581k finished (13:43:47)
HOSTNAME:/usr/local 1 143k finished (13:43:23)
HOSTNAME:/ 0 no estimate
HOSTNAME:/hd/vol_mnt0/shares/BioInf 0 16717144k dumping to tape (12:49:32)
HOSTNAME:/hd/vol_mnt0/shares/MID 1 48k finished (13:45:28)
HOSTNAME:/hd/vol_mnt0/shares/MolGene 0 1k finished (14:01:39)
HOSTNAME:/hd/vol_mnt0/shares/NeuroGen 1 8k finished (13:43:21)
HOSTNAME:/hd/vol_mnt0/shares/TermLab 0 19943829k finished (7:59:15)
HOSTNAME:/hd/vol_mnt0/shares/admin 0 1k finished (13:45:25)
HOSTNAME:/hd/vol_mnt0/shares/bcl 0 11817155k finished (17:22:57)
HOSTNAME:/hd/vol_mnt0/shares/biochem 1 13k finished (13:44:01)
HOSTNAME:/hd/vol_mnt0/shares/confocal 0 1k finished (13:44:56)
HOSTNAME:/hd/vol_mnt0/shares/histo 0 1k finished (13:46:13)
HOSTNAME:/hd/vol_mnt0/shares/immuno 1 44k finished (13:43:28)
HOSTNAME:/hd/vol_mnt0/shares/mcard 0 1k finished (13:46:10)
HOSTNAME:/hd/vol_mnt0/shares/mysql 0 1k finished (13:44:53)
HOSTNAME:/hd/vol_mnt0/shares/neurosci 1 7331k finished (13:44:43)
HOSTNAME:/hd/vol_mnt0/shares/surgery 1 1k finished (13:43:43)
HOSTNAME:/hd/vol_mnt0/shares/sysadmin 0 1990k finished (14:01:28)
HOSTNAME:/hd/vol_mnt0/shares/urology 1 23734k finished (13:46:08)
HOSTNAME:/ 1 13k finished (14:01:36)
HOSTNAME:/files1 0 9512632k finished (19:40:32)
HOSTNAME:/files3 1 565385k finished (14:01:07)
HOSTNAME:/files4 1 7k finished (14:01:37)
HOSTNAME:/files5 0 19194467k finished (12:49:32)
HOSTNAME:/files6 0 42482791k finished (6:09:01)
HOSTNAME:/usr 1 1k finished (14:01:40)
HOSTNAME:/usr/local 1 84929k finished (14:01:23)
HOSTNAME:/ 1 15k finished (14:21:23)
HOSTNAME:/files1 1 8k finished (14:21:24)
HOSTNAME:/files2 1 278233k finished (14:32:04)
HOSTNAME:/usr 1 1k finished (14:21:25)
HOSTNAME:/usr/local 1 20995k finished (14:21:21)
HOSTNAME:/var 1 67529k finished (14:21:02)
HOSTNAME:/var/log 0 67191k finished (14:21:17)
HOSTNAME:/var/mail 1 1325234k finished (14:20:46)

SUMMARY            part       real    estimated
                               size         size
partition       :    51
estimated       :    50               125731490k
flush           :     0         0k
failed          :     1         0k                (  0.00%)
wait for dumping:     0         0k                (  0.00%)
dumping to tape :     1  16717144k                ( 13.30%)
dumping         :     0         0k           0k   (  0.00%) (  0.00%)
dumped          :    50 132653031k   125731490k   (105.51%) (105.51%)
wait for writing:     0         0k           0k   (  0.00%) (  0.00%)
wait to flush   :     0         0k           0k   (100.00%) (  0.00%)
writing to tape :     0         0k           0k   (  0.00%) (  0.00%)
failed to tape  :     0         0k           0k   (  0.00%) (  0.00%)
taped           :    49 115935887k   109014346k   (106.35
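The percentages in the amstatus SUMMARY are simply actual size over estimated size; re-deriving the "dumped" line from the figures above (a sketch using the numbers as printed):

```python
# From the amstatus SUMMARY: real vs. estimated size for the "dumped" line.
dumped_kb = 132653031
estimated_kb = 125731490

pct = dumped_kb / estimated_kb * 100
print(f"{pct:.2f}%")  # 105.51%
```

A figure over 100% just means the planner's estimates ran a little low, not that anything went wrong, which is consistent with the run finishing normally.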
Re: Looooooooong backup
Thanks for all the responses. It finished about 10 minutes after I sent the last message, but it did run out of tape(s)... but that's another issue. Steffan Vigano wrote: Sometimes on my FreeBSD box 'amstatus' output doesn't change during really long dumps. I've found the best way to find out if it's still running is to use the 'ps' command and grep for the phrase 'done', i.e. 'ps -ax | grep done'. Dump does a good job of reporting back how far it thinks it is in the process, e.g. "78% done". -S
tape order
I have 24 tapes and an 8-tape changer. For some reason, it is going 13-16-15-14-17. How can I fix this? Can I just force it to take 14 after 13 by only having 14 in there when it's expecting 16? My tapecycle is 10. Steve
duplicate data?
In my existing Amanda setup, I have /usr and /usr/local in the disklist for the same machine. They are separate disks (or at least separate partitions). Is this correct, or does /usr include /usr/local?
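If you're not sure whether /usr/local really is its own filesystem on that machine, a quick way to check (a sketch using standard tools, nothing Amanda-specific) is to compare where the two paths are mounted:

```shell
# If the "Mounted on" column differs for the two paths, they are separate
# filesystems and need separate disklist entries. If it is the same,
# /usr/local is just a directory inside /usr and a /usr entry covers it.
df -P /usr /usr/local
```

As far as I know, dump-based Amanda backups only ever cover one filesystem per entry, and GNU-tar-based ones are normally run with --one-file-system, so a /usr entry would not descend into a separately mounted /usr/local in either case.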
Re: tape order
Right, but since I have 24 tapes and my dumpcycle is 1 week, I should never have to go that far back as to need these tapes. Jon LaBadie wrote: On Tue, Oct 17, 2006 at 11:44:53AM -0400, Steven Settlemyre wrote: I have 24 tapes and an 8-tape changer. For some reason, it is going 13-16-15-14-17. How can I fix this? Can I just force it to take 14 after 13 by only having 14 in there when it's expecting 16? My tapecycle is 10. As Michael suggests, your human ordering is VERY likely to get out of sequence. Best to forget about it. If you insist: amanda accepts an exact match for the tape it expects next in the sequence it has been using, new tapes (labeled but never used by amanda since labeling), and tapes that are not among the last tapecycle (minus 1, I think) tapes used. So, if you want it to use tape 14, and it has not been used within the last tapecycle tapes, make sure that no other tapes in the changer fit the three categories above. Recognize too that there are probably some level 0 DLE dumps on tape 14 that will be overwritten early. The recovery value of the incrementals on tape 17, etc., that are based on those level 0s is reduced.
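To make the third category concrete, here is a rough sketch (not Amanda's actual code; the sample tapelist contents, the TAPECYCLE value, and the newest-first ordering are my assumptions) of how "not among the last tapecycle minus 1 tapes used" can be read straight out of the tapelist file:

```shell
# Made-up sample tapelist, assumed sorted newest-first as Amanda keeps it:
# each line is "datestamp label reuse-flag".
TAPECYCLE=4
cat > /tmp/tapelist.sample <<'EOF'
20061017 Monthly16 reuse
20061016 Monthly15 reuse
20061015 Monthly14 reuse
20061014 Monthly13 reuse
20061013 Monthly12 reuse
EOF
# Skip the (tapecycle - 1) most recently used tapes; whatever remains is
# old enough to be a reuse candidate.
awk -v skip=$((TAPECYCLE - 1)) 'NR > skip { print $2 }' /tmp/tapelist.sample
```

With tapecycle 4, the three newest tapes are protected and only Monthly13 and Monthly12 come out as candidates, which is the behaviour Jon describes.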
Re: labelling of reused tapes
None of my tapes are new. I put the next 3 tapes that I have into the changer and run amcheck. It seems to be OK with 2 of the 3 tapes but gives me "not an amanda tape (Input/output error)" for slot 3. So I read the archives, which say to label the tape, so I run amlabel:

/usr/sbin/amlabel -f monthly Monthly53 slot 3
changer: got exit: 0 str: 3 /dev/nst0
labeling tape in slot 3 (/dev/nst0):
rewinding
amlabel: tape_rewind: rewinding tape: /dev/nst0: Input/output error
amlabel: tape_rewind: rewinding tape: /dev/nst0: Input/output error
amlabel: pid 17508 finish time Wed Oct 11 09:40:36 2006

Is this a tape problem or am I approaching this incorrectly? Steve

Matt Hyclak wrote: On Tue, Oct 10, 2006 at 03:00:32PM -0400, Steven Settlemyre enlightened us: I have a monthly backup config with runtapes=3 and 59 tapes in the tapecycle. The instructions I was originally given say to run a script that looks in tapelist, finds the highest number, then labels the next 3 tapes starting from there. This seems wrong because then it wouldn't be a cycle. My question is: when I do amcheck, it tells me (expecting tape Monthly04 or a new tape), but the next tape I have in my hand is currently labeled Monthly50, which is correct because it is in the tapelist as "20051107 Monthly50 reuse". How do I solve this problem? How do I tell amanda to look for Monthly50 instead of Monthly04? Or how do I have amanda use the tape I give it as long as it's not recently used? That's the "or a new tape" part. If you've labelled a tape but not used it, it is considered a new tape. Amanda will also accept any tape that is more than tapecycle tapes old. In my case my tapecycle is 16 tapes, but I actually use 20 in rotation, so that if one fails I don't have to replace it immediately. Matt
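When amlabel can't even rewind, it can help to take Amanda out of the picture and poke the drive directly. A sketch using the Linux mt-st tools (the /dev/nst0 device name comes from your output; the existence guard is added so the snippet is safe to paste on a machine without a drive):

```shell
DRIVE=/dev/nst0
if [ -e "$DRIVE" ]; then
    # Try the same operation amlabel failed on, outside of Amanda,
    # then ask the drive for its status.
    mt -f "$DRIVE" rewind && mt -f "$DRIVE" status
else
    echo "no tape drive at $DRIVE"
fi
```

If the rewind fails with "Input/output error" on this cartridge but works on the two tapes amcheck accepted, the cartridge itself is the likely culprit; if it fails on every tape, look at the drive, the cabling, or whether the drive is asking for a cleaning cycle.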
amdump reports tape usage over 100%
Dear List, I have recently stepped into an admin role where amanda is already set up. I have a lot of questions, but I'll try to limit myself. I'm running Amanda 2.5.0p2 on a Debian system with an 8-tape changer.

1) How do I find out if hardware compression is turned on?

2) Why does amreport show over 100% in the Usage by Tape section?

USAGE BY TAPE:
  Label      Time  Size       %      Nb  Nc
  Monthly47  4:44  43875264k  108.8  44  1088

3) What do the levels mean? I think 0 means a full backup, but what are the other numbers?

That's a good start. Thanks, Steve
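For question 1, one way is to ask the drive itself using the Linux mt-st tools (the /dev/nst0 device name is an assumption; the existence guard is added so this is safe to run on a box without a drive):

```shell
DRIVE=/dev/nst0
if [ -e "$DRIVE" ]; then
    # mt-st's datcompression subcommand reports (and, with an argument,
    # sets) the drive's hardware compression state.
    mt -f "$DRIVE" datcompression
else
    echo "no tape drive at $DRIVE"
fi
```

This also bears on question 2: amreport measures usage against the configured tapetype length, so a drive with hardware compression enabled can fit more data than the nominal capacity, which is one common reason the figure exceeds 100%.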
labelling of reused tapes
I have a monthly backup config with runtapes=3 and 59 tapes in the tapecycle. The instructions I was originally given say to run a script that looks in tapelist, finds the highest number, then labels the next 3 tapes starting from there. This seems wrong because then it wouldn't be a cycle. My question is: when I do amcheck, it tells me (expecting tape Monthly04 or a new tape), but the next tape I have in my hand is currently labeled Monthly50, which is correct because it is in the tapelist as "20051107 Monthly50 reuse". How do I solve this problem? How do I tell amanda to look for Monthly50 instead of Monthly04? Or how do I have amanda use the tape I give it as long as it's not recently used? Thanks, Steve