Re: no backups since a week ago (data write: Connection reset by peer)

2007-05-11 Thread Steven Settlemyre
Thanks for all the help... I upgraded my amanda-server version to 2.5.2 
and that seems to have fixed the problem.


Jean-Louis Martineau wrote:

Steven Settlemyre wrote:

From amdump log:

A similar sequence appears every time it fails. At first I was worried 
about the "not enough diskspace" message, but that is just because my 
holding disk filled up. I see the same thing happening in successful 
runs.

In some cases this error is fatal; the bug is fixed in newer releases.


driver: state time 8459.717 free kps: 37761 space: 55584 taper: idle idle-dumpers: 7 qlen tapeq: 0 runq: 18 roomq: 2 wakeup: 0 driver-idle: no-diskspace
driver: interface-state time 8459.717 if : free 37761
driver: hdisk-state time 8459.717 hdisk 0: free 55584 dumpers 1
driver: result time 8459.727 from chunker3: RQ-MORE-DISK 03-4
find diskspace: not enough diskspace. Left with 508960 K
find diskspace: not enough diskspace. Left with 71872 K
find diskspace: not enough diskspace. Left with 17664 K
find diskspace: not enough diskspace. Left with 508960 K
driver: Don't know how to send ABORT command to chunker
taper: DONE [idle wait: 7182.514 secs]
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: pid 1129 finish time Thu May 10 02:06:02 2007
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.517: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.517: pid 31879 finish time Thu May 10 02:06:02 2007
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.485: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.485: pid 31880 finish time Thu May 10 02:06:02 2007
taper: writing end marker. [VOL14 OK kb 2300352 fm 28]
dumper: kill index command
dumper: kill index command
dumper: kill index command
amdump: end at Thu May 10 02:06:02 EDT 2007
Scanning /holding/amanda...
 20070509234502: found Amanda directory.






Re: no backups since a week ago (data write: Connection reset by peer)

2007-05-10 Thread Steven Settlemyre

From amdump log:

A similar sequence appears every time it fails. At first I was worried 
about the "not enough diskspace" message, but that is just because my 
holding disk filled up. I see the same thing happening in successful runs.


driver: state time 8459.717 free kps: 37761 space: 55584 taper: idle idle-dumpers: 7 qlen tapeq: 0 runq: 18 roomq: 2 wakeup: 0 driver-idle: no-diskspace
driver: interface-state time 8459.717 if : free 37761
driver: hdisk-state time 8459.717 hdisk 0: free 55584 dumpers 1
driver: result time 8459.727 from chunker3: RQ-MORE-DISK 03-4
find diskspace: not enough diskspace. Left with 508960 K
find diskspace: not enough diskspace. Left with 71872 K
find diskspace: not enough diskspace. Left with 17664 K
find diskspace: not enough diskspace. Left with 508960 K
driver: Don't know how to send ABORT command to chunker
taper: DONE [idle wait: 7182.514 secs]
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: pid 1129 finish time Thu May 10 02:06:02 2007
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.517: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.517: pid 31879 finish time Thu May 10 02:06:02 2007
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.485: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7888.485: pid 31880 finish time Thu May 10 02:06:02 2007
taper: writing end marker. [VOL14 OK kb 2300352 fm 28]
dumper: kill index command
dumper: kill index command
dumper: kill index command
amdump: end at Thu May 10 02:06:02 EDT 2007
Scanning /holding/amanda...
 20070509234502: found Amanda directory.
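
The sequence in the log above (repeated "not enough diskspace" lines, then the driver and chunker errors) can be pulled out of an amdump log mechanically. What follows is only a minimal sketch and not part of Amanda: the function name `summarize_amdump` is made up for illustration, and the patterns assume the exact message wording seen in this excerpt.

```python
import re

# Patterns assume the wording shown in the log excerpt above.
DISKSPACE_RE = re.compile(r"find diskspace: not enough diskspace\. Left with (\d+) K")
CHUNKER_ERR_RE = re.compile(r"^chunker: error \[(.+)\]$")
DRIVER_ABORT = "driver: Don't know how to send ABORT command to chunker"


def summarize_amdump(lines):
    """Collect the 'Left with N K' values and the driver/chunker errors."""
    left_with_kb = []
    chunker_errors = []
    driver_abort_seen = False
    for raw in lines:
        line = raw.strip()
        match = DISKSPACE_RE.search(line)
        if match:
            left_with_kb.append(int(match.group(1)))
            continue
        match = CHUNKER_ERR_RE.match(line)
        if match:
            # Only the untimed copy of each error matches, so the
            # "chunker: time ...: error [...]" repeat is not counted twice.
            chunker_errors.append(match.group(1))
            continue
        if line == DRIVER_ABORT:
            driver_abort_seen = True
    return {
        "left_with_kb": left_with_kb,
        "chunker_errors": chunker_errors,
        "driver_abort_seen": driver_abort_seen,
    }


# Lines copied from the log excerpt above.
sample = """\
find diskspace: not enough diskspace. Left with 508960 K
find diskspace: not enough diskspace. Left with 71872 K
driver: Don't know how to send ABORT command to chunker
chunker: error [bad command after RQ-MORE-DISK: QUIT]
chunker: time 7311.263: error [bad command after RQ-MORE-DISK: QUIT]
""".splitlines()

summary = summarize_amdump(sample)
print(summary)
```

Running this over a failed and a successful night's log would show whether the "not enough diskspace" lines appear in both, as described above, while the ABORT/QUIT errors appear only in the failed run.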


Gene Heskett wrote:

On Wednesday 09 May 2007, Steven Settlemyre wrote:
  

Can someone please help me?

Steven Settlemyre wrote:


I haven't changed my configs for months and things were running great
until last week. Since last Tuesday, none of my dailies have finished,
and last night a monthly failed.

Looking through the logs, I see the problem always seems to start with
"data write: Connection reset by peer" and "Don't know how to send
ABORT command to chunker". I'm having a hard time interpreting the
logs and can't find much in the archives about this. I was
wondering if someone could walk me through an explanation of the
problem and how to avoid it in the future.

My monthlies run tape spanning on 3 40G tapes.

Here is the email output generated:

*** THE DUMPS DID NOT FINISH PROPERLY!

These dumps were to tape Monthly21.
The next 3 tapes Amanda expects to use are: Monthly01, Monthly02,
Monthly03.
The next 3 new tapes already labelled are: Monthly19, Monthly20,
Monthly22.

FAILURE AND STRANGE DUMP SUMMARY:
 wagstaff    /usr/local                     lev 1  FAILED [data write: Connection reset by peer]
 lollipop    /files1                        lev 0  FAILED [data write: Connection reset by peer]
 helios      /files3                        lev 1  FAILED [data write: Connection reset by peer]
 helios      /                              RESULTS MISSING
 helios      /files2                        RESULTS MISSING
 helios      /usr                           RESULTS MISSING
 helios      /usr/local                     RESULTS MISSING
 helios      /var                           RESULTS MISSING
 lollipop    /                              RESULTS MISSING
 lollipop    /usr                           RESULTS MISSING
 lollipop    /usr/local                     RESULTS MISSING
 wagstaff    /files3                        RESULTS MISSING
 wagstaff    /files4                        RESULTS MISSING
 wagstaff    /files5                        RESULTS MISSING
 wagstaff    /files6/vol/Voiceware          RESULTS MISSING
 wizard      /files2                        RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/TermLab    RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/bcl        RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/biochem    RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/confocal   RESULTS MISSING
 driver: FATAL Don't know how to send ABORT command to chunker
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]


STATISTICS:
                          Total       Full      Incr.
Estimate Time (hrs:min)    0:08
Run Time (hrs:min)         1:01
Dump Time (hrs:min)        1:55       1:40       0:16
Output Size (meg)        8519.7     7729.7      790.1
Original Size (meg)     13146.3    11595.5     1550.8
Avg Compressed Size (%)    64.8       66.7       50.9   (level:#disks ...)
Filesystems Dumped           35         12         23   (1:23)
Avg Dump Rate (k/s)      1261.0     1323.3      863.1

Tape Time (hrs:min)        0:53       0:44       0:09
Tape Size (meg)          8521.6     7730.3      791.3
Tape Used

amanda-client version

2007-05-10 Thread Steven Settlemyre
How do I find out which version of amanda-client is running on my hosts? 
I found the debian ones through the package manager, but wonder if 
there's a command-line switch to use?


Steve


Please help... no backups since a week ago (data write: Connection reset by peer)

2007-05-09 Thread Steven Settlemyre

Can someone please help me?

Steven Settlemyre wrote:
I haven't changed my configs for months and things were running great 
until last week. Since last Tuesday, none of my dailies have finished, 
and last night a monthly failed.

Looking through the logs, I see the problem always seems to start with 
"data write: Connection reset by peer" and "Don't know how to send 
ABORT command to chunker". I'm having a hard time interpreting the 
logs and can't find much in the archives about this. I was 
wondering if someone could walk me through an explanation of the 
problem and how to avoid it in the future.


My monthlies run tape spanning on 3 40G tapes.

Here is the email output generated:

*** THE DUMPS DID NOT FINISH PROPERLY!

These dumps were to tape Monthly21.
The next 3 tapes Amanda expects to use are: Monthly01, Monthly02, 
Monthly03.
The next 3 new tapes already labelled are: Monthly19, Monthly20, 
Monthly22.


FAILURE AND STRANGE DUMP SUMMARY:
 wagstaff    /usr/local                     lev 1  FAILED [data write: Connection reset by peer]
 lollipop    /files1                        lev 0  FAILED [data write: Connection reset by peer]
 helios      /files3                        lev 1  FAILED [data write: Connection reset by peer]
 helios      /                              RESULTS MISSING
 helios      /files2                        RESULTS MISSING
 helios      /usr                           RESULTS MISSING
 helios      /usr/local                     RESULTS MISSING
 helios      /var                           RESULTS MISSING
 lollipop    /                              RESULTS MISSING
 lollipop    /usr                           RESULTS MISSING
 lollipop    /usr/local                     RESULTS MISSING
 wagstaff    /files3                        RESULTS MISSING
 wagstaff    /files4                        RESULTS MISSING
 wagstaff    /files5                        RESULTS MISSING
 wagstaff    /files6/vol/Voiceware          RESULTS MISSING
 wizard      /files2                        RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/TermLab    RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/bcl        RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/biochem    RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/confocal   RESULTS MISSING
 driver: FATAL Don't know how to send ABORT command to chunker
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]


STATISTICS:
                          Total       Full      Incr.
Estimate Time (hrs:min)    0:08
Run Time (hrs:min)         1:01
Dump Time (hrs:min)        1:55       1:40       0:16
Output Size (meg)        8519.7     7729.7      790.1
Original Size (meg)     13146.3    11595.5     1550.8
Avg Compressed Size (%)    64.8       66.7       50.9   (level:#disks ...)
Filesystems Dumped           35         12         23   (1:23)
Avg Dump Rate (k/s)      1261.0     1323.3      863.1

Tape Time (hrs:min)        0:53       0:44       0:09
Tape Size (meg)          8521.6     7730.3      791.3
Tape Used (%)              21.1       19.0        2.1   (level:#disks ...)
Filesystems Taped            35         12         23   (1:23)
                                                        (level:#chunks ...)
Chunks Taped                 35         12         23   (1:23)
Avg Tp Write Rate (k/s)  2724.3     3000.8     1433.6

USAGE BY TAPE:
  Label          Time      Size      %    Nb    Nc
  Monthly21      0:53  8726112k   21.1    35    35


FAILED AND STRANGE DUMP DETAILS:

/--  wagstaff /usr/local lev 1 FAILED [data write: Connection reset by peer]
sendbackup: start [wagstaff level 1]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Date of this level 1 dump: Tue May 08 01:11:26 2007
|   DUMP: Date of last level 0 dump: Mon Apr 30 23:54:14 2007
|   DUMP: Dumping /dev/rdsk/c0t0d0s7 (wagstaff:/usr/local) to standard output.
|   DUMP: Mapping (Pass I) [regular files]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Estimated 13585968 blocks (6633.77MB) on 0.10 tapes.
|   DUMP: Dumping (Pass III) [directories]
|   DUMP: Dumping (Pass IV) [regular files]
|   DUMP: 16.49% done, finished in 0:50
|   DUMP: 28.34% done, finished in 0:57
|   DUMP: 38.89% done, finished in 1:12
\

/--  lollipop /files1 lev 0 FAILED [data write: Connection reset by peer]
sendbackup: start [lollipop level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Date of this level 0 dump: Mon May 07 23:52:56 2007

dumps failing recently (data write: Connection reset by peer)

2007-05-08 Thread Steven Settlemyre
I haven't changed my configs for months and things were running great 
until last week. Since last Tuesday, none of my dailies have finished, and 
last night a monthly failed.

Looking through the logs, I see the problem always seems to start with 
"data write: Connection reset by peer" and "Don't know how to send ABORT 
command to chunker". I'm having a hard time interpreting the logs and 
can't find much in the archives about this. I was wondering if 
someone could walk me through an explanation of the problem and how to 
avoid it in the future.


My monthlies run tape spanning on 3 40G tapes.

Here is the email output generated:

*** THE DUMPS DID NOT FINISH PROPERLY!

These dumps were to tape Monthly21.
The next 3 tapes Amanda expects to use are: Monthly01, Monthly02, Monthly03.
The next 3 new tapes already labelled are: Monthly19, Monthly20, Monthly22.

FAILURE AND STRANGE DUMP SUMMARY:
 wagstaff    /usr/local                     lev 1  FAILED [data write: Connection reset by peer]
 lollipop    /files1                        lev 0  FAILED [data write: Connection reset by peer]
 helios      /files3                        lev 1  FAILED [data write: Connection reset by peer]
 helios      /                              RESULTS MISSING
 helios      /files2                        RESULTS MISSING
 helios      /usr                           RESULTS MISSING
 helios      /usr/local                     RESULTS MISSING
 helios      /var                           RESULTS MISSING
 lollipop    /                              RESULTS MISSING
 lollipop    /usr                           RESULTS MISSING
 lollipop    /usr/local                     RESULTS MISSING
 wagstaff    /files3                        RESULTS MISSING
 wagstaff    /files4                        RESULTS MISSING
 wagstaff    /files5                        RESULTS MISSING
 wagstaff    /files6/vol/Voiceware          RESULTS MISSING
 wizard      /files2                        RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/TermLab    RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/bcl        RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/biochem    RESULTS MISSING
 snapserver  /hd/vol_mnt0/shares/confocal   RESULTS MISSING
 driver: FATAL Don't know how to send ABORT command to chunker
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]


STATISTICS:
                          Total       Full      Incr.
Estimate Time (hrs:min)    0:08
Run Time (hrs:min)         1:01
Dump Time (hrs:min)        1:55       1:40       0:16
Output Size (meg)        8519.7     7729.7      790.1
Original Size (meg)     13146.3    11595.5     1550.8
Avg Compressed Size (%)    64.8       66.7       50.9   (level:#disks ...)
Filesystems Dumped           35         12         23   (1:23)
Avg Dump Rate (k/s)      1261.0     1323.3      863.1

Tape Time (hrs:min)        0:53       0:44       0:09
Tape Size (meg)          8521.6     7730.3      791.3
Tape Used (%)              21.1       19.0        2.1   (level:#disks ...)
Filesystems Taped            35         12         23   (1:23)
                                                        (level:#chunks ...)
Chunks Taped                 35         12         23   (1:23)
Avg Tp Write Rate (k/s)  2724.3     3000.8     1433.6

USAGE BY TAPE:
  Label          Time      Size      %    Nb    Nc
  Monthly21      0:53  8726112k   21.1    35    35


FAILED AND STRANGE DUMP DETAILS:

/--  wagstaff /usr/local lev 1 FAILED [data write: Connection reset by peer]
sendbackup: start [wagstaff level 1]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Date of this level 1 dump: Tue May 08 01:11:26 2007
|   DUMP: Date of last level 0 dump: Mon Apr 30 23:54:14 2007
|   DUMP: Dumping /dev/rdsk/c0t0d0s7 (wagstaff:/usr/local) to standard output.
|   DUMP: Mapping (Pass I) [regular files]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Estimated 13585968 blocks (6633.77MB) on 0.10 tapes.
|   DUMP: Dumping (Pass III) [directories]
|   DUMP: Dumping (Pass IV) [regular files]
|   DUMP: 16.49% done, finished in 0:50
|   DUMP: 28.34% done, finished in 0:57
|   DUMP: 38.89% done, finished in 1:12
\

/--  lollipop /files1 lev 0 FAILED [data write: Connection reset by peer]
sendbackup: start [lollipop level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Date of this level 0 dump: Mon May 07 23:52:56 2007
|   DUMP: Date of last level 0 dump: the epoch
|   DUMP: Dumping 

Re: amrestore help

2007-04-30 Thread Steven Settlemyre
I'd love to, but see my message below. I get an error when I try to run 
amrecover. Is there a way to get past this error?


Guy Dallaire wrote:

amrecover is easier to use

2007/4/27, Steven Settlemyre [EMAIL PROTECTED]:


Actually, I found how to put the desired tape into the active slot of
the changer. But when I run "amrestore /dev/nst0 lollipop /files1" I get
this:

amrestore: missing file header block
amrestore:   2: skipping zoo._.20070419.1.1
amrestore: missing file header block
amrestore:   6: skipping marlin._var.20070419.1.1
amrestore:  10: reached end of information

What now?

Steven Settlemyre wrote:
 I am trying to restore a folder from a tape in a previous cycle (last
 night had a level 0 of the disk, but the folder was deleted prior). I
 have a tape changer and read that amrestore doesn't work with tape
 changers. My dumptypes are gnutar, so I probably can't use amrecover.
 Is this correct? When I tried to run it, it said "amrecover:
 Unexpected end of file, check amindexd*debug on server localhost".

 I've looked all over and can't seem to find what I need (possibly
 because I've never recovered before and don't know what to look for).

 Steve





which report do I believe?

2007-04-30 Thread Steven Settlemyre
I am looking to restore a disk from one of my machines and am having a 
little trouble understanding which report to believe. Using amadmin 
find, I see there was a level 0 on 4/12 and again on 4/19.


When I look at the daily reports I see on 4/12:

pop:/files1.0 in-memory
 taper: no split_diskbuffer specified: using fallback split size of 10240kb to buffer 


and

HOSTNAME  DISK     L  ORIG-kB   OUT-kB   COMP%  MMM:SS    KB/s  MMM:SS    KB/s
--------  -------  -  --------  -------  -----  ------  ------  ------  ------
pop       /files1  0  17024192  9961504   58.5  178:38   926.5  178:39   929.4


Then on 4/19 I see:

NOTES:
 planner: Incremental of pop:/files1 bumped to level 3.

and

HOSTNAME  DISK     L  ORIG-kB   OUT-kB   COMP%  MMM:SS    KB/s  MMM:SS    KB/s
--------  -------  -  --------  -------  -----  ------  ------  ------  ------
pop       /files1  0  17089120  9965664   58.3  148:35  1117.9   30:39  5418.9

So is 4/19 a level 3 or level 0?

Also, in the amadmin find, I see that 4/12 has 973 parts, whereas 4/19 
only has 2 parts.


Why the big difference? What could cause this? And what steps 
should I take to restore this disk?


Re: which report do I believe?

2007-04-30 Thread Steven Settlemyre
I did not change any configuration between the 2 runs. I do not have 
split_diskbuffer set at all.


Jon LaBadie wrote:

Expanding on JLM's accurate comment.

On Mon, Apr 30, 2007 at 11:07:45AM -0400, Steven Settlemyre wrote:
  
I am looking to restore a disk from one of my machines and am having a 
little trouble understanding which report to believe. Using amadmin 
find, I see there was a level 0 on 4/12 and again on 4/19.


When I look at the daily reports I see on 4/12:

pop:/files1.0 in-memory
 taper: no split_diskbuffer specified: using fallback split size of 10240kb 
 to buffer 



I'm guessing you did specify a split_diskbuffer later on, before 4/19.
Since no split_diskbuffer was specified, tiny pieces of 10MB were used
resulting in 973 pieces.
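
The 973 figure can be checked with a line of arithmetic. This is only a sketch under two assumptions: that the 4/12 OUT-kB value for pop:/files1 reads as 9961504 (the ORIG-kB and OUT-kB columns run together in the posted report), and that the part count is simply the output size divided by the 10240 kB fallback split size, rounded up, ignoring any per-part overhead.

```python
# OUT-kB for pop:/files1 on 4/12, as read from the daily report (assumed split
# of the fused ORIG-kB/OUT-kB figure).
out_kb = 9961504
# Fallback split size named in the taper note.
fallback_split_kb = 10240

# Ceiling division: every full 10240 kB piece plus one partial piece.
parts = (out_kb + fallback_split_kb - 1) // fallback_split_kb
print(parts)
```

The result agrees with the 973 parts that amadmin find reports for 4/12.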

  

and

HOSTNAME  DISK     L  ORIG-kB   OUT-kB   COMP%  MMM:SS    KB/s  MMM:SS    KB/s
--------  -------  -  --------  -------  -----  ------  ------  ------  ------
pop       /files1  0  17024192  9961504   58.5  178:38   926.5  178:39   929.4



Then on 4/19 i see:

NOTES:
 planner: Incremental of pop:/files1 bumped to level 3.




Planning goes through many steps; one is to decide whether to bump
to a higher level if an incremental is done.  It was planning to
do so.  But at a later step, other considerations superseded this
because:

  

and

HOSTNAME  DISK     L  ORIG-kB   OUT-kB   COMP%  MMM:SS    KB/s  MMM:SS    KB/s
--------  -------  -  --------  -------  -----  ------  ------  ------  ------
pop       /files1  0  17089120  9965664   58.3  148:35  1117.9   30:39  5418.9

So is 4/19 a level 3 or level 0?



As clearly shown, a level 0 was done.  Would you really expect your level 3
incremental, not even a level 1 or 2, to be the same size as your 4/12 level 0?
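
The size argument can be made concrete with the figures from the two report rows. This is a small sketch, not anything Amanda provides: the ORIG-kB and OUT-kB columns run together in the posted report, so the split used here is an assumption, checked against the printed COMP% values.

```python
# Figures for pop:/files1 from the 4/12 and 4/19 report rows. The column
# split is assumed; it is the one that reproduces the printed COMP% values.
runs = {
    "4/12": {"orig_kb": 17024192, "out_kb": 9961504, "comp_pct": 58.5},
    "4/19": {"orig_kb": 17089120, "out_kb": 9965664, "comp_pct": 58.3},
}


def comp_percent(orig_kb, out_kb):
    """Compressed size as a percentage of the original, to one decimal."""
    return round(100 * out_kb / orig_kb, 1)


for date, run in runs.items():
    # Confirm the assumed column split agrees with the report's COMP% column.
    assert comp_percent(run["orig_kb"], run["out_kb"]) == run["comp_pct"], date

# How the 4/19 dump compares in size with the 4/12 level 0.
size_ratio = runs["4/19"]["orig_kb"] / runs["4/12"]["orig_kb"]
print(f"4/19 original size is {size_ratio:.3f}x the 4/12 level 0")
```

A ratio this close to 1 is what two full dumps a week apart would look like; a level 3 incremental would normally be far smaller.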

  
Also, in the amadmin find, I see that 4/12 has 973 parts, whereas 4/19 
only has 2 parts.


Why the big difference? What could cause this? And what steps 
should I take to restore this disk?





Two pieces or 1000 would be no different.


  


amrestore help

2007-04-27 Thread Steven Settlemyre
I am trying to restore a folder from a tape in a previous cycle (last 
night had a level 0 of the disk, but the folder was deleted prior). I 
have a tape changer and read that amrestore doesn't work with tape 
changers. My dumptypes are gnutar, so I probably can't use amrecover. Is 
this correct? When I tried to run it, it said "amrecover: Unexpected end 
of file, check amindexd*debug on server localhost".


I've looked all over and can't seem to find what I need (possibly 
because I've never recovered before and don't know what to look for).


Steve



Re: amrestore help

2007-04-27 Thread Steven Settlemyre
Actually, I found how to put the desired tape into the active slot of 
the changer. But when I run "amrestore /dev/nst0 lollipop /files1" I get 
this:


amrestore: missing file header block
amrestore:   2: skipping zoo._.20070419.1.1
amrestore: missing file header block
amrestore:   6: skipping marlin._var.20070419.1.1
amrestore:  10: reached end of information

What now?

Steven Settlemyre wrote:
I am trying to restore a folder from a tape in a previous cycle (last 
night had a level 0 of the disk, but the folder was deleted prior). I 
have a tape changer and read that amrestore doesn't work with tape 
changers. My dumptypes are gnutar, so I probably can't use amrecover. 
Is this correct? When I tried to run it, it said "amrecover: 
Unexpected end of file, check amindexd*debug on server localhost".


I've looked all over and can't seem to find what I need (possibly 
because I've never recovered before and don't know what to look for).


Steve



results missing

2006-10-27 Thread Steven Settlemyre
I finally got past the gnutar problem I was having. I ran amcheck and 
everything looked good, but amdump gives me a bunch of results missing 
errors (even for hosts that worked fine before). Attached is the output 
from amdump.


Thanks,
Steve


*** THE DUMPS DID NOT FINISH PROPERLY!

These dumps were to tape VOL24.
The next tape Amanda expects to use is: VOL01.

FAILURE AND STRANGE DUMP SUMMARY:
 helios.medsci.domain      /var                          lev 2  STRANGE
 snapserver.medsci.domain  /hd/vol_mnt0/shares/biochem   lev 1  FAILED [data write: Broken pipe]
 wagstaff.asel.domain      /usr/local                    lev 1  FAILED [data write: Connection reset by peer]
 helios.medsci.domain      /usr/local                    RESULTS MISSING
 lollipop.asel.domain      /files1                       RESULTS MISSING
 wagstaff.asel.domain      /                             RESULTS MISSING
 wagstaff.asel.domain      /files1                       RESULTS MISSING
 wagstaff.asel.domain      /files3                       RESULTS MISSING
 wagstaff.asel.domain      /files4                       RESULTS MISSING
 wagstaff.asel.domain      /files5                       RESULTS MISSING
 wagstaff.asel.domain      /files6/vol/Synth             RESULTS MISSING
 wagstaff.asel.domain      /files6/vol/Voiceware         RESULTS MISSING
 wagstaff.asel.domain      /files6/vol/bvd               RESULTS MISSING
 wagstaff.asel.domain      /files6/vol/spdata1           RESULTS MISSING
 wagstaff.asel.domain      /files6/vol/spdata2           RESULTS MISSING
 wagstaff.asel.domain      /files6/vol/speech7           RESULTS MISSING
 wagstaff.asel.domain      /usr                          RESULTS MISSING
 wizard.asel.domain        /var/mail                     RESULTS MISSING
 wizard.asel.domain        /files2                       RESULTS MISSING
 snapserver.medsci.domain  /hd/vol_mnt0/shares/bcl       RESULTS MISSING
 driver: FATAL Don't know how to send ABORT command to chunker
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]
 chunker: FATAL error [bad command after RQ-MORE-DISK: QUIT]


STATISTICS:
                          Total       Full      Incr.
Estimate Time (hrs:min)    0:10
Run Time (hrs:min)         0:14
Dump Time (hrs:min)        0:14       0:00       0:14
Output Size (meg)         765.7        0.0      765.7
Original Size (meg)      1948.1        0.0     1948.1
Avg Compressed Size (%)    39.3        --        39.3   (level:#disks ...)
Filesystems Dumped           36          0         36   (1:35 2:1)
Avg Dump Rate (k/s)       966.6        --       966.6

Tape Time (hrs:min)        0:04       0:00       0:04
Tape Size (meg)           767.7        0.0      767.7
Tape Used (%)               2.1        0.0        2.1   (level:#disks ...)
Filesystems Taped            36          0         36   (1:35 2:1)
                                                        (level:#chunks ...)
Chunks Taped                 36          0         36   (1:35 2:1)
Avg Tp Write Rate (k/s)  3038.4        --      3038.4

USAGE BY TAPE:
  Label          Time      Size      %    Nb    Nc
  VOL24          0:04   786080k    2.1    36    36


FAILED AND STRANGE DUMP DETAILS:

/--  helios.medsci.domain /var lev 2 STRANGE
sendbackup: start [helios.medsci.domain:/var level 2]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? gtar: ./lib/amanda/asel/index/professor.asel.domain/_usr_local/20061026_1.gz.tmp: Warning: Cannot stat: No such file or directory
? gtar: ./lib/amanda/asel/index/snapserver.medsci.domain/_hd_vol__mnt0_shares_NeuroGen/20061026_1.gz.tmp: Warning: Cannot stat: No such file or directory
| gtar: ./run/postgresql/.s.PGSQL.5432: socket ignored
| Total bytes written: 1115033600 (1.1GiB, 6.5MiB/s)
sendbackup: size 1088900
sendbackup: end
\

/--  snapserver.medsci.domain /hd/vol_mnt0/shares/biochem lev 1 FAILED [data write: Broken pipe]
sendbackup: start [snapserver.medsci.domain:/hd/vol_mnt0/shares/biochem level 1]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
\

/--  wagstaff.asel.domain /usr/local lev 1 FAILED [data write: Connection reset by peer]
sendbackup: start [wagstaff.asel.domain:/usr/local level 1]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Date of this level 1 dump: Thu Oct 26 23:55:10 2006
|   DUMP: Date of last level 0 dump: Mon Oct 23 23:55:43 2006
|   DUMP: Dumping /dev/rdsk/c0t0d0s7 (wagstaff:/usr/local) to standard output.
|   DUMP: Mapping (Pass I) [regular files]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Estimated 11706368 

question about new disks

2006-10-20 Thread Steven Settlemyre
I recently broke a disk up into subfolders to (hopefully) allow amanda 
to do a better job fitting things onto daily single-tape backups. But I 
keep getting the following errors.


HOSTNAME /files6/vol/speech7    lev 0  FAILED [disk /files6/vol/speech7, all estimate failed]
HOSTNAME /files6/vol/spdata2    lev 0  FAILED [disk /files6/vol/spdata2, all estimate failed]
HOSTNAME /files6/vol/spdata1    lev 0  FAILED [disk /files6/vol/spdata1, all estimate failed]
HOSTNAME /files6/vol/bvd        lev 0  FAILED [disk /files6/vol/bvd, all estimate failed]
HOSTNAME /files6/vol/Voiceware  lev 0  FAILED [disk /files6/vol/Voiceware, all estimate failed]
HOSTNAME /files6/vol/Synth      lev 0  FAILED [disk /files6/vol/Synth, all estimate failed]


Is there something else I had to do besides change the disklist file? 
BTW, the dumptype is


define dumptype asel-gnutar {
   comment ASEL backups using gnutar
   options compress-fast
   program GNUTAR
   tape_splitsize 5 Gb
   exclude list /etc/backup-excludes.txt
   index yes
}


Thanks,
Steve


Also (unrelated), I have sent a few emails to the list that never seem 
to make it out. Is this a known issue?


Re: question about new disks

2006-10-20 Thread Steven Settlemyre

Server:
Debian (sarge) Linux.
amanda version: 2.5.0p2

Client:
tar: (GNU tar) 1.13.19
amanda-client: ? how do i find this?

Paul Bijnens wrote:

On 2006-10-20 15:05, Steven Settlemyre wrote:
  

I recently broke a disk up into subfolders to (hopefully) allow amanda
to do a better job fitting things onto daily single-tape backups. But I
keep getting the following errors.

HOSTNAME /files6/vol/speech7    lev 0  FAILED [disk /files6/vol/speech7, all estimate failed]


[...]


Is there any useful error message in one of the debug files?
Have a look in:
   amdump.1  (in directory ~amanda/TheConfig)
   planner.datetime.debug (only 2.5.1+; in /tmp/amanda probably)

What version of Amanda do you have on the client and server?
And what version of gnutar on the client?


  


Re: question about new disks

2006-10-20 Thread Steven Settlemyre
Someone hinted to me that since my client is a Solaris machine, the 
GNUTAR program specified in my dumptype might not be correct. I know that on 
my Linux machines gnutar is /bin/tar, but on my Solaris machine it is at 
/usr/local/bin/tar. Could this be the problem? Also, how would I get my 
dumptype to use this path?


Thanks

Jon LaBadie wrote:

On Fri, Oct 20, 2006 at 10:15:04AM -0400, Steven Settlemyre wrote:
  

Server:
Debian (sarge) Linux.
amanda version: 2.5.0p2

Client:
tar: (GNU tar) 1.13.19
amanda-client: ? how do i find this?




On any amanda installation,

amadmin - version


For amanda developers, an RFE, amadmin syntax is:
amadmin config cmd cmd_args

So for the version cmd you have to supply a config argument.
Thus my - in the above command line.
It would make sense (to me) to allow the version cmd as an
exception to the general syntax and allow amadmin version
(i.e. with or without a config named).  Config doesn't make
much sense for the version cmd.  Nor on clients prior to
recent releases.

  


Re: question about new disks

2006-10-20 Thread Steven Settlemyre
Actually, I couldn't find amadmin anywhere on my client. I did just come 
across the fact that /bin/tar is where GNUTAR is looking. I now see it 
by using your command (on the amanda server), but I found it by running 
"strings /usr/lib/amanda/runtar".


I guess the easiest way for me to fix this problem is to make a symlink 
from /usr/local/bin/tar to /bin/tar (of course I'll back up the 
existing tar first). With a little foresight, would the best way to handle 
this be to configure amanda for gnutar somewhere in the amanda dir and 
create a symlink on each client machine?


Thanks again.
Steve

Jon LaBadie wrote:

On Fri, Oct 20, 2006 at 12:07:03PM -0400, Steven Settlemyre wrote:
  
Someone hinted to me that since my client is a Solaris machine, the 
GNUTAR program specified in my dumptype might not be correct. I know that on 
my Linux machines gnutar is /bin/tar, but on my Solaris machine it is at 
/usr/local/bin/tar. Could this be the problem? Also, how would I get my 
dumptype to use this path?


Thanks

Jon LaBadie wrote:


On Fri, Oct 20, 2006 at 10:15:04AM -0400, Steven Settlemyre wrote:
 
  

Server:
Debian (sarge) Linux.
amanda version: 2.5.0p2

Client:
tar: (GNU tar) 1.13.19
amanda-client: ? how do i find this?

   


On any amanda installation,

amadmin - version

  


You thanked me but probably did not run my suggestion.
Lots of other things print out with the version cmd
including the path to gnutar that will be used.

It is hardcoded at compile time, not a runtime setting.

For that reason I always run configure with the option
to specify where gnutar lives and say it lives in
/usr/local/libexec/amgtar.  Then I make amgtar be a
copy or a link of whichever tar I wish to use.
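
Since the gnutar path is baked in at compile time, a quick way to check many hosts is to capture the text printed by the version command (or by running strings on runtar, as mentioned earlier in the thread) and search it for the path. The sketch below is illustrative only: `find_gnutar_path` is a made-up helper, and the `GNUTAR="..."` form it looks for is an assumption about how the build definitions are printed, so the pattern may need adjusting for a given release.

```python
import re

# Assumed form of the build definition; adjust if your output differs.
GNUTAR_RE = re.compile(r'\bGNUTAR="([^"]*)"')


def find_gnutar_path(version_output):
    """Return the compiled-in gnutar path found in the text, or None."""
    match = GNUTAR_RE.search(version_output)
    return match.group(1) if match else None


# Hypothetical fragment for illustration only, not real output from this site.
sample_output = (
    'defs:  DEFAULT_SERVER="backuphost" GNUTAR="/bin/tar"\n'
    '       COMPRESS_PATH="/bin/gzip"\n'
)

print(find_gnutar_path(sample_output))
```

On a Solaris client where GNU tar lives in /usr/local/bin, a result of /bin/tar would point to the mismatch discussed in this thread.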

  


new disks failed

2006-10-19 Thread Steven Settlemyre
I just added a few new disks to my backup (actually just granulated the 
/files6 disk), and each of them failed with the following error.


HOSTNAME  /files6/vol/speech7           lev 0  FAILED [disk /files6/vol/speech7, all estimate failed]
HOSTNAME  /files6/vol/spdata2           lev 0  FAILED [disk /files6/vol/spdata2, all estimate failed]
HOSTNAME  /files6/vol/spdata1           lev 0  FAILED [disk /files6/vol/spdata1, all estimate failed]
HOSTNAME  /files6/vol/bvd               lev 0  FAILED [disk /files6/vol/bvd, all estimate failed]
HOSTNAME  /files6/vol/Voiceware         lev 0  FAILED [disk /files6/vol/Voiceware, all estimate failed]
HOSTNAME  /files6/vol/Synth             lev 0  FAILED [disk /files6/vol/Synth, all estimate failed]
HOSTNAME  /hd/vol_mnt0/shares/sysadmin  lev 1  FAILED [cannot read header: got 0 instead of 32768]


Is there something special I have to do to add new disks?




Looooooooong backup

2006-10-18 Thread Steven Settlemyre
I have a monthly (full) backup running for about 22 hrs now. Do you 
think there is a problem, or is it possible it's just taking a long 
time? about 150G of data.


Steve


Re: Looooooooong backup

2006-10-18 Thread Steven Settlemyre

looks ok, but i'm not too experienced reading these

See attached.


Jon LaBadie wrote:

On Wed, Oct 18, 2006 at 01:14:57PM -0400, Steven Settlemyre wrote:
  
I have a monthly (full) backup running for about 22 hrs now. Do you 
think there is a problem, or is it possible it's just taking a long 
time? about 150G of data.



Have you tried a few amstatus cmds to see if it is still progressing?

  
HOSTNAME:/                            1      210k finished (13:43:44)
HOSTNAME:/files2                      1      163k finished (13:43:32)
HOSTNAME:/files3                      1     2069k finished (13:44:05)
HOSTNAME:/usr                         0   633514k finished (13:58:13)
HOSTNAME:/usr/local                   1      343k finished (13:43:22)
HOSTNAME:/var                         1   607265k finished (13:55:33)
HOSTNAME:/                            0    29748k finished (16:11:36)
HOSTNAME:/files1                      0  8356982k finished (16:11:24)
HOSTNAME:/usr                         1        2k finished (16:11:39)
HOSTNAME:/usr/local                   1        6k finished (16:11:38)
HOSTNAME:/                            1       43k finished (14:01:35)
HOSTNAME:/files1                      1      388k finished (14:01:30)
HOSTNAME:/usr                         1     8231k finished (14:01:26)
HOSTNAME:/var                         0   902479k finished (13:52:48)
HOSTNAME:/                            1      130k finished (13:43:34)
HOSTNAME:/usr                         1      581k finished (13:43:47)
HOSTNAME:/usr/local                   1      143k finished (13:43:23)
HOSTNAME:/                            0           no estimate
HOSTNAME:/hd/vol_mnt0/shares/BioInf   0 16717144k dumping to tape (12:49:32)
HOSTNAME:/hd/vol_mnt0/shares/MID      1       48k finished (13:45:28)
HOSTNAME:/hd/vol_mnt0/shares/MolGene  0        1k finished (14:01:39)
HOSTNAME:/hd/vol_mnt0/shares/NeuroGen 1        8k finished (13:43:21)
HOSTNAME:/hd/vol_mnt0/shares/TermLab  0 19943829k finished (7:59:15)
HOSTNAME:/hd/vol_mnt0/shares/admin    0        1k finished (13:45:25)
HOSTNAME:/hd/vol_mnt0/shares/bcl      0 11817155k finished (17:22:57)
HOSTNAME:/hd/vol_mnt0/shares/biochem  1       13k finished (13:44:01)
HOSTNAME:/hd/vol_mnt0/shares/confocal 0        1k finished (13:44:56)
HOSTNAME:/hd/vol_mnt0/shares/histo    0        1k finished (13:46:13)
HOSTNAME:/hd/vol_mnt0/shares/immuno   1       44k finished (13:43:28)
HOSTNAME:/hd/vol_mnt0/shares/mcard    0        1k finished (13:46:10)
HOSTNAME:/hd/vol_mnt0/shares/mysql    0        1k finished (13:44:53)
HOSTNAME:/hd/vol_mnt0/shares/neurosci 1     7331k finished (13:44:43)
HOSTNAME:/hd/vol_mnt0/shares/surgery  1        1k finished (13:43:43)
HOSTNAME:/hd/vol_mnt0/shares/sysadmin 0     1990k finished (14:01:28)
HOSTNAME:/hd/vol_mnt0/shares/urology  1    23734k finished (13:46:08)
HOSTNAME:/                            1       13k finished (14:01:36)
HOSTNAME:/files1                      0  9512632k finished (19:40:32)
HOSTNAME:/files3                      1   565385k finished (14:01:07)
HOSTNAME:/files4                      1        7k finished (14:01:37)
HOSTNAME:/files5                      0 19194467k finished (12:49:32)
HOSTNAME:/files6                      0 42482791k finished (6:09:01)
HOSTNAME:/usr                         1        1k finished (14:01:40)
HOSTNAME:/usr/local                   1    84929k finished (14:01:23)
HOSTNAME:/                            1       15k finished (14:21:23)
HOSTNAME:/files1                      1        8k finished (14:21:24)
HOSTNAME:/files2                      1   278233k finished (14:32:04)
HOSTNAME:/usr                         1        1k finished (14:21:25)
HOSTNAME:/usr/local                   1    20995k finished (14:21:21)
HOSTNAME:/var                         1    67529k finished (14:21:02)
HOSTNAME:/var/log                     0    67191k finished (14:21:17)
HOSTNAME:/var/mail                    1  1325234k finished (14:20:46)

SUMMARY          part       real  estimated
                            size       size
partition       :  51
estimated       :  50            125731490k
flush           :   0         0k
failed          :   1                    0k   (  0.00%)
wait for dumping:   0                    0k   (  0.00%)
dumping to tape :   1             16717144k   ( 13.30%)
dumping         :   0         0k         0k (  0.00%) (  0.00%)
dumped          :  50 132653031k 125731490k (105.51%) (105.51%)
wait for writing:   0         0k         0k (  0.00%) (  0.00%)
wait to flush   :   0         0k         0k (100.00%) (  0.00%)
writing to tape :   0         0k         0k (  0.00%) (  0.00%)
failed to tape  :   0         0k         0k (  0.00%) (  0.00%)
taped           :  49 115935887k 109014346k (106.35
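
The percentages in the SUMMARY block can be checked by hand. A minimal Python sketch using only figures from the listing above; reading each percentage as "real size over estimated size" (or, for the "dumping to tape" row, size over the total estimate) is an assumption based on how the numbers line up, not something taken from the amstatus source:

```python
# Figures copied from the SUMMARY block, all in kilobytes.
TOTAL_ESTIMATE_K = 125731490
DUMPED_REAL_K = 132653031
TAPED_REAL_K = 115935887
TAPED_ESTIMATE_K = 109014346
DUMPING_TO_TAPE_K = 16717144


def percent(part_k: int, whole_k: int) -> float:
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100.0 * part_k / whole_k, 2)


dumped_pct = percent(DUMPED_REAL_K, TOTAL_ESTIMATE_K)
taped_pct = percent(TAPED_REAL_K, TAPED_ESTIMATE_K)
to_tape_pct = percent(DUMPING_TO_TAPE_K, TOTAL_ESTIMATE_K)

# Whatever has been dumped but not yet taped should be the dump in flight.
in_flight_k = DUMPED_REAL_K - TAPED_REAL_K

print(f"dumped          : {dumped_pct:.2f}%")
print(f"taped           : {taped_pct:.2f}%")
print(f"dumping to tape : {to_tape_pct:.2f}%")
print(f"dumped - taped  : {in_flight_k}k")
```

The last figure equals the size of the one DLE still shown as "dumping to tape", so at the time of this snapshot the run was still moving data rather than hung.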

Re: Looooooooong backup

2006-10-18 Thread Steven Settlemyre
Thanks for all the responses. It finished about 10 mins after I sent the 
last message, but it did run out of tape(s)... But that's another issue.


Steffan Vigano wrote:
Sometimes on my FreeBSD box 'amstatus' output doesn't change during 
really long dumps.  I've found the best way to find out if it's still 
running is to use the 'ps' command and grep for the word 'done', i.e.
 'ps -ax | grep done'


Dump does a good job of reporting back how far it thinks it is in the 
process, e.g. "78% done".

-S
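
The same check can be scripted. A small Python sketch that pulls the "NN% done" figures out of captured `ps` output; the sample text below is hypothetical (the exact process title that dump sets is not shown in this thread), and only the "% done" phrase comes from the message above:

```python
import re
from typing import List

# Matches "78% done" as well as a fractional form such as "78.3% done".
DONE_RE = re.compile(r"(\d+(?:\.\d+)?)% done")


def percent_done(ps_output: str) -> List[float]:
    """Return every 'NN% done' figure found in the given ps output."""
    return [float(m.group(1)) for m in DONE_RE.finditer(ps_output)]


# Hypothetical ps output; the second line is the grep process itself,
# which contains "done" but no percentage and is therefore ignored.
sample = """\
 1234  ??  S   0:01.23 dump: /dev/da0s1a: pass 4: 78% done
 1301  ??  S   0:00.01 grep done
"""

print(percent_done(sample))
```

Running this twice a few minutes apart and comparing the figures shows whether the dump is still advancing.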


tape order

2006-10-17 Thread Steven Settlemyre
I have 24 tapes and an 8-tape changer. For some reason, it is going 
13-16-15-14-17. How can I fix this? Can I just force it to take 14 
after 13 by only having 14 in there when it's expecting 16? My tapecycle 
is 10.


Steve


duplicate data?

2006-10-17 Thread Steven Settlemyre
In my existing Amanda setup, I have /usr and /usr/local in the disklist 
for the same machine. They are separate disks (or partitions at least). 
Is this correct, or does /usr include /usr/local?


Re: tape order

2006-10-17 Thread Steven Settlemyre
Right, but since I have 24 tapes and my dumpcycle is 1 week, I should 
never have to go back far enough to need these tapes.


Jon LaBadie wrote:

On Tue, Oct 17, 2006 at 11:44:53AM -0400, Steven Settlemyre wrote:
  
I have 24 tapes and an 8-tape changer. For some reason, it is going 
13-16-15-14-17. How can I fix this? Can I just force it to take 14 
after 13 by only having 14 in there when it's expecting 16? My tapecycle 
is 10.



As Michael suggests, your human ordering is VERY likely
to get out of sequence.  Best to forget about it.

If you insist: Amanda looks for an exact match for the sequence
it has been using, new tapes (labeled but never used
by Amanda since labeling), and tapes that are not among
the last tapecycle (minus 1, I think) tapes used.

So, if you want it to use tape 14, and it has not been used
in the last tapecycle number of tapes, make sure that no
other tapes in the changer fit the three categories above.

Recognize too that there are probably some level 0 DLE dumps
on tape 14 that will be overwritten early.  The recovery value
of the incrementals on tape 17, etc., that are based on those
level 0's is reduced.
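
The three categories described above can be written down as a small predicate. This is only a sketch of the rule as stated in the message, not Amanda's actual selection code: the function name, the `TapeNN` label format, and the sample history are hypothetical, and the "minus 1" is kept only because the message itself hedges on it.

```python
from typing import List, Set


def tape_is_acceptable(label: str,
                       expected: str,
                       new_labels: Set[str],
                       use_history: List[str],
                       tapecycle: int) -> bool:
    """Apply the three categories from the message to one tape label.

    use_history lists labels in order of use, oldest first.
    """
    if label == expected:          # exact match for the sequence in use
        return True
    if label in new_labels:        # labeled but never used since labeling
        return True
    # Otherwise the tape must not be among the last (tapecycle - 1) used.
    window = max(tapecycle - 1, 0)
    recent = set(use_history[-window:]) if window else set()
    return label not in recent


# Hypothetical history: tapes 1..12 in order, then 13-16-15-14-17
# as in the thread.
order = list(range(1, 13)) + [13, 16, 15, 14, 17]
history = [f"Tape{n:02d}" for n in order]

for candidate in ("Tape14", "Tape05", "Tape18"):
    ok = tape_is_acceptable(candidate, expected="Tape18",
                            new_labels={"Tape24"},
                            use_history=history, tapecycle=10)
    print(candidate, "accepted" if ok else "rejected")
```

With a tapecycle of 10, Tape14 is rejected in this sketch because it is among the last nine tapes used, which is the situation the message warns about.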

  


Re: labelling of reused tapes

2006-10-11 Thread Steven Settlemyre
None of my tapes are new. I put the next 3 tapes that I have in the 
changer and ran amcheck. It seems to be OK with 2 of the 3 tapes but 
gives me "not an amanda tape (Input/output error)" for slot 3. The 
archives say to label the tape, so I ran amlabel:


$ /usr/sbin/amlabel -f monthly Monthly53 slot 3
changer: got exit: 0 str: 3 /dev/nst0
labeling tape in slot 3 (/dev/nst0):
rewinding
amlabel: tape_rewind: rewinding tape: /dev/nst0: Input/output error
amlabel: tape_rewind: rewinding tape: /dev/nst0: Input/output error
amlabel: pid 17508 finish time Wed Oct 11 09:40:36 2006

Is this a tape problem or am I approaching this incorrectly?

Steve


Matt Hyclak wrote:

On Tue, Oct 10, 2006 at 03:00:32PM -0400, Steven Settlemyre enlightened us:
  
I have a monthly backup config with runtapes=3 and 59 tapes in the 
tapecycle. The instructions I was originally given say to run a script 
that looks in tapelist, finds the highest number, then labels the next 3 
tapes starting from there. This seems wrong because then it wouldn't be a 
cycle. My question: when I run amcheck, it tells me "expecting tape 
Monthly04 or a new tape". But the next tape I have in my hand is 
currently labeled Monthly50, which is correct because it is in the 
tapelist as "20051107 Monthly50 reuse". How do I solve this problem? How 
do I tell Amanda to look for Monthly50 instead of Monthly04? Or how 
do I have Amanda use the tape I give it as long as it's not recently used?





That's the "or a new tape" part. If you've labelled a tape but not used it, it
is considered a new tape. Amanda will also accept any tape that is more than
tapecycle tapes old. In my case, tapecycle is 16 tapes but I actually
use 20 in rotation, so that if one fails I don't have to replace it
immediately.

Matt

  


amdump reports tape usage over 100%

2006-10-10 Thread Steven Settlemyre

Dear List,

I have recently stepped into an admin role where Amanda is already set 
up. I have a lot of questions, but I'll try to limit myself. I'm running 
Amanda 2.5.0p2 on a Debian system with an 8-tape changer.


1) How do I find out if hardware compression is turned on?

2) Why does the amreport show over 100% in the Usage by Tape section?

USAGE BY TAPE:
  Label        Time       Size      %    Nb    Nc
  Monthly47    4:44  43875264k  108.8    44  1088


3) What do the levels mean? I think 0 means full backup, but what are 
the other numbers?


That's a good start.

Thanks,
Steve
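
For question 2, one way to read the figure: assuming the % column is the amount written divided by the tape length configured for the tapetype (an assumption about amreport, not something stated in this thread), a value above 100 means more data went onto the tape than the configured length allows for, which hardware compression or a conservative length setting could explain. A short Python sketch that backs out the implied length, reading the run-together figures in the report as 108.8%:

```python
# Figures from the USAGE BY TAPE line for Monthly47.
size_written_k = 43875264   # "Size" column, in kilobytes
usage_percent = 108.8       # "%" column (assumed reading of the fused digits)

# If usage = written / configured_length, then the configured length is:
implied_length_k = size_written_k / (usage_percent / 100.0)
excess_k = size_written_k - implied_length_k

print(f"implied configured length : {implied_length_k:,.0f}k")
print(f"written beyond that length: {excess_k:,.0f}k")
```

Comparing the implied length with the length entry of the tapetype in amanda.conf would confirm or refute this reading.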


labelling of reused tapes

2006-10-10 Thread Steven Settlemyre
I have a monthly backup config with runtapes=3 and 59 tapes in the 
tapecycle. The instructions I was originally given say to run a script 
that looks in tapelist, finds the highest number, then labels the next 3 
tapes starting from there. This seems wrong because then it wouldn't be a 
cycle. My question: when I run amcheck, it tells me "expecting tape 
Monthly04 or a new tape". But the next tape I have in my hand is 
currently labeled Monthly50, which is correct because it is in the 
tapelist as "20051107 Monthly50 reuse". How do I solve this problem? How 
do I tell Amanda to look for Monthly50 instead of Monthly04? Or how 
do I have Amanda use the tape I give it as long as it's not recently used?


Thanks
Steve