Re: lev 0 FAILED [data timeout]

2019-05-28 Thread Jon LaBadie
On Tue, May 28, 2019 at 12:57:18PM +1000, Tom Robinson wrote:
> On Tue, 14 May 2019 at 22:14, Nathan Stratton Treadway 
> wrote:
> 
> Hi Nathan,
> 
> Thanks for your reply and help.
> 
> On Mon, May 13, 2019 at 09:59:13 +1000, Tom Robinson wrote:
.
> 
> I am getting a new issue now which is annoying but not a show stopper. I
> think I will have to revisit my threshold settings to fix this but maybe
> you can offer some insight.
> 
> I have a tape robot and the following settings in amanda.conf
> 
> runtapes 3
> flush-threshold-dumped 100
> flush-threshold-scheduled 100
> taperflush 100
> autoflush yes
> 
> There is enough room for all the data to go on three tapes yet after the
> amdump run is complete only two tapes are written and I am left to flush
> the remaining dumps to tape manually.
> 
> I think it's because I'm trying to get a whole tape's worth of data before
> writing to tape. Is my thinking correct?

With "taperflush 100" you are saying do not write to tape unless
the tape will be filled.  So the third tape, being a partial, is
not written.

However, you should not need to flush the data manually; "autoflush yes"
will write the leftover dumps onto the first tape of the next run.
> 
> What I'd like to do is make sure there's a tape's worth of data to write to
> the first two tapes in turn and then dump all remaining backup data to tape
> three (this will not be a complete tape's worth).
> 
> Should I be setting taperflush as follows to achieve this?
> 
> taperflush 0

Yes, according to the comments in the sample "amanda.conf":

# You want to improve tape performance by waiting for a complete tape
# of data before writing anything. However, all dumps will be flushed;
# none will be left on the holding disk.
#
# flush-threshold-dumped 100 # (or more)
# flush-threshold-scheduled 100 # (or more)
# taperflush 0
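
The end-of-run rule can be pictured with a small Python sketch. It assumes, following my reading of the amanda.conf man page, that a new tape is started for leftover holding-disk data only when that data reaches taperflush percent of one volume. The function name, the byte figures and the exact comparison are illustrative assumptions; this is a toy model, not Amanda's scheduler.

```python
def starts_final_tape(leftover_bytes: int, tape_bytes: int, taperflush_pct: int) -> bool:
    """Toy model: would the run start one more tape for the leftover dumps?"""
    if leftover_bytes <= 0:
        return False  # nothing left on the holding disk
    threshold = tape_bytes * taperflush_pct / 100
    return leftover_bytes >= threshold


TAPE = 800 * 10**9      # hypothetical 800 GB volume
LEFTOVER = 300 * 10**9  # hypothetical data left over for a partial third tape

print(starts_final_tape(LEFTOVER, TAPE, 100))  # taperflush 100 -> False
print(starts_final_tape(LEFTOVER, TAPE, 0))    # taperflush 0   -> True
```

With the threshold at 100 the partial third tape is never started, which matches the behaviour described in this thread; with 0 any leftover data qualifies.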

jl
-- 
Jon H. LaBadie j...@jgcomp.com
 11226 South Shore Rd.  (703) 787-0688 (H)
 Reston, VA  20190  (703) 935-6720 (C)


Re: lev 0 FAILED [data timeout]

2019-05-27 Thread Tom Robinson
On Tue, 14 May 2019 at 22:14, Nathan Stratton Treadway 
wrote:

Hi Nathan,

Thanks for your reply and help.

On Mon, May 13, 2019 at 09:59:13 +1000, Tom Robinson wrote:
> > I have a weekly backup that backs-up the daily disk based backup to tape
> > (daily's are a separate amanda config).
>
> (As a side note: if you are running Amanda 3.5 you might consider using
> vaulting to do this sort of backup, so that Amanda knows about the
> copies that are put onto the tape.)
>
>
Unfortunately, no. We're on 3.3.3. Considering updating that on an Illumos
variant (OmniOS CE)... looks like I may have to compile a custom package.
Originally I installed a CSW package but I haven't seen any updates for
that as of yet.



> > Occasionally on the weekly backup a DLE will fail to dump writing only a
> > 32k header file before timing out.
> >
> > I can't seem to identify the error when looking in logging. Has anyone a
> > few clues as to what to look for?
> >
> > FAILURE DUMP SUMMARY:
> >   monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [data timeout]
> >   monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [dumper returned FAILED]
> >   monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [data timeout]
> >   monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [dumper returned FAILED]
>
> [...]
> >
> > A while ago I changed estimate to calcsize as the estimates were taking
> > a very long time and all daily slots are a known maximum fixed size. I
> > thought it might help with time outs. Alas not.
>
> Amanda's estimate timeouts are separate from data timeouts; changing to
> calcsize will help with the former but not the latter.  (Note that if
> the slots are really a known size, using "server" estimate is even
> faster than "calcsize", since it then just uses the size from the
> previous run and doesn't read through the current contents of the DLE
> directory to add up the sizes.)
>
> You can control the data timeouts with the "dtimeout" parameter in
> amanda.conf.  Just try bumping that up so that you are sure it's longer
> than a dump actually takes.
>

My dtimeout was set to half an hour. After I checked some logs and found that
the average dump time for my DLEs was around 55 minutes, I adjusted the
dtimeout to 1 hour. The last run went well, so I'm waiting on the next run to
see if it's consistent.
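
A rough way to pick such a value is to take the longest observed dump and add some headroom. The Python sketch below does that; the dump times, the 25% headroom and the rounding step are made-up illustration values, not Amanda defaults. dtimeout is given in seconds in amanda.conf (compare the 1800 and 3600 quoted elsewhere in these threads). As far as I understand the man page, it is documented as idle time per disk rather than total dump time, so sizing it from whole-dump durations is the conservative rule of thumb suggested in this thread.

```python
import math


def suggest_dtimeout(dump_minutes, headroom=1.25, step=300):
    """Return a dtimeout in seconds that exceeds the longest observed dump.

    headroom and step are arbitrary illustration values.
    """
    longest_s = max(dump_minutes) * 60
    return math.ceil(longest_s * headroom / step) * step


observed = [48, 55, 61]  # hypothetical per-DLE dump times in minutes
print(suggest_dtimeout(observed))  # 4800
```

The result would go into amanda.conf as the dtimeout value for that configuration.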


> (The sendbackup..debug and runtar..debug client log
> files should confirm that the GNU tar is running without error but then
> unable to write to the output pipe on the server side.  In the server
> logs, I think the 'data timeout reached, aborting'-sort of message would
> be found in the dumper..debug file for that run...)
>

Yes, that helps; now I know where to look!

I am getting a new issue now which is annoying but not a show stopper. I
think I will have to revisit my threshold settings to fix this but maybe
you can offer some insight.

I have a tape robot and the following settings in amanda.conf

runtapes 3
flush-threshold-dumped 100
flush-threshold-scheduled 100
taperflush 100
autoflush yes

There is enough room for all the data to go on three tapes yet after the
amdump run is complete only two tapes are written and I am left to flush
the remaining dumps to tape manually.

I think it's because I'm trying to get a whole tape's worth of data before
writing to tape. Is my thinking correct?

What I'd like to do is make sure there's a tape's worth of data to write to
the first two tapes in turn and then dump all remaining backup data to tape
three (this will not be a complete tape's worth).

Should I be setting taperflush as follows to achieve this?

taperflush 0

Kind regards,
Tom


Re: lev 0 FAILED [data timeout]

2019-05-14 Thread Nathan Stratton Treadway
On Mon, May 13, 2019 at 09:59:13 +1000, Tom Robinson wrote:
> I have a weekly backup that backs-up the daily disk based backup to tape
> (daily's are a separate amanda config).

(As a side note: if you are running Amanda 3.5 you might consider using
vaulting to do this sort of backup, so that Amanda knows about the
copies that are put onto the tape.)


 
> Occasionally on the weekly backup a DLE will fail to dump writing only a 32k
> header file before timing out.
>
> I can't seem to identify the error when looking in logging. Has anyone a few
> clues as to what to look for?
>
> FAILURE DUMP SUMMARY:
>   monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [data timeout]
>   monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [dumper returned FAILED]
>   monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [data timeout]
>   monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [dumper returned FAILED]

[...]
> 
> A while ago I changed estimate to calcsize as the estimates were taking a
> very long time and all daily slots are a known maximum fixed size. I thought
> it might help with time outs. Alas not.

Amanda's estimate timeouts are separate from data timeouts; changing to
calcsize will help with the former but not the latter.  (Note that if
the slots are really a known size, using "server" estimate is even
faster than "calcsize", since it then just uses the size from the
previous run and doesn't read through the current contents of the DLE
directory to add up the sizes.)

You can control the data timeouts with the "dtimeout" parameter in
amanda.conf.  Just try bumping that up so that you are sure it's longer
than a dump actually takes.

(The sendbackup..debug and runtar..debug client log
files should confirm that the GNU tar is running without error but then
unable to write to the output pipe on the server side.  In the server
logs, I think the 'data timeout reached, aborting'-sort of message would
be found in the dumper..debug file for that run...)
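
For anyone who prefers a script to paging through the logs by hand, here is a minimal Python sketch of that search. The glob pattern `dumper.*.debug`, the search string and the function name are assumptions made for illustration; the debug directory and the exact wording of the message depend on the Amanda version and build.

```python
from pathlib import Path


def find_timeout_lines(log_dir, pattern="dumper.*.debug", needle="data timeout"):
    """Yield (file name, line number, text) for every log line containing needle."""
    for path in sorted(Path(log_dir).glob(pattern)):
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if needle in line:
                    yield path.name, lineno, line.rstrip("\n")
```

Calling `list(find_timeout_lines(d))` with `d` set to the server-side debug directory for the config returns the matching lines, which shows which run and which dumper process hit the timeout.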

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


lev 0 FAILED [data timeout]

2019-05-12 Thread Tom Robinson
Hi,

I have a weekly backup that backs-up the daily disk based backup to tape
(daily's are a separate amanda config).

Occasionally on the weekly backup a DLE will fail to dump writing only a 32k
header file before timing out.

I can't seem to identify the error when looking in logging. Has anyone a few
clues as to what to look for?

FAILURE DUMP SUMMARY:
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [data timeout]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [dumper returned FAILED]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [data timeout]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [dumper returned FAILED]

FAILED DUMP DETAILS:
  /-- monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [data timeout]
  sendbackup: start [monza:/data/backup/amanda/vtapes/daily/slot9 level 0]
  sendbackup: info BACKUP=/opt/csw/bin/gtar
  sendbackup: info RECOVER_CMD=/opt/csw/bin/gtar -xpGf - ...
  sendbackup: info end
  \
  /-- monza /data/backup/amanda/vtapes/daily/slot9 lev 0 FAILED [data timeout]
  sendbackup: start [monza:/data/backup/amanda/vtapes/daily/slot9 level 0]
  sendbackup: info BACKUP=/opt/csw/bin/gtar
  sendbackup: info RECOVER_CMD=/opt/csw/bin/gtar -xpGf - ...
  sendbackup: info end
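
When the same DLE shows up several times in a report, grouping the summary lines makes the pattern easier to see. The Python sketch below parses lines in the shape shown above; the regular expression is an assumption based only on this sample, not on a documented report format.

```python
import re
from collections import Counter

# host, disk, level, reason; assumes the "host disk lev N FAILED [reason]" shape
SUMMARY_RE = re.compile(r"^\s*(\S+)\s+(\S+)\s+lev\s+(\d+)\s+FAILED\s+\[(.+)\]\s*$")


def count_failures(report_text):
    """Count FAILED summary lines per (host, disk, reason)."""
    counts = Counter()
    for line in report_text.splitlines():
        m = SUMMARY_RE.match(line)
        if m:
            host, disk, _level, reason = m.groups()
            counts[(host, disk, reason)] += 1
    return counts


sample = """\
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [data timeout]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [dumper returned FAILED]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [data timeout]
  monza /data/backup/amanda/vtapes/daily/slot9 lev 0  FAILED [dumper returned FAILED]
"""
for key, n in count_failures(sample).items():
    print(n, *key)
```

For the summary above this reports two "data timeout" entries and two "dumper returned FAILED" entries for the slot9 DLE.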

DLEs are using:

    root-tar
    strategy noinc
    estimate calcsize
    index no

A while ago I changed estimate to calcsize as the estimates were taking a very
long time and all daily slots are a known maximum fixed size. I thought it
might help with time outs. Alas not.

Kind regards,
Tom




FOLLOW-UP: getting message lev 0 failed [data timeout]

2001-10-09 Thread Kurt Yoder

Kurt Yoder wrote:
> 
> Hello
> 
> (I tried sending this message to the list two times so far, but didn't
> see it get posted, so I'm trying again)

My apologies for reposting this message; I finally received the two
original posts from the mailing list today (10/3/01), two weeks after I
originally posted them. I don't know if it's something wrong with our
mail setup or something wrong with the mailing list software.

> I'm getting this error whenever I try to run amdump (2.4.2p1) on freebsd
> 4.2:
> 
> /-- galadriel. /usr lev 0 FAILED [data timeout]
> sendbackup: start [galadriel.shcorp.com:/usr level 0]
> sendbackup: info BACKUP=/usr/local/bin/gtar
> sendbackup: info RECOVER_CMD=/usr/local/bin/gtar -f... -
> sendbackup: info end
> ? sendbackup: index tee cannot write [Broken pipe]
> ? index returned 1
> sendbackup: error [/usr/local/bin/gtar got signal 13]
> \

[snip]

I got this working by doubling the etimeout setting in amanda.conf.



getting message lev 0 failed [data timeout]

2001-09-26 Thread Kurt Yoder

Hello

(I tried sending this message to the list two times so far, but didn't
see it get posted, so I'm trying again)

I'm getting this error whenever I try to run amdump (2.4.2p1) on freebsd
4.2:

/-- galadriel. /usr lev 0 FAILED [data timeout]
sendbackup: start [galadriel.shcorp.com:/usr level 0]
sendbackup: info BACKUP=/usr/local/bin/gtar
sendbackup: info RECOVER_CMD=/usr/local/bin/gtar -f... -
sendbackup: info end
? sendbackup: index tee cannot write [Broken pipe]
? index returned 1
sendbackup: error [/usr/local/bin/gtar got signal 13]
\

All the other partitions on galadriel and the other hosts I'm backing up
are OK, so the amanda client install seems alright. I've also tried
installing the latest version of gnutar; 1.13.22 and increasing the
dtimeout setting in amanda.conf from 1800 to 3600. Is there anything
else I can try (such as upping one of the other timeout settings)?

here's /tmp/amanda/sendbackup.debug; doesn't seem to shed any light on
it (although I wonder why it says 1970):

sendbackup: debug 1 pid 10243 ruid 1000 euid 1000 start time Wed Sep 19
02:13:31 2001
/usr/local/libexec/amanda/sendbackup: got input request: GNUTAR /usr 0
1970:1:1:0:0:0 OPTIONS
|;bsd-auth;index;exclude-list=/usr/local/lib/amanda/exclude.gtar;
  parsed request as: program `GNUTAR' disk `/usr' lev 0 since
1970:1:1:0:0:0 opt
`|;bsd-auth;index;exclude-list=/usr/local/lib/amanda/exclude.gtar;'
  waiting for connect on 3189, then 3190, then 3191
  got all connections
sendbackup: doing level 0 dump as listed-incremental:
/usr/local/var/amanda/gnutar-lists/galadriel.shcorp.com_usr_0.new
sendbackup: doing level 0 dump from date: 1970-01-01  0:00:00 GMT
sendbackup: spawning /usr/local/libexec/amanda/runtar in pipeline
sendbackup: argument list: gtar --create --directory /usr
--listed-incremental
/usr/local/var/amanda/gnutar-lists/galadriel.shcorp.com_usr_0.new
--sparse --one-file-system --ignore-failed-read --totals
--file - --exclude-from /usr/local/lib/amanda/exclude.gtar .
sendbackup-gnutar: pid 10245: /usr/local/libexec/amanda/runtar --create
--directory /usr --listed-incremental
/usr/local/var/amanda/gnutar-lists/galadriel.shcorp.com_usr_0.new
--sparse --one-file-system --ignore-failed-read --totals --file
-
/usr/local/lib/amanda/exclude.gtar--exclude-from/usr/local/lib/amanda/exclude.gtar
sendbackup: started index creator: /usr/local/bin/gtar -tf -
2>/dev/null | sed
-e 's/^\.//'
index tee cannot write [Broken pipe]
index tee cannot write [Broken pipe]
sendbackup: pid 10244 finish time Wed Sep 19 02:28:02 2001
error [/usr/local/bin/gtar got signal 13]
error [/usr/local/bin/gtar got signal 13]
sendbackup: pid 10243 finish time Wed Sep 19 02:28:02 2001
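
One thing the debug file does show is how long sendbackup ran before the pipe broke. A small Python sketch of that arithmetic, using the start and finish timestamps from the log above (the format string is an assumption that fits these two lines):

```python
from datetime import datetime

FMT = "%a %b %d %H:%M:%S %Y"  # e.g. "Wed Sep 19 02:13:31 2001"


def elapsed_seconds(start, finish):
    """Seconds between two timestamps in the sendbackup.debug style."""
    delta = datetime.strptime(finish, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


secs = elapsed_seconds("Wed Sep 19 02:13:31 2001", "Wed Sep 19 02:28:02 2001")
print(secs, divmod(secs, 60))  # 871 (14, 31)
```

About fourteen and a half minutes is shorter than either dtimeout value tried (1800 or 3600 seconds), which may be a hint that a different timeout was involved; the follow-up in this thread reports that doubling etimeout made the dump work.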



Re: amanda: lev 0 FAILED [data timeout]

2001-01-19 Thread afm

On Tue, 16 Jan 2001, Bruce Ferrell wrote:

> Looks like the network dropped out

No, that's not the case. The same behaviour happens consistently, all
the time, and only with that computer (not the other 5, which are on
the same hub). That computer can access the network fine (it's running
a mail server, httpd, ssh server).
Lately what happened was something even more interesting: I had
a dump of about 5-6 machines, and when it got to the funny one (the one
running Linux that doesn't behave) it started dumping (or so it said)
and the amanda server kept waiting and waiting. I let it sit there for
a day or so. The client had sendbackup and all the other amanda
processes running as if it were backing up properly, but it
was stalled. I had to manually kill sendbackup on that client in order
for the rest of the disklist to continue its backup.
I will do some more debugging... In the meantime, if nobody
can figure out what it is, I would also appreciate advice on what info
to gather, so that you can better help debug.

 
> Tony Magni wrote:
> >
> > Again, only on one of my linux boxes, amcheck returns no errors, file
> > size estimation goes well. Backup starts, then hangs (just for the
> > one computer). I can backup the /boot partition fine (it's only a few
> > megs), but not the / and /home. I get strangenesses from sendbackup:
> >
> > /-- myhostname  /dev/sda2 lev 0 FAILED [data timeout]
> > ? dumper: strange [missing size line from sendbackup]
> > ? dumper: strange [missing end line from sendbackup]
> > \
> >
> > and:
> >
> > /-- myhostname  /home lev 0 FAILED [data timeout]
> > sendbackup: start [discordia:/home level 0]
> > sendbackup: info BACKUP=/sbin/dump
> > sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/sbin/restore -f... -
> > sendbackup: info COMPRESS_SUFFIX=.gz
> > sendbackup: info end
> > |   DUMP: Date of this level 0 dump: Sun Jan 14 21:31:56 2001
> > |   DUMP: Date of last level 0 dump: the epoch
> > |   DUMP: Dumping /dev/sdb1 (/home) to standard output
> > |   DUMP: Label: none
> > |   DUMP: mapping (Pass I) [regular files]
> > |   DUMP: mapping (Pass II) [directories]
> > |   DUMP: estimated 3704606 tape blocks.
> > |   DUMP: Volume 1 started at: Sun Jan 14 21:32:41 2001
> > |   DUMP: dumping (Pass III) [directories]
> > |   DUMP: dumping (Pass IV) [regular files]
> > \
> >
> > To me it seems like dump is doing a good job. Also this from
> > /tmp/amanda/amanda.debug: all seems well up until here:
> >
> > amandahosts security check passed
> > amandad: running service "/usr/lib/amanda/sendbackup"
> > amandad: sending REP packet:
> >
> > Amanda 2.4 REP HANDLE 000-38D00508 SEQ 979605303
> > CONNECT DATA 3041 MESG 3042 INDEX 3043
> > OPTIONS ;compress-fast;bsd-auth;index;
> >
> >
> > amandad: waiting for ack: timeout, retrying
> > amandad: got ack:
> >
> > Amanda 2.4 ACK HANDLE 000-38D00508 SEQ 979605303
> >
> >
> > amandad: pid 31153 finish time Mon Jan 15 20:02:05 2001
> >
> > There it had to retry to get the ack. But then in sendbackup.debug:
> >
> > sendbackup: started index creator: "/sbin/restore -tvf - 2>&1 | sed -e '
> > s/^leaf[]*[0-9]*[   ]*\.//
> > t
> > /^dir[  ]/ {
> > s/^dir[ ]*[0-9]*[   ]*\.//
> > s%$%/%
> > t
> > }
> > d
> > '"
> > index tee cannot write [Broken pipe]
> >
> > And I get [data timeout] as output of amstatus.
> >
> > Ideas?
> >
> > Thank you for your support. Let me know if any of these questions are
> > worth posting on the newsgroup. I don't post there, since I am not
> > sure if these are common knowledge questions or not, although I have
> > not found the answer anywhere.
> >
> > --
> > Tony Magni
> > Department of Neurological Surgery
> > University Hospitals of Cleveland
> > (216)844-1306, [EMAIL PROTECTED],
> > http://discordia.cwru.edu/tinton/
> >
> >  - "I know Cleveland: I spent a year there one night!"
> >   (Bill, San Francisco, 1994)
 

-- 
Tony Magni
Department of Neurological Surgery
University Hospitals of Cleveland
(216)844-1306, [EMAIL PROTECTED], 
http://discordia.cwru.edu/tinton/  

 - "I know Cleveland: I spent a year there one night!"
  (Bill, San Francisco, 1994)




Re: amanda: lev 0 FAILED [data timeout]

2001-01-19 Thread John R. Jackson

> I will do some more debugging... In the meanwhile, if nobody
> can figure out what it is, I would also appreciate advice on what info
> to gather, so that you can better help debug.

Attach a debugger to the sendbackup process and use "where" to get a
call stack.  Ditto for the dumper it is connected to on the server.

It would help if you had the client and server built with -g (debugging
symbols).

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]



amanda: lev 0 FAILED [data timeout]

2001-01-16 Thread Tony Magni

Again, only on one of my linux boxes, amcheck returns no errors, file
size estimation goes well. Backup starts, then hangs (just for the
one computer). I can backup the /boot partition fine (it's only a few
megs), but not the / and /home. I get strangenesses from sendbackup:

/-- myhostname  /dev/sda2 lev 0 FAILED [data timeout]
? dumper: strange [missing size line from sendbackup]
? dumper: strange [missing end line from sendbackup]
\

and:

/-- myhostname  /home lev 0 FAILED [data timeout]
sendbackup: start [discordia:/home level 0]
sendbackup: info BACKUP=/sbin/dump
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/sbin/restore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Date of this level 0 dump: Sun Jan 14 21:31:56 2001
|   DUMP: Date of last level 0 dump: the epoch
|   DUMP: Dumping /dev/sdb1 (/home) to standard output
|   DUMP: Label: none
|   DUMP: mapping (Pass I) [regular files]
|   DUMP: mapping (Pass II) [directories]
|   DUMP: estimated 3704606 tape blocks.
|   DUMP: Volume 1 started at: Sun Jan 14 21:32:41 2001
|   DUMP: dumping (Pass III) [directories]
|   DUMP: dumping (Pass IV) [regular files]
\

To me it seems like dump is doing a good job. Also this from
/tmp/amanda/amanda.debug: all seems well up until here:

amandahosts security check passed
amandad: running service "/usr/lib/amanda/sendbackup"
amandad: sending REP packet:

Amanda 2.4 REP HANDLE 000-38D00508 SEQ 979605303
CONNECT DATA 3041 MESG 3042 INDEX 3043
OPTIONS ;compress-fast;bsd-auth;index;


amandad: waiting for ack: timeout, retrying
amandad: got ack:

Amanda 2.4 ACK HANDLE 000-38D00508 SEQ 979605303


amandad: pid 31153 finish time Mon Jan 15 20:02:05 2001

There it had to retry to get the ack. But then in sendbackup.debug:

sendbackup: started index creator: "/sbin/restore -tvf - 2>&1 | sed -e '
s/^leaf[]*[0-9]*[   ]*\.//
t
/^dir[  ]/ {
s/^dir[ ]*[0-9]*[   ]*\.//
s%$%/%
t
}
d
'"
index tee cannot write [Broken pipe]

And I get [data timeout] as output of amstatus.

Ideas?

Thank you for your support. Let me know if any of these questions are
worth posting on the newsgroup. I don't post there, since I am not
sure if these are common knowledge questions or not, although I have
not found the answer anywhere.

-- 
Tony Magni
Department of Neurological Surgery
University Hospitals of Cleveland
(216)844-1306, [EMAIL PROTECTED], 
http://discordia.cwru.edu/tinton/  

 - "I know Cleveland: I spent a year there one night!"
  (Bill, San Francisco, 1994)