Re: failure "estimate of level x timed out"

2012-12-11 Thread Charles Stroom
Today I set the "estimate" parameter back to "client", as it always
had been, while etimeout is still at the new value of 14400.  The backup
failed again, even on 4 DLEs.

I had changed "estimate" back because, although the backup succeeded
yesterday, it showed the strange behaviour of leaving 200 MB to be
flushed while it had written only 15% in total to the 20 GB DDS4
backup tape, so there was ample space.

I am going back to "estimate calcsize" to see if it at least runs
correctly tomorrow.

Regards, Charles



On Tue, 11 Dec 2012 21:14:57 +0300
Alan Orth  wrote:

> Charles,
> 
> Ah, I was mistaken that the error was not fatal due to the fact that
> the summary email says "output size" (listing around 1TB of data over
> 6 hours of backup time!) and "these dumps were to tape xxx."
> 
> Well if they are indeed failures then the FAIL classification is
> right indeed.  Good to know!  I really need to investigate my
> estimate timeouts then...
> 
> Cheers,
> 
> Alan
> 
> On 12/11/2012 12:56 PM, Charles Stroom wrote:
> > The planner reports an "ERROR" when making the estimate, but then
> > later the dump itself FAILs as well. So no backup is made of that
> > particular DLE.
> >
> > Regards, Charles
> >
> >
> >
> > On Tue, 11 Dec 2012 09:04:46 +0300
> > Alan Orth  wrote:
> >
> >> Hi, All.
> >>
> >> It's good that you brought this up on the mailing list, I was just
> >> about to ask!  I've been having problems with estimation timeouts
> >> lately too, so I'll try some of these tips to fix it.
> >>
> >> What confused me initially was why estimation failures are
> >> classified as "FAIL"?  It's quite worrying when you wake up in the
> >> morning to find last night's backups have FAILED.  Shouldn't the
> >> classification be more similar to something like the STRANGE
> >> errors (where files have changed during backup, for example)?
> >>
> >> Cheers,
> >>
> >> Alan
> >>
> >> On 12/10/2012 06:05 PM, Charles Stroom wrote:
> >>> Hi, the forwarded email below was meant to go to the list, but I
> >>> noticed later it went to only 1 recipient.  Hence the forward.
> >>>
> >>> Regards, Charles
> >>>
> >>>
> >>>
> >>> Begin forwarded message:
> >>>
> >>> Date: Sat, 8 Dec 2012 22:08:48 +0100
> >>> From: Charles Stroom 
> >>> To: Jens Berg 
> >>> Subject: Re: failure "estimate of level x timed out"
> >>>
> >>>
> >>> So far, so good.  Since I have increased etimeout to 14400 AND set
> >>> "estimate calcsize" I have had 2 backups without failures. The
> >>> only thing I don't know yet is which parameter did the trick.  I
> >>> keep my fingers crossed.
> >>>
> >>> Thanks both of you.
> >>>
> >>> Charles
> >>>
> >>>
> >>>
> >>> On Thu, 06 Dec 2012 09:49:28 +0100
> >>> Jens Berg  wrote:
> >>>
> >>>> I would suggest increasing etimeout to a much bigger value,
> >>>> let's say 14400 or so, and see if the estimates finish at all
> >>>> then. If they still fail, I would take a closer look at the
> >>>> health of the hard discs... Another option could be to change
> >>>> the estimate method for the dump type you are using, e.g. if you
> >>>> are using "dumptype user-tar" for the DLEs, put an "estimate
> >>>> calcsize" in the definition of "dumptype user-tar". The results
> >>>> of that estimate method will be less accurate than the ones from
> >>>> the default method, but it executes faster.
> >>>>
> >>>> Best
> >>>> Jens
> 
> >> -- 
> >> Alan Orth
> >> alan.o...@gmail.com
> >> http://alaninkenya.org
> >> http://mjanja.co.ke
> >> "I have always wished for my computer to be as easy to use as my
> >> telephone; my wish has come true because I can no longer figure out
> >> how to use my telephone." -Bjarne Stroustrup, inventor of C++
> >>
> >
> 


-- 
Charles Stroom
email: charles at no-spam.stremen.xs4all.nl (remove the "no-spam.")
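
As a minimal sketch of the two settings discussed in this thread,
assuming a typical amanda.conf layout: the fragment below raises
etimeout to 14400 and adds "estimate calcsize" to the "dumptype
user-tar" definition that Jens used as an example.  The program line
is an assumption for illustration and is not taken from the thread;
any other lines already in your own dumptype definition would stay
as they are.

```
# amanda.conf (sketch; values taken from the thread)
etimeout 14400          # seconds the planner waits for estimates

define dumptype user-tar {
    program "GNUTAR"    # assumption: stands in for the existing definition
    estimate calcsize   # faster, but less accurate than the default method
}
```

Since the thread leaves open which of the two settings helped,
changing one at a time makes the effect easier to attribute.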


amrecover fails if DLE was compressed

2012-12-11 Thread Debra S Baddorf
Hi all.
I've just put amanda 2.6.1p2 and my existing (and long working) config files
onto a new machine, and tested that it worked for both backup and recover.

Then I uninstalled 2.6.1p2 and installed amanda 3.3.2.
Now I get a broken pipe in the amrecover log, exactly after I answer the
"set owner/mode?" question, and the amrecover window hangs
until I control-C out of it.

This happens only if the DLE includes compression.  If I turn off compression
and redo the backups, a recover will succeed, even a recover which
involved 2 tapes.

versions:
tar (GNU tar) 1.23
gzip 1.3.12

Any idea what the problem is?
Deb

Here are the amrecover and amandad logs,  since they contain errors.
I have other logs too if you need them, but I don't see any complaints in them.
===
amrecover.debug

Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: pid 29473 ruid 0 euid 0 
version 3.3.2: start at Tue Dec 11 13:11:11 2012
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: config_overrides: conf daily
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: pid 29473 ruid 0 euid 0 
version 3.3.2: rename at Tue Dec 11 13:11:11 2012
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: security_getdriver(name=bsd) 
returns 0x7fc6596bd2e0
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: 
security_handleinit(handle=0xfd0260, driver=0x7fc6596bd2e0 (BSD))
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: dgram_bind: setting up a 
socket with family 2
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: bind_portrange2: Skip port 
848: Owned by gdoi.
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: bind_portrange2: Try  port 
849: Available - Success
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: dgram_bind: socket 3 bound 
to 0.0.0.0:849
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: 
dgram_send_addr(addr=0xfd02a0, dgram=0x7fc6596c9da8)
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: (sockaddr_in *)0xfd02a0 = { 
2, 10080, 131.225.121.103 }
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: dgram_send_addr: 
0x7fc6596c9da8->socket = 3
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: 
dgram_recv(dgram=0x7fc6596c9da8, timeout=0, fromaddr=0x7fc6596d9da0)
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: (sockaddr_in 
*)0x7fc6596d9da0 = { 2, 10080, 131.225.121.103 }
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: 
dgram_recv(dgram=0x7fc6596c9da8, timeout=0, fromaddr=0x7fc6596d9da0)
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: (sockaddr_in 
*)0x7fc6596d9da0 = { 2, 10080, 131.225.121.103 }
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: 
dgram_send_addr(addr=0xfd02a0, dgram=0x7fc6596c9da8)
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: (sockaddr_in *)0xfd02a0 = { 
2, 10080, 131.225.121.103 }
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: dgram_send_addr: 
0x7fc6596c9da8->socket = 3
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: 
security_streaminit(stream=0xfd7840, driver=0x7fc6596bd2e0 (BSD))
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: make_socket opening socket 
with family 2
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: connect_port: Try  port 
5: available - Success
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: connected to 
131.225.121.103:50006
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: our side is 0.0.0.0:5
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: try_socksize: send buffer 
size is 65536
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: try_socksize: receive buffer 
size is 65536
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: 
security_close(handle=0xfd0260, driver=0x7fc6596bd2e0 (BSD))
Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: sending: FEATURES 
9efefbff1f


Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: sending: DATE 2012-12-11


Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: sending: SCNF daily


Tue Dec 11 13:11:11 2012: thd-0xfc4490: amrecover: sending: HOST mynode.fqdn


Tue Dec 11 13:11:19 2012: thd-0xfc4490: amrecover: user command: 'setdate 
2012-12-06'
Tue Dec 11 13:11:19 2012: thd-0xfc4490: amrecover: sending: DATE 2012-12-06


Tue Dec 11 13:11:24 2012: thd-0xfc4490: amrecover: user command: 'setdisk /var'
Tue Dec 11 13:11:24 2012: thd-0xfc4490: amrecover: sending: DISK /var


Tue Dec 11 13:11:24 2012: thd-0xfc4490: amrecover: sending: OISD /


Tue Dec 11 13:11:24 2012: thd-0xfc4490: amrecover: sending: OLSD /


Tue Dec 11 13:11:24 2012: thd-0xfc4490: amrecover: add_dir_list_item: Adding 
"2012-12-06-12-56-36" "0" "adUXdaily-0daily-827test:3" "3" "/."
Tue Dec 11 13:11:24 2012: thd-0xfc4490: amrecover: add_dir_list_item: Adding 
"2012-12-06-12-56-36" "0" "adUXdaily-0daily-827test:3" "3" "/account/"
Tue Dec 11 13:11:24 2012: thd-0xfc4490: amrecover: add_dir_list_item: Adding 
"2012-12-06-12-56-36" "0" "adUXdaily-0daily-827test:3" "3" "/adm/"
Tue Dec 11 13:11:24 2012: thd-0xfc4490: amrecover: add_dir_list_item: Adding 
"2012-12-06-12-56-36" "0" "adUXdaily-0daily-827test:3" "3" "/cach

Re: failure "estimate of level x timed out"

2012-12-11 Thread Alan Orth

Charles,

Ah, I was mistaken that the error was not fatal due to the fact that the 
summary email says "output size" (listing around 1TB of data over 6 
hours of backup time!) and "these dumps were to tape xxx."


Well if they are indeed failures then the FAIL classification is right 
indeed.  Good to know!  I really need to investigate my estimate 
timeouts then...


Cheers,

Alan

On 12/11/2012 12:56 PM, Charles Stroom wrote:

The planner reports an "ERROR" when making the estimate, but then later the
dump itself FAILs as well. So no backup is made of that particular DLE.

Regards, Charles



On Tue, 11 Dec 2012 09:04:46 +0300
Alan Orth  wrote:


Hi, All.

It's good that you brought this up on the mailing list, I was just
about to ask!  I've been having problems with estimation timeouts
lately too, so I'll try some of these tips to fix it.

What confused me initially was why estimation failures are classified
as "FAIL"?  It's quite worrying when you wake up in the morning to
find last night's backups have FAILED.  Shouldn't the classification
be more similar to something like the STRANGE errors (where files
have changed during backup, for example)?

Cheers,

Alan

On 12/10/2012 06:05 PM, Charles Stroom wrote:

Hi, the forwarded email below was meant to go to the list, but I
noticed later it went to only 1 recipient.  Hence the forward.

Regards, Charles



Begin forwarded message:

Date: Sat, 8 Dec 2012 22:08:48 +0100
From: Charles Stroom 
To: Jens Berg 
Subject: Re: failure "estimate of level x timed out"


So far, so good.  Since I have increased etimeout to 14400 AND set
"estimate calcsize" I have had 2 backups without failures. The only
thing I don't know yet is which parameter did the trick.  I keep my
fingers crossed.

Thanks both of you.

Charles



On Thu, 06 Dec 2012 09:49:28 +0100
Jens Berg  wrote:


I would suggest increasing etimeout to a much bigger value, let's
say 14400 or so, and see if the estimates finish at all then. If
they still fail, I would take a closer look at the health of the
hard discs... Another option could be to change the estimate
method for the dump type you are using, e.g. if you are using
"dumptype user-tar" for the DLEs, put an "estimate calcsize" in
the definition of "dumptype user-tar". The results of that
estimate method will be less accurate than the ones from the
default method, but it executes faster.

Best
Jens


--
Alan Orth
alan.o...@gmail.com
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my
telephone; my wish has come true because I can no longer figure out
how to use my telephone." -Bjarne Stroustrup, inventor of C++






--
Alan Orth
alan.o...@gmail.com
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my telephone; my wish 
has come true because I can no longer figure out how to use my telephone." -Bjarne 
Stroustrup, inventor of C++



Re: failure "estimate of level x timed out"

2012-12-11 Thread Charles Stroom
The planner reports an "ERROR" when making the estimate, but then later the
dump itself FAILs as well. So no backup is made of that particular DLE.

Regards, Charles



On Tue, 11 Dec 2012 09:04:46 +0300
Alan Orth  wrote:

> Hi, All.
> 
> It's good that you brought this up on the mailing list, I was just
> about to ask!  I've been having problems with estimation timeouts
> lately too, so I'll try some of these tips to fix it.
> 
> What confused me initially was why estimation failures are classified
> as "FAIL"?  It's quite worrying when you wake up in the morning to
> find last night's backups have FAILED.  Shouldn't the classification
> be more similar to something like the STRANGE errors (where files
> have changed during backup, for example)?
> 
> Cheers,
> 
> Alan
> 
> On 12/10/2012 06:05 PM, Charles Stroom wrote:
> > Hi, the forwarded email below was meant to go to the list, but I
> > noticed later it went to only 1 recipient.  Hence the forward.
> >
> > Regards, Charles
> >
> >
> >
> > Begin forwarded message:
> >
> > Date: Sat, 8 Dec 2012 22:08:48 +0100
> > From: Charles Stroom 
> > To: Jens Berg 
> > Subject: Re: failure "estimate of level x timed out"
> >
> >
> > So far, so good.  Since I have increased etimeout to 14400 AND set
> > "estimate calcsize" I have had 2 backups without failures. The only
> > thing I don't know yet is which parameter did the trick.  I keep my
> > fingers crossed.
> >
> > Thanks both of you.
> >
> > Charles
> >
> >
> >
> > On Thu, 06 Dec 2012 09:49:28 +0100
> > Jens Berg  wrote:
> >
> >> I would suggest increasing etimeout to a much bigger value, let's
> >> say 14400 or so, and see if the estimates finish at all then. If
> >> they still fail, I would take a closer look at the health of the
> >> hard discs... Another option could be to change the estimate
> >> method for the dump type you are using, e.g. if you are using
> >> "dumptype user-tar" for the DLEs, put an "estimate calcsize" in
> >> the definition of "dumptype user-tar". The results of that
> >> estimate method will be less accurate than the ones from the
> >> default method, but it executes faster.
> >>
> >> Best
> >> Jens
> >>
> 
> -- 
> Alan Orth
> alan.o...@gmail.com
> http://alaninkenya.org
> http://mjanja.co.ke
> "I have always wished for my computer to be as easy to use as my
> telephone; my wish has come true because I can no longer figure out
> how to use my telephone." -Bjarne Stroustrup, inventor of C++
> 


-- 
Charles Stroom
email: charles at no-spam.stremen.xs4all.nl (remove the "no-spam.")