Re: intermittent amanda failure

2010-03-09 Thread Steve Wray

Gene Heskett wrote:

On Tuesday 09 March 2010, Dustin J. Mitchell wrote:

On Wed, Mar 3, 2010 at 1:58 PM, Steve Wray steve.w...@cwa.co.nz wrote:

Right, so the LATEST most up-to-date version of Debian uses a 3 year old
version of amanda. Fantastic, thanks Debian for keeping things so
'stable'.

To be fair, that's exactly the intent, and maintaining a Linux
distribution is *not* easy.  All of the binary-only distros are
behind the times to varying degrees, although Debian is usually
bringing up the rear of the bunch.


I downloaded the actual latest stable version of amanda (2.6.1p2 from
November 2009), compiled it and tested it.

No bug.

Yay!


Thanks, Debian package maintainer. Not.

[snip]

If there are Amanda bugs that are holding back a version bump, please
let me know.  At the moment, I only see two open bugs, one from 2006
and one from 2008, neither of which is blocking a bump.

 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=500364
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=370319

Dustin


I'm on your side here Dustin.  The distros, debian in particular need 
prodding.  What you use for a prod is up to you. :-)


The problem I have with submitting bug reports to Debian on this sort of 
thing is this:


If the bug is not security related then it's extremely unlikely to be fixed 
until the NEXT stable release.


In Debian, stability means making sure that non-security bugs are 
maintained throughout the lifetime of the release. The bugs are *part* of 
the stability. The theory is that people may have implemented workarounds 
for these bugs. If you go fixing the bug then you break their workaround.


Due to this (and other ongoing concerns with the 'stability' 
problems of Debian), I will not be using the next stable release.


So why would I bother to point out to them that, hey, maybe when you 
release the next Debian version you should use the current version of 
Amanda, if I am not going to be using that version? This would be purely 
altruistic... and if Debian can't figure this out for themselves, well... 
to be frank, I have no time for that.



I go back 11+ years with amanda, usually running the bleeding edge as now.  
Considering that I build the new snapshot and use it nightly several times a 
week, the number of real bugs has been almost vanishingly small even when its 
labeled as alpha, not for production use.  FWIW, 90% of those were tar's 
fault, not amanda's.  There are several tar versions about, not all of which 
are even compatible with themselves.  Amanda is compatible with itself with 
one exception, a format change a good 8 or 9 years ago.  Folks like Dustin 
and Jean-Louis write tight, and correct code.  I mentally salute them as I 
toss last nights printout on top of the stack (should, heaven forbid, I need 
to consult it) every morning.


That reminds me; in one release of Debian the version of tar and of amanda 
were incompatible! It was the 'tar gives exit status 1 if a file changed 
while being read' problem IIRC.


This was NEVER fixed in that 'stable' release. I should have seen the 
writing on the wall, really.




--
Please remember that an email is just like a postcard; it is not 
confidential nor private nor secure and can be read by many other people 
than the intended recipient. A postcard can be read by anyone at the mail 
sorting office and expecting what is written on it to be private and secret 
is not realistic. Please hold no higher expectation of email.


If you need to send confidential information in an email you need to use 
encryption. PGP is Pretty good for this.


Re: intermittent amanda failure

2010-03-09 Thread Gene Heskett
On Tuesday 09 March 2010, Steve Wray wrote:
Gene Heskett wrote:
 On Tuesday 09 March 2010, Dustin J. Mitchell wrote:
 On Wed, Mar 3, 2010 at 1:58 PM, Steve Wray steve.w...@cwa.co.nz wrote:
 Right, so the LATEST most up-to-date version of Debian uses a 3 year
 old version of amanda. Fantastic, thanks Debian for keeping things so
 'stable'.

 To be fair, that's exactly the intent, and maintaining a Linux
 distribution is *not* easy.  All of the binary-only distros are
 behind the times to varying degrees, although Debian is usually
 bringing up the rear of the bunch.

 I downloaded the actual latest stable version of amanda (2.6.1p2 from
 November 2009), compiled it and tested it.

 No bug.

 Yay!

 Thanks, Debian package maintainer. Not.

[snip]

 If there are Amanda bugs that are holding back a version bump, please
 let me know.  At the moment, I only see two open bugs, one from 2006
 and one from 2008, neither of which is blocking a bump.

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=500364
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=370319

 Dustin

 I'm on your side here Dustin.  The distros, debian in particular need
 prodding.  What you use for a prod is up to you. :-)

The problem I have with submitting bug reports to Debian on this sort of
thing is this:

If the bug is not security related then its extremely unlikely to be fixed
until the NEXT stable release.

In Debian, stability means making sure that non-security bugs are
maintained throughout the lifetime of the release. The bugs are *part* of
the stability.

What?  Crazy theory, but probably correct, in the debian camp that is.

The theory is that people may have implemented workarounds
for these bugs. If you go fixing the bug then you break their workaround.

Since, due to this (and other ongoing concerns with the 'stability'
problems of Debian), I will not be using the next stable release.

I probably will.  For one thing, the 6.06 release and the accompanying 
releases of emc, which run my milling machine, haven't been getting all the 
loving that the 8.04 LTS version is getting, and at some point I'll have to 
update just so I can use the newer versions of EMC.  OTOH, so far it is doing 
everything I ask it to do, soo.

So why would I bother to point out to them that, hey, maybe when you
release the next Debian version you use the current version of Amanda? if
I am not going to be using that version? This would be purely altruistic...
and if Debian can't figure this out for themselves well... to be frank, I
have no time for that.

Can't say as I blame you.  Stone walls aren't much fun to talk to.

 I go back 11+ years with amanda, usually running the bleeding edge as
 now. Considering that I build the new snapshot and use it nightly several
 times a week, the number of real bugs has been almost vanishingly small
 even when its labeled as alpha, not for production use.  FWIW, 90% of
 those were tar's fault, not amanda's.  There are several tar versions
 about, not all of which are even compatible with themselves.  Amanda is
 compatible with itself with one exception, a format change a good 8 or 9
 years ago.  Folks like Dustin and Jean-Louis write tight, and correct
 code.  I mentally salute them as I toss last nights printout on top of
 the stack (should, heaven forbid, I need to consult it) every morning.

That reminds me; in one release of Debian the version of tar and of amanda
were incompatible! It was the 'tar gives exit status 1 if a file changed
while being read' problem IIRC.

This was NEVER fixed in that 'stable' release. I should have seen the
writing on the wall, really.

Yeah, we had to jump a tar version there, in addition to beating the tar 
folks about the brow, which I did at the time.  They said they weren't gonna 
fix it, but then the crowd roar got to them and they did eventually. ;-)

-- 
Cheers Dustin, Gene
There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)

For most men life is a search for the proper manila envelope in which to
get themselves filed.
-- Clifton Fadiman


Re: intermittent amanda failure

2010-03-08 Thread Dustin J. Mitchell
On Wed, Mar 3, 2010 at 1:58 PM, Steve Wray steve.w...@cwa.co.nz wrote:
 Right, so the LATEST most up-to-date version of Debian uses a 3 year old
 version of amanda. Fantastic, thanks Debian for keeping things so 'stable'.

To be fair, that's exactly the intent, and maintaining a Linux
distribution is *not* easy.  All of the binary-only distros are
behind the times to varying degrees, although Debian is usually
bringing up the rear of the bunch.

 I downloaded the actual latest stable version of amanda (2.6.1p2 from
 November 2009), compiled it and tested it.

 No bug.

Yay!

 Thanks, Debian package maintainer. Not.

 Backup software is mission critical. Failing to track the upstream to this
 extent is simply unforgivable. I'm revising my opinion of Debian.

I hear from a *lot* of folks on #amanda in exactly the same situation as you.

Please do consider contacting the maintainer, or perhaps other Debian
maintainers that might be able to poke the maintainer more
effectively.  I, as an upstream developer, don't have much impact on
distro maintainers - apparently "why don't you ship the latest
release?!" is a common refrain from upstreams.  Distros aren't
democracies, but they do listen to their users, and if enough people
are asking "why hasn't Amanda been bumped in 3 years?" then someone
with commit access will step up to take care of it.

If there are Amanda bugs that are holding back a version bump, please
let me know.  At the moment, I only see two open bugs, one from 2006
and one from 2008, neither of which is blocking a bump.

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=500364
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=370319

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: intermittent amanda failure

2010-03-08 Thread Gene Heskett
On Tuesday 09 March 2010, Dustin J. Mitchell wrote:
On Wed, Mar 3, 2010 at 1:58 PM, Steve Wray steve.w...@cwa.co.nz wrote:
 Right, so the LATEST most up-to-date version of Debian uses a 3 year old
 version of amanda. Fantastic, thanks Debian for keeping things so
 'stable'.

To be fair, that's exactly the intent, and maintaining a Linux
distribution is *not* easy.  All of the binary-only distros are
behind the times to varying degrees, although Debian is usually
bringing up the rear of the bunch.

 I downloaded the actual latest stable version of amanda (2.6.1p2 from
 November 2009), compiled it and tested it.

 No bug.

Yay!

 Thanks, Debian package maintainer. Not.

 Backup software is mission critical. Failing to track the upstream to
 this extent is simply unforgivable. I'm revising my opinion of Debian.

I hear from a *lot* of folks on #amanda in exactly the same situation as
 you.

Amanda is not one of the distros' favorite applications, having its own 
security model that has been at odds with the constraints of the packaging 
systems. Rpm in particular broke it regularly, and I long ago gave up helping 
the rpm folks who were determined to bend amanda to suit them.

The tarball, OTOH, lends itself to the enterprising bash script writer, who 
can then install and test check the latest version of amanda in about 3, 
maybe 4 minutes on a fast machine using ccache.  In fact I just installed the 
20100308 snapshot of amanda-3.2alpha.  And I used the same pair of scripts 
that has been installing amanda for me since about 2.5.1.

Please do consider contacting the maintainer, or perhaps other Debian
maintainers that might be able to poke the maintainer more
effectively.  I, as an upstream developer, don't have much impact on
distro maintainers - apparently why don't you ship the latest
release?! is a common refrain from upstreams.  Distros aren't
democracies, but they do listen to their users, and if enough people
are asking why hasn't Amanda been bumped in 3 years? then someone
with commit access will step up to take care of it.

Or you could build the tarball, which very nicely auto-configures amanda to 
run optimally on _your_ system.  You have to set up a couple of files, maybe 
3, and then just let the crontab of the amanda user take over from there.  If 
configured to send the operator an email, you can 'read all about it' the 
next morning with your first cuppa.  What's not to like?

My scripts are available, just ask.  They are not 'big' scripts either.

If there are Amanda bugs that are holding back a version bump, please
let me know.  At the moment, I only see two open bugs, one from 2006
and one from 2008, neither of which is blocking a bump.

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=500364
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=370319

Dustin

I'm on your side here, Dustin.  The distros, Debian in particular, need 
prodding.  What you use for a prod is up to you. :-)

I go back 11+ years with amanda, usually running the bleeding edge as now.  
Considering that I build the new snapshot and use it nightly several times a 
week, the number of real bugs has been almost vanishingly small even when it's 
labeled as alpha, not for production use.  FWIW, 90% of those were tar's 
fault, not amanda's.  There are several tar versions about, not all of which 
are even compatible with themselves.  Amanda is compatible with itself with 
one exception, a format change a good 8 or 9 years ago.  Folks like Dustin 
and Jean-Louis write tight and correct code.  I mentally salute them as I 
toss last night's printout on top of the stack (should, heaven forbid, I need 
to consult it) every morning.

-- 
Cheers, Gene
There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)

Visits always give pleasure: if not on arrival, then on the departure.
-- Edouard Le Berquier, Pensees des Autres


Re: intermittent amanda failure

2010-03-03 Thread Steve Wray

I'd like to put in an update to this thread, for anyone interested.

The backup server had been running Debian Etch. The version of amanda on 
Etch was giving the errors described in this thread.


I upgraded the server to Debian Lenny. The problems still occurred with the 
version in Lenny.


Then I noticed that the version of amanda in Lenny dates from 2007. It 
gives its version as:


1:2.5.2p1-4

When I look at the amanda.org download page and compare version numbers I 
see this:


2.5.2p1 June 6 2007


Right, so the LATEST most up-to-date version of Debian uses a 3 year old 
version of amanda. Fantastic, thanks Debian for keeping things so 'stable'.


I downloaded the actual latest stable version of amanda (2.6.1p2 from 
November 2009), compiled it and tested it.


No bug.

Thanks, Debian package maintainer. Not.

Backup software is mission critical. Failing to track the upstream to this 
extent is simply unforgivable. I'm revising my opinion of Debian.
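
The version comparison above can be scripted. The following is a minimal Python sketch, not anything shipped with Amanda or Debian: it assumes the usual Debian version layout of [epoch:]upstream[-revision], and the function names are made up for illustration. The dates are the ones quoted in this message.

```python
from datetime import date


def split_debian_version(version):
    """Split a Debian version string of the form [epoch:]upstream[-revision]."""
    epoch, sep, rest = version.partition(":")
    if not sep:  # no epoch present; Debian treats that as epoch 0
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:  # no Debian revision present
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def age_in_years(released, today):
    """Approximate age in years between two dates."""
    return (today - released).days / 365.25


# Package version reported on Lenny, and the upstream release date for 2.5.2p1.
epoch, upstream, revision = split_debian_version("1:2.5.2p1-4")
age = age_in_years(date(2007, 6, 6), date(2010, 3, 3))
print(f"upstream {upstream}, Debian revision {revision}, about {age:.1f} years old")
```

The epoch and revision are Debian packaging details; only the upstream part is comparable with the version numbers on the amanda.org download page.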





Steve Wray wrote:

Dustin J. Mitchell wrote:

On Thu, Jan 21, 2010 at 5:05 PM, Jean-Louis Martineau
martin...@zmanda.com wrote:
xinetd is still configured to accept a tcp connection, but amandad expect a
udp packet, so amandad do nothing and the server fail while waiting for an
ACK.


Right - it was the failure I expected to see, not Steve's bug.


I'm not able to replicate Steve bug, many fix have been committed since
2.5.0p2.


Good point. By my back-of-the-envelope calculations, we've fixed
something like 500 bugs in the 4 years since 2.5.0 was released.  Of
course, we *introduced* some of of those bugs in those 4 years, too :)

Steve: if you don't see something obvious that I missed in my
replication effort, can you give this a try with a 2.6.1p2 server?


I'm not able to make this test yet, perhaps later in the week.

However, I've discovered that the amcheck problem may not actually 
reflect a genuine problem that affects backup.


The other day, the cronjob amcheck reported these errors -- so for one 
thing its clearly intermittent.


But the thing is that the backup job that ran that very night completed 
with no problems at all.








Re: intermittent amanda failure

2010-01-25 Thread Steve Wray

Dustin J. Mitchell wrote:

On Thu, Jan 21, 2010 at 5:05 PM, Jean-Louis Martineau
martin...@zmanda.com wrote:

xinetd is still configured to accept a tcp connection, but amandad expect a
udp packet, so amandad do nothing and the server fail while waiting for an
ACK.


Right - it was the failure I expected to see, not Steve's bug.


I'm not able to replicate Steve bug, many fix have been committed since
2.5.0p2.


Good point. By my back-of-the-envelope calculations, we've fixed
something like 500 bugs in the 4 years since 2.5.0 was released.  Of
course, we *introduced* some of of those bugs in those 4 years, too :)

Steve: if you don't see something obvious that I missed in my
replication effort, can you give this a try with a 2.6.1p2 server?


I'm not able to make this test yet, perhaps later in the week.

However, I've discovered that the amcheck problem may not actually reflect 
a genuine problem that affects backup.


The other day, the cronjob amcheck reported these errors -- so for one 
thing it's clearly intermittent.


But the thing is that the backup job that ran that very night completed 
with no problems at all.





Re: intermittent amanda failure

2010-01-21 Thread Dustin J. Mitchell
On Wed, Jan 20, 2010 at 5:43 PM, Steve Wray steve.w...@cwa.co.nz wrote:
 The problem I had before was, to reiterate:

 Disklist with two entries.

 One entry uses bsdtcp

 The other entry uses BSD.

 If the client of the disklist entry that is configured on the server to use
 bsdtcp is not configured to use bsdtcp on the CLIENT in /etc/inetd.conf then
 the client which is configured to use default authentication returns errors.

OK, this seems to be the simplest formation of this problem of errors
with bsdtcp bleeding over into other, non-bsdtcp hosts.  I tried to
replicate it as follows.

Disklist:

euclid /etc {
sysconfig-part
auth bsdtcp
}

knuth /etc {
sysconfig-part
auth bsd
}

This works fine when both hosts are properly configured:
Client check: 2 hosts checked in 1.130 seconds.  0 problems found.

If I restart euclid's inetd config to run it with the wrong -auth
parameter, then amcheck says:
WARNING: euclid: selfcheck request failed: timeout waiting for ACK
Client check: 2 hosts checked in 30.030 seconds.  1 problem found.

so it doesn't give me any problems with knuth.  If I stop xinetd on
euclid (functionally equivalent to configuring it for bsd auth, since
in bsd auth it would be listening on UDP port 10080, not TCP port
10080), the result is about the same:
WARNING: euclid: selfcheck request failed: Connection refused
Client check: 2 hosts checked in 10.047 seconds.  1 problem found.

Am I missing something in the replication recipe?

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com
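
The two failure modes in the replication above (timeout waiting for ACK versus Connection refused) can be told apart from the server side with a plain TCP connect. This is a rough Python sketch under stated assumptions: the helper name probe_bsdtcp is hypothetical, port 10080 is the one mentioned in this message, and an open port only shows that something is listening on TCP, not that amandad is running with the right -auth parameter.

```python
import socket

AMANDA_PORT = 10080  # port named in the thread (UDP for bsd, TCP for bsdtcp)


def probe_bsdtcp(host, port=AMANDA_PORT, timeout=5.0):
    """Try a TCP connect and report 'open', 'refused', 'timeout' or an error.

    'refused' is what one would expect when nothing listens on the TCP port,
    for example when the client is configured for UDP-based bsd auth or when
    (x)inetd is stopped.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:  # DNS failure, unreachable network, and so on
        return f"error: {exc}"


# Example use (host names taken from the disklist above):
# for host in ("euclid", "knuth"):
#     print(host, probe_bsdtcp(host))
```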


Re: intermittent amanda failure

2010-01-21 Thread Jean-Louis Martineau

Dustin J. Mitchell wrote:

If I restart euclid's inetd config to run it with the wrong -auth
parameter, then amcheck says:
WARNING: euclid: selfcheck request failed: timeout waiting for ACK
Client check: 2 hosts checked in 30.030 seconds.  1 problem found.
  
xinetd is still configured to accept a TCP connection, but amandad 
expects a UDP packet, so amandad does nothing and the server fails while 
waiting for an ACK.


You must also change the following in xinetd:
   socket_type = dgram
   protocol    = udp
   wait        = yes



I'm not able to replicate Steve's bug; many fixes have been committed since 
2.5.0p2.


Jean-Louis


Re: intermittent amanda failure

2010-01-21 Thread Dustin J. Mitchell
On Thu, Jan 21, 2010 at 5:05 PM, Jean-Louis Martineau
martin...@zmanda.com wrote:
 xinetd is still configured to accept a tcp connection, but amandad expect a
 udp packet, so amandad do nothing and the server fail while waiting for an
 ACK.

Right - it was the failure I expected to see, not Steve's bug.

 I'm not able to replicate Steve bug, many fix have been committed since
 2.5.0p2.

Good point. By my back-of-the-envelope calculations, we've fixed
something like 500 bugs in the 4 years since 2.5.0 was released.  Of
course, we *introduced* some of those bugs in those 4 years, too :)

Steve: if you don't see something obvious that I missed in my
replication effort, can you give this a try with a 2.6.1p2 server?

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: intermittent amanda failure

2010-01-20 Thread Steve Wray
I am going to try and resurrect this thread having been able to home in on 
the apparent bug I found with bsdtcp auth.


I have now converted most of our systems to using bsdtcp and amcheck is 
showing no errors at this time.


The problem I had before was, to reiterate:

Disklist with two entries.

One entry uses bsdtcp

The other entry uses BSD.

If the client of the disklist entry that is configured on the server to use 
bsdtcp is not configured to use bsdtcp on the CLIENT in /etc/inetd.conf 
then the client which is configured to use default authentication returns 
errors.


I now have a full disklist for all of our servers. Some are using bsdtcp 
others are using the default (which as I understand it is BSD).


If I go to one of the clients whose disklist entry on the server is for 
bsdtcp and I change its /etc/inetd.conf from this:


amanda stream tcp nowait backup /usr/lib/amanda/amandad amandad -auth=bsdtcp amdump


to this:

amanda dgram udp wait backup /usr/sbin/tcpd /usr/lib/amanda/amandad

then ALL other disklist entries which are NOT using bsdtcp return errors on 
amcheck -c


Disklist entries which are configured to use bsdtcp are not affected and do 
not generate errors, only those using BSD auth.


This is just a change in inetd.conf on ONE client.

The errors for the other disklist entries are:

ERROR: an amanda client: [access as backup not allowed from root@the amanda server] amandahostsauth failed



I have no idea if this is an amanda problem or a Debian packaging problem 
or what it is. But it's repeatable.
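
To check many clients for this kind of mismatch, the inetd.conf line can be compared against the auth set in the server's disklist. The following is a minimal Python sketch, not part of Amanda: the function names are hypothetical, it assumes the standard inetd.conf field order (service, socket type, protocol, wait flag, user, server program, arguments), and it assumes that a line without an -auth= argument means the default bsd auth.

```python
# Transport each auth is expected to use, as described in this thread:
# bsd and bsdudp over UDP datagrams, bsdtcp over a TCP stream.
EXPECTED_TRANSPORT = {
    "bsd": ("dgram", "udp"),
    "bsdudp": ("dgram", "udp"),
    "bsdtcp": ("stream", "tcp"),
}


def parse_inetd_line(line):
    """Parse one inetd.conf service line into a dict of its fields."""
    fields = line.split()
    service, socket_type, protocol, wait, user, server = fields[:6]
    auth = "bsd"  # assumption: no -auth= argument means the default
    for arg in fields[6:]:
        if arg.startswith("-auth="):
            auth = arg.split("=", 1)[1]
    return {
        "service": service,
        "socket_type": socket_type,
        "protocol": protocol,
        "wait": wait,
        "user": user,
        "server": server,
        "auth": auth,
    }


def client_matches_server(line, server_auth):
    """True if the client line uses the auth and transport the server expects."""
    entry = parse_inetd_line(line)
    transport = (entry["socket_type"], entry["protocol"])
    return entry["auth"] == server_auth and transport == EXPECTED_TRANSPORT.get(server_auth)


tcp_line = ("amanda stream tcp nowait backup /usr/lib/amanda/amandad "
            "amandad -auth=bsdtcp amdump")
udp_line = "amanda dgram udp wait backup /usr/sbin/tcpd /usr/lib/amanda/amandad"

print(client_matches_server(tcp_line, "bsdtcp"))
print(client_matches_server(udp_line, "bsdtcp"))
```

The second call models the situation described above: the server's disklist says bsdtcp while the client's inetd.conf is set up for the default auth.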



Version info for the amanda-server package on the amanda server is:

r...@fileserver:~# dpkg -s amanda-server
Package: amanda-server
Status: install ok installed
Priority: optional
Section: utils
Installed-Size: 1216
Maintainer: Bdale Garbee bd...@gag.com
Architecture: i386
Source: amanda
Version: 1:2.5.2p1-5



Steve Wray wrote:

Steve Wray wrote:

Jean-Louis Martineau wrote:
Run 'amadmin CONFIG disklist' and check the auth is set as expected 
for all dles.


I've done this, with the amanda.conf having bsdudp and with it having 
bsdtcp for that entry.


In both cases all auth entries for all other DLE's are 'BSD'.

In both cases only that one DLE is reported as having either bsdtcp or 
bsdudp, in both cases matching what is in the amanda.conf


So I'd say that was all as expected.



Having changed this one DLE to bsdudp I'm still seeing intermittent 
problems with the nightly backup run.


I'd like to change to bsdtcp as it seems more robust but as I've 
mentioned there were some serious problems with that.


Any further advice or help would be appreciated.

Thanks.








Jean-Louis

Steve Wray wrote:

Jean-Louis Martineau wrote:

Steve Wray wrote:

Jean-Louis Martineau wrote:

Steve Wray wrote:


On the client, in the sendbackup.20100106012630.debug log I see:

sendbackup-gnutar: time 0.056: /usr/lib/amanda/runtar: pid 3348
sendbackup: time 0.057: started backup
sendbackup: time 90.352: index tee cannot write [Broken pipe]
sendbackup: time 90.352: pid 3346 finish time Wed Jan  6 01:28:01 2010

90 seconds, it's not a dtimeout issue.

Post all debug files for the run.
You can also try the bsdtcp auth, it is more firewall friendly.


Ah hang on, am I right in understanding that you can't have just 
one dle using bsdtcp auth? That they would all have to have it? 
(ie the inetd configuration)

All dles for a client must have the same auth.
different client can have different auth.


We are going around in circles a little here.

Allow me to try to make things very clear.

In my amanda.conf I have a dumptype defined as such:

(I've included the parent dumptypes. The 'global' dumptype is empty)

define dumptype root-tar {
    global
    program "GNUTAR"
    comment "root partitions dumped with tar"
    compress none
    index
    exclude list "/etc/amanda/exclude.gtar"
    priority low
}

define dumptype nocomp-root-tar {
    root-tar
    comment "Root partitions without compression"
    compress none
}

define dumptype problem-nocomp-root-tar {
    nocomp-root-tar
    comment "Root partitions without compression, problem client"
    compress none
    auth "bsdudp"
    #auth "bsdtcp"
}

There are several DLEs for clients using the 'nocomp-root-tar' 
dumptype and only *one* DLE for *one* client using the 
'problem-nocomp-root-tar' dumptype.


With the bsdudp line uncommented everything is happy with an amcheck.

With the bsdtcp line uncommented (and the bsdudp line commented out) 
*no* client is happy with the amcheck *other* than the client which 
uses 'problem-nocomp-root-tar'. However, as noted in another email 
this is intermittent, sometimes some clients using nocomp-root-tar 
are happy. So far its not exhibiting much pattern that I can see.


The above *does* include the fact that I *do* change the inetd.conf 
on the client which uses problem-nocomp-root-tar *and* restart inetd.


So, with a change of one line in a dumptype in a DLE used by one 
client, all other clients have problems.


Re: intermittent amanda failure

2010-01-11 Thread Steve Wray

Steve Wray wrote:

Jean-Louis Martineau wrote:
Run 'amadmin CONFIG disklist' and check the auth is set as expected 
for all dles.


I've done this, with the amanda.conf having bsdudp and with it having 
bsdtcp for that entry.


In both cases all auth entries for all other DLE's are 'BSD'.

In both cases only that one DLE is reported as having either bsdtcp or 
bsdudp, in both cases matching what is in the amanda.conf


So I'd say that was all as expected.



Having changed this one DLE to bsdudp I'm still seeing intermittent 
problems with the nightly backup run.


I'd like to change to bsdtcp as it seems more robust but as I've mentioned 
there were some serious problems with that.


Any further advice or help would be appreciated.

Thanks.








Jean-Louis

Steve Wray wrote:

Jean-Louis Martineau wrote:

Steve Wray wrote:

Jean-Louis Martineau wrote:

Steve Wray wrote:


On the client, in the sendbackup.20100106012630.debug log I see:

sendbackup-gnutar: time 0.056: /usr/lib/amanda/runtar: pid 3348
sendbackup: time 0.057: started backup
sendbackup: time 90.352: index tee cannot write [Broken pipe]
sendbackup: time 90.352: pid 3346 finish time Wed Jan  6 01:28:01 2010

90 seconds, it's not a dtimeout issue.

Post all debug files for the run.
You can also try the bsdtcp auth, it is more firewall friendly.


Ah hang on, am I right in understanding that you can't have just 
one dle using bsdtcp auth? That they would all have to have it? (ie 
the inetd configuration)

All dles for a client must have the same auth.
different client can have different auth.


We are going around in circles a little here.

Allow me to try to make things very clear.

In my amanda.conf I have a dumptype defined as such:

(I've included the parent dumptypes. The 'global' dumptype is empty)

define dumptype root-tar {
    global
    program "GNUTAR"
    comment "root partitions dumped with tar"
    compress none
    index
    exclude list "/etc/amanda/exclude.gtar"
    priority low
}

define dumptype nocomp-root-tar {
    root-tar
    comment "Root partitions without compression"
    compress none
}

define dumptype problem-nocomp-root-tar {
    nocomp-root-tar
    comment "Root partitions without compression, problem client"
    compress none
    auth "bsdudp"
    #auth "bsdtcp"
}

There are several DLEs for clients using the 'nocomp-root-tar' 
dumptype and only *one* DLE for *one* client using the 
'problem-nocomp-root-tar' dumptype.


With the bsdudp line uncommented everything is happy with an amcheck.

With the bsdtcp line uncommented (and the bsdudp line commented out) 
*no* client is happy with the amcheck *other* than the client which 
uses 'problem-nocomp-root-tar'. However, as noted in another email 
this is intermittent, sometimes some clients using nocomp-root-tar 
are happy. So far its not exhibiting much pattern that I can see.


The above *does* include the fact that I *do* change the inetd.conf 
on the client which uses problem-nocomp-root-tar *and* restart inetd.


So, with a change of one line in a dumptype in a DLE used by one 
client, all other clients have problems.



Perhaps I am misunderstanding something basic about dumptype 
configuration?
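
For reference, the inheritance described above can be modelled in a few lines. This is a simplified Python sketch of how one would expect the dumptypes to resolve on paper, not Amanda's actual parser: it assumes parent dumptypes are applied first and the dumptype's own options override them, and it assumes bsd when no auth is set. The dictionary layout and the resolve helper are hypothetical.

```python
# Simplified model of the three dumptypes quoted above (plus the empty 'global').
DUMPTYPES = {
    "global": {"parents": [], "options": {}},
    "root-tar": {
        "parents": ["global"],
        "options": {"program": "GNUTAR", "compress": "none",
                    "index": True, "priority": "low"},
    },
    "nocomp-root-tar": {
        "parents": ["root-tar"],
        "options": {"compress": "none"},
    },
    "problem-nocomp-root-tar": {
        "parents": ["nocomp-root-tar"],
        "options": {"compress": "none", "auth": "bsdudp"},
    },
}


def resolve(name, dumptypes=DUMPTYPES):
    """Merge inherited options, with the dumptype's own options winning."""
    merged = {}
    for parent in dumptypes[name]["parents"]:
        merged.update(resolve(parent, dumptypes))
    merged.update(dumptypes[name]["options"])
    return merged


for name in ("nocomp-root-tar", "problem-nocomp-root-tar"):
    # 'bsd' here is the assumed default when no auth option is present.
    print(name, resolve(name).get("auth", "bsd"))
```

In this model an auth line in the child dumptype never reaches its parent or siblings, which matches the 'amadmin CONFIG disklist' output reported earlier in this message.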














Re: intermittent amanda failure

2010-01-07 Thread Brian Cuttler


On Wed, Jan 06, 2010 at 04:15:41PM -0600, Dustin J. Mitchell wrote:
 On Wed, Jan 6, 2010 at 4:01 PM, Steve Wray steve.w...@cwa.co.nz wrote:
  Am I to understand that there could be a problem in having 'too many' DLE's
  for bsd or bsdudp to cope with?
 
  I never thought of there being a limit to the number of DLE's before... Our
  disklist file has 178.
 
 Yes, it's quite possible, and quite common with folks who have an
 Amanda built with only a small range of available ports.

Caught my site: a large number of DLEs for a single client system;
the overall number of DLEs across multiple clients was not an issue. With
Dustin and Jean-Louis' direction I was able to configure the single
problem client to use the TCP protocol and it's been smooth sailing
ever since.



IMPORTANT NOTICE: This e-mail and any attachments may contain
confidential or sensitive information which is, or may be, legally
privileged or otherwise protected by law from further disclosure.  It
is intended only for the addressee.  If you received this in error or
from someone who was not authorized to send it to you, please do not
distribute, copy or use it or any attachments.  Please notify the
sender immediately by reply e-mail and delete this from your
system. Thank you for your cooperation.




Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

Steve Wray wrote:

Dustin J. Mitchell wrote:

I suspect an estimate or data timeout.  Have you tried increasing
dtimeout and etimeout?



etimeout 2000
dtimeout 2000

I'd be surprised. These seem like fairly substantial values. 2000 
seconds is roughly half an hour. I'll increase them by another 1000 
seconds though, to see what happens.


It failed again last night with 3000 second timeouts for these values.

How can I verify whether there's a problem with timeouts?

Thanks.





Re: intermittent amanda failure

2010-01-06 Thread Jean-Louis Martineau

Steve Wray wrote:


On the client, in the sendbackup.20100106012630.debug log I see:

sendbackup-gnutar: time 0.056: /usr/lib/amanda/runtar: pid 3348
sendbackup: time 0.057: started backup
sendbackup: time 90.352: index tee cannot write [Broken pipe]
sendbackup: time 90.352: pid 3346 finish time Wed Jan  6 01:28:01 2010

90 seconds, it's not a dtimeout issue.

Post all debug files for the run.
You can also try the bsdtcp auth, it is more firewall friendly.


Jean-Louis
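
The reasoning above (the failure shows up 90 seconds into the run, far short of a 2000 or 3000 second timeout) can be checked mechanically against a debug log. This is a rough Python sketch, not an Amanda tool: the regular expression is an assumption based only on the four log lines quoted in this thread, and the helper name is made up.

```python
import re

# Matches lines such as 'sendbackup: time 90.352: index tee cannot write [Broken pipe]'.
# The pattern is inferred from the lines quoted in this thread only.
LINE_RE = re.compile(r"^(?P<prog>[\w-]+): time (?P<secs>\d+\.\d+): (?P<msg>.*)$")


def first_error_time(log_text, needle="cannot write"):
    """Return the elapsed seconds of the first line whose message contains `needle`."""
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and needle in m.group("msg"):
            return float(m.group("secs"))
    return None


LOG = """\
sendbackup-gnutar: time 0.056: /usr/lib/amanda/runtar: pid 3348
sendbackup: time 0.057: started backup
sendbackup: time 90.352: index tee cannot write [Broken pipe]
sendbackup: time 90.352: pid 3346 finish time Wed Jan  6 01:28:01 2010
"""

DTIMEOUT = 3000  # seconds, the larger value tried in this thread

if __name__ == "__main__":
    elapsed = first_error_time(LOG)
    print(f"failure at {elapsed:.3f}s, dtimeout is {DTIMEOUT}s")
    if elapsed < DTIMEOUT:
        print("the run ended well before dtimeout could have expired")
    else:
        print("a timeout cannot be ruled out from elapsed time alone")
```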


Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

Jean-Louis Martineau wrote:

Steve Wray wrote:


On the client, in the sendbackup.20100106012630.debug log I see:

sendbackup-gnutar: time 0.056: /usr/lib/amanda/runtar: pid 3348
sendbackup: time 0.057: started backup
sendbackup: time 90.352: index tee cannot write [Broken pipe]
sendbackup: time 90.352: pid 3346 finish time Wed Jan  6 01:28:01 2010

90 seconds, it's not a dtimeout issue.

Post all debug files for the run.
You can also try the bsdtcp auth, it is more firewall friendly.


I just tried following the instructions here:

http://wiki.zmanda.com/index.php/How_To:Configure_bsd,_bsdtcp,_or_bsdudp_authentication

I've added a single new dle config for this host:

define dumptype problemserver-nocomp-root-tar {
nocomp-root-tar
comment Root partitions with compression
compress none
auth bsdtcp
}

and a single change to the disklist file to refer to this config entry for 
just that one single host.


When I 'su - backup' and run 'amcheck -c' I get errors such as:

ERROR: NAK an amanda client: user root from the amanda server is not 
allowed to execute the service noop: Please add amdump to the line in 
/var/backups/.amandahosts on the client


for *every* amanda client in the list.
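
To make the error message concrete, here is a small Python sketch of the check it describes: whether a line in .amandahosts lets a given user from a given host run a given service. It is a simplified model, not Amanda's code. It assumes lines of the form 'hostname username service...', that 'amdump' stands for noop, selfcheck, sendsize and sendbackup, and it ignores DNS checks and any defaults that apply when fields are omitted. The user names in the example are for illustration.

```python
# Assumed expansion of the 'amdump' shorthand; check amanda(8) for your version.
AMDUMP_SERVICES = {"noop", "selfcheck", "sendsize", "sendbackup"}


def line_allows(line, host, user, service):
    """Return True if one .amandahosts line permits `service` for `user` at `host`.

    Only handles lines that name the host, the user and at least one service.
    """
    fields = line.split()
    if len(fields) < 3:
        return False
    l_host, l_user, *services = fields
    if l_host != host or l_user != user:
        return False
    allowed = set()
    for name in services:
        if name == "amdump":
            allowed |= AMDUMP_SERVICES
        else:
            allowed.add(name)
    return service in allowed


if __name__ == "__main__":
    server = "fileserver.internal.cwa.co.nz"
    # A line written for user 'backup' does not help a request that arrives as 'root'.
    print(line_allows(f"{server} backup amdump", server, "root", "noop"))  # False
    print(line_allows(f"{server} root amdump", server, "root", "noop"))    # True
```

Note that the error names the user as root even though amcheck was run as backup, so under this model a line listing only the backup user would not satisfy the request.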

It seems rather odd that this change for a single DLE would introduce this 
error for every other DLE. It's as if something has leaked out.


I made no changes to any other configuration yet, just the change to the 
amanda server modifying a dle for a single client.


I expected that it might fail for that client, but not for all of them...


(I'll get to posting the debug logs soon).






Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

Jean-Louis Martineau wrote:

Steve Wray wrote:


On the client, in the sendbackup.20100106012630.debug log I see:

sendbackup-gnutar: time 0.056: /usr/lib/amanda/runtar: pid 3348
sendbackup: time 0.057: started backup
sendbackup: time 90.352: index tee cannot write [Broken pipe]
sendbackup: time 90.352: pid 3346 finish time Wed Jan  6 01:28:01 2010

90 seconds, it's not a dtimeout issue.

Post all debug files for the run.
You can also try the bsdtcp auth, it is more firewall friendly.


Ah hang on, am I right in understanding that you can't have just one dle 
using bsdtcp auth? That they would all have to have it? (ie the inetd 
configuration)







Jean-Louis





Re: intermittent amanda failure

2010-01-06 Thread Dustin J. Mitchell
On Wed, Jan 6, 2010 at 3:42 PM, Steve Wray steve.w...@cwa.co.nz wrote:
 Ah hang on, am I right in understanding that you can't have just one dle
 using bsdtcp auth? That they would all have to have it? (ie the inetd
 configuration)

Well, all DLEs on a given host have to have the same auth.  If you
define different host-level parameters for DLEs on the same host, the
results are undefined.

The inetd configuration is on the client, and has to be set up to
correspond with the auth you want to use.  Check out Paul Yeatman's
wonderful amanda-auth(7) for the nitty-gritty details.
  http://wiki.zmanda.com/man/amanda-auth.7.html
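
The rule that all DLEs on a given host need the same auth is easy to check once the DLEs are available as data. Below is a minimal Python sketch, assuming the DLEs have already been collected as (host, disk, auth) tuples, for example by reading through the output of 'amadmin CONFIG disklist' by hand. The function name and the sample hosts are hypothetical.

```python
from collections import defaultdict


def hosts_with_mixed_auth(dles):
    """Return {host: sorted auth names} for hosts whose DLEs disagree on auth.

    `dles` is an iterable of (host, disk, auth) tuples. Auth names are compared
    case-insensitively, so 'BSD' and 'bsd' count as the same value.
    """
    seen = defaultdict(set)
    for host, _disk, auth in dles:
        seen[host].add(auth.lower())
    return {host: sorted(auths) for host, auths in seen.items() if len(auths) > 1}


# Made-up entries, for illustration only.
DLES = [
    ("client-a.example.com", "/", "BSD"),
    ("client-a.example.com", "/home", "bsdtcp"),
    ("client-b.example.com", "/", "bsdtcp"),
]

if __name__ == "__main__":
    for host, auths in hosts_with_mixed_auth(DLES).items():
        print(f"{host} mixes auth types: {', '.join(auths)}")
```

An empty result means no host mixes auth types, which is the state Dustin describes as required.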

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: intermittent amanda failure

2010-01-06 Thread Jean-Louis Martineau

Steve Wray wrote:

Jean-Louis Martineau wrote:

Steve Wray wrote:


On the client, in the sendbackup.20100106012630.debug log I see:

sendbackup-gnutar: time 0.056: /usr/lib/amanda/runtar: pid 3348
sendbackup: time 0.057: started backup
sendbackup: time 90.352: index tee cannot write [Broken pipe]
sendbackup: time 90.352: pid 3346 finish time Wed Jan  6 01:28:01 2010

90 seconds, it's not a dtimeout issue.

Post all debug files for the run.
You can also try the bsdtcp auth, it is more firewall friendly.


Ah hang on, am I right in understanding that you can't have just one 
dle using bsdtcp auth? That they would all have to have it? (ie the 
inetd configuration)

All DLEs for a client must have the same auth.
Different clients can have different auth.

Jean-Louis




Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

Jean-Louis Martineau wrote:

Steve Wray wrote:

Dustin J. Mitchell wrote:

On Wed, Jan 6, 2010 at 4:01 PM, Steve Wray steve.w...@cwa.co.nz wrote:
Am I to understand that there could be a problem in having 'too many' DLE's
for bsd or bsdudp to cope with?

I never thought of there being a limit to the number of DLE's before... Our
disklist file has 178.


Yes, it's quite possible, and quite common with folks who have an
Amanda built with only a small range of available ports.


Ok I'm not sure about amanda as built for Debian but would it be safer 
for me to change pretty much our entire system to use bsdtcp?


So far my experiments in this regard have not been great; adding just 
one single DLE using bsdtcp has caused all other DLE's to have problems.


You can't change only one dle, you must change all dles for that client, 
you must also change inetd/xinetd on that client, don't change it on the 
server.


This particular client has only one DLE.

The other DLE's which are having problems are all for other clients.

Ie: a change to bsdtcp in the DLE for one client has broken all other clients.





Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

Dustin J. Mitchell wrote:

On Wed, Jan 6, 2010 at 3:42 PM, Steve Wray steve.w...@cwa.co.nz wrote:

Ah hang on, am I right in understanding that you can't have just one dle
using bsdtcp auth? That they would all have to have it? (ie the inetd
configuration)


Well, all DLEs on a given host have to have the same auth.  If you
define different host-level parameters for DLEs on the same host, the
results are undefined.


On a given host, sure.

In the case of bsdtcp it appears that there needs to be some config on the 
server to handle this (in inetd/xinetd on the server). I think that's what 
was causing the cascade of errors I posted before.




The inetd configuration is on the client, and has to be set up to
correspond with the auth you want to use.  Check out Paul Yeatman's
wonderful amanda-auth(7) for the nitty-gritty details.
  http://wiki.zmanda.com/man/amanda-auth.7.html


Yes I've been reading that. This caught my eye:

quote
bsd communication and authentication

The authentication is done using .amandahosts file in the Amanda user's 
home directory. The protocol between Amanda server and client is UDP. The 
number of disk list entries (DLEs)--number of Amanda clients--is limited by 
the UDP packet size. This authentication protocol will use a different port 
for each data stream (see PORT USAGE below)


bsdudp communication and authentication

The authentication is done using .amandahosts files in the Amanda user's 
home directory. It uses UDP protocol between Amanda server and client for 
data and hence the number of DLEs is limited by the UDP packet size. It 
uses one TCP port to establish the connection and multiplexes all data 
streams using one port on the server (see PORT USAGE below).

/quote


Am I to understand that there could be a problem in having 'too many' DLE's 
for bsd or bsdudp to cope with?


I never thought of there being a limit to the number of DLE's before... Our 
disklist file has 178.






Re: intermittent amanda failure

2010-01-06 Thread Jean-Louis Martineau

Steve Wray wrote:

Dustin J. Mitchell wrote:

On Wed, Jan 6, 2010 at 4:01 PM, Steve Wray steve.w...@cwa.co.nz wrote:
Am I to understand that there could be a problem in having 'too many' DLE's
for bsd or bsdudp to cope with?

I never thought of there being a limit to the number of DLE's before... Our
disklist file has 178.


Yes, it's quite possible, and quite common with folks who have an
Amanda built with only a small range of available ports.


Ok I'm not sure about amanda as built for Debian but would it be safer 
for me to change pretty much our entire system to use bsdtcp?


So far my experiments in this regard have not been great; adding just 
one single DLE using bsdtcp has caused all other DLE's to have problems.


You can't change only one dle, you must change all dles for that client, 
you must also change inetd/xinetd on that client, don't change it on the 
server.




Re: intermittent amanda failure

2010-01-06 Thread Dustin J. Mitchell
On Wed, Jan 6, 2010 at 4:01 PM, Steve Wray steve.w...@cwa.co.nz wrote:
 Am I to understand that there could be a problem in having 'too many' DLE's
 for bsd or bsdudp to cope with?

 I never thought of there being a limit to the number of DLE's before... Our
 disklist file has 178.

Yes, it's quite possible, and quite common with folks who have an
Amanda built with only a small range of available ports.

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

Dustin J. Mitchell wrote:

On Wed, Jan 6, 2010 at 4:01 PM, Steve Wray steve.w...@cwa.co.nz wrote:

Am I to understand that there could be a problem in having 'too many' DLE's
for bsd or bsdudp to cope with?

I never thought of there being a limit to the number of DLE's before... Our
disklist file has 178.


Yes, it's quite possible, and quite common with folks who have an
Amanda built with only a small range of available ports.


Ok I'm not sure about amanda as built for Debian but would it be safer for 
me to change pretty much our entire system to use bsdtcp?


So far my experiments in this regard have not been great; adding just one 
single DLE using bsdtcp has caused all other DLE's to have problems.


And intermittent problems at that

Sometimes 'amcheck -c' will report an error like:

ERROR: NAK amanda client: user root from amanda server is not allowed 
to execute the service noop: Please add amdump to the line in 
/var/backups/.amandahosts on the client


and the next time I run amcheck this error is not reported. It might be 
fine on another run and then bad again later.


Whereas the amanda client which I configured for bsdtcp -- the only one I 
have thus configured so far -- does not return any error.


(and yes, I am running amcheck from a 'su - backup' session where backup is 
the user that amanda runs as).







Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

Jean-Louis Martineau wrote:

Steve Wray wrote:

Jean-Louis Martineau wrote:

Steve Wray wrote:


On the client, in the sendbackup.20100106012630.debug log I see:

sendbackup-gnutar: time 0.056: /usr/lib/amanda/runtar: pid 3348
sendbackup: time 0.057: started backup
sendbackup: time 90.352: index tee cannot write [Broken pipe]
sendbackup: time 90.352: pid 3346 finish time Wed Jan  6 01:28:01 2010

90 seconds, it's not a dtimeout issue.

Post all debug files for the run.
You can also try the bsdtcp auth, it is more firewall friendly.


Ah hang on, am I right in understanding that you can't have just one 
dle using bsdtcp auth? That they would all have to have it? (ie the 
inetd configuration)

All DLEs for a client must have the same auth.
Different clients can have different auth.


We are going around in circles a little here.

Allow me to try to make things very clear.

In my amanda.conf I have a dumptype defined as such:

(I've included the parent dumptypes. The 'global' dumptype is empty)

define dumptype root-tar {
global
program GNUTAR
comment root partitions dumped with tar
compress none
index
exclude list /etc/amanda/exclude.gtar
priority low
}

define dumptype nocomp-root-tar {
root-tar
comment Root partitions without compression
compress none
}

define dumptype problem-nocomp-root-tar {
nocomp-root-tar
comment Root partitions without compression, problem client
compress none
auth bsdudp
#auth bsdtcp
}

There are several DLEs for clients using the 'nocomp-root-tar' dumptype and 
only *one* DLE for *one* client using the 'problem-nocomp-root-tar' dumptype.


With the bsdudp line uncommented everything is happy with an amcheck.

With the bsdtcp line uncommented (and the bsdudp line commented out) *no* 
client is happy with the amcheck *other* than the client which uses 
'problem-nocomp-root-tar'. However, as noted in another email, this is 
intermittent: sometimes some clients using nocomp-root-tar are happy. So 
far it's not exhibiting much of a pattern that I can see.


The above *does* include the fact that I *do* change the inetd.conf on the 
client which uses problem-nocomp-root-tar *and* restart inetd.


So, with a change of one line in a dumptype in a DLE used by one client, 
all other clients have problems.



Perhaps I am misunderstanding something basic about dumptype configuration?
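
For reference, the expectation behind the question can be written down as a small model. The Python sketch below treats dumptype inheritance as "take the parent's settings, then apply the child's own on top", with the bsdtcp line active. This is a simplified model of the configuration quoted above, not Amanda's parser, and the default auth of "bsd" is an assumption for when no dumptype in the chain sets one.

```python
DEFAULT_AUTH = "bsd"  # assumed default when nothing in the chain sets auth

# name -> (parent names, own settings). Mirrors the three dumptypes quoted
# above, with 'auth bsdtcp' active; the 'global' dumptype is empty.
DUMPTYPES = {
    "global": ([], {}),
    "root-tar": (["global"], {"program": "GNUTAR", "compress": "none", "priority": "low"}),
    "nocomp-root-tar": (["root-tar"], {"compress": "none"}),
    "problem-nocomp-root-tar": (["nocomp-root-tar"], {"compress": "none", "auth": "bsdtcp"}),
}


def effective(name, dumptypes=DUMPTYPES):
    """Resolve a dumptype: parents first, then its own settings on top."""
    parents, own = dumptypes[name]
    merged = {}
    for parent in parents:
        merged.update(effective(parent, dumptypes))
    merged.update(own)
    return merged


def effective_auth(name, dumptypes=DUMPTYPES):
    """Auth a DLE using this dumptype would end up with under this model."""
    return effective(name, dumptypes).get("auth", DEFAULT_AUTH)


if __name__ == "__main__":
    for name in ("nocomp-root-tar", "problem-nocomp-root-tar"):
        print(f"{name}: auth {effective_auth(name)}")
```

Under this model, settings flow from parent to child only, so the auth line in problem-nocomp-root-tar should leave DLEs that use nocomp-root-tar on the default.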





Re: intermittent amanda failure

2010-01-06 Thread Jean-Louis Martineau
Run 'amadmin CONFIG disklist' and check the auth is set as expected 
for all dles.


Jean-Louis


Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

Jean-Louis Martineau wrote:
Run 'amadmin CONFIG disklist' and check the auth is set as expected 
for all dles.


I've done this, with the amanda.conf having bsdudp and with it having 
bsdtcp for that entry.


In both cases all auth entries for all other DLE's are 'BSD'.

In both cases only that one DLE is reported as having either bsdtcp or 
bsdudp, in both cases matching what is in the amanda.conf


So I'd say that was all as expected.






Re: intermittent amanda failure

2010-01-06 Thread Steve Wray

I'll attach some debug logs.

For the purposes of this test I cut the disklist file down to two entries:

One entry is a client which is configured to use simple BSD auth.

The other entry is a client which is configured to use bsdtcp auth.

Both of these have been verified by running amadmin disklist.


This file:
amcheck.20100107131821.debug
is from the client whose DLE has it using BSD auth.

This file:
amandad.20100107131826.debug
is from the client whose DLE has it using bsdtcp auth.


This file:
amcheck.20100107131821.debug
is from the server on that same run.

The amcheck -c command reported:

bac...@fileserver:~$ amcheck -c cwa-lto

Amanda Backup Client Hosts Check

ERROR: NAK zimbra1.internal.cwa.co.nz: user root from 
fileserver.internal.cwa.co.nz is not allowed to execute the service noop: 
Please add amdump to the line in /var/backups/.amandahosts on the client

Client check: 2 hosts checked in 5.097 seconds, 1 problem found

(brought to you by Amanda 2.5.2p1)





amandad: debug 1 pid 5084 ruid 34 euid 34: start at Thu Jan  7 13:18:21 2010
Could not open conf file /etc/amanda/amanda-client.conf: No such file or directory
amandad: time 0.000: security_getdriver(name=bsd) returns 0xb7f834c0
amandad: version 2.5.2p1
amandad: time 0.000: build: VERSION=Amanda-2.5.2p1
amandad: time 0.000: BUILT_DATE=Sat Aug 16 16:06:29 ART 2008
amandad: time 0.000: BUILT_MACH=Linux rover 2.6.25.15 #4 SMP Thu Aug 7 11:07:30 MDT 2008 i686 GNU/Linux
amandad: time 0.000: CC=gcc
amandad: time 0.000: CONFIGURE_COMMAND='./configure' '--prefix=/usr' 

Re: intermittent amanda failure

2010-01-05 Thread Dustin J. Mitchell
I suspect an estimate or data timeout.  Have you tried increasing
dtimeout and etimeout?

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com


Re: intermittent amanda failure

2010-01-05 Thread Steve Wray

Dustin J. Mitchell wrote:

I suspect an estimate or data timeout.  Have you tried increasing
dtimeout and etimeout?



etimeout 2000
dtimeout 2000

I'd be surprised. These seem like fairly substantial values. 2000 seconds 
is roughly half an hour. I'll increase them by another 1000 seconds though, 
to see what happens.



