Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-27 Thread Steven Schoch
Problem found and fixed!
The problem was that when I ran amcheck (and amdump) manually it could reach
the client machine, but when run from crontab it failed.

The problem was caused when we upgraded our main server, which is the tape 
server, to a new machine.  As part of this upgrade, we gave the old machine 
a new IP address but left it on the network.  What was happening is that the 
old amanda machine was running amcheck and amdump at about the same time as 
the new server.  However, there was some sort of problem when the old server 
ran amcheck that caused the client's inetd service to say:

May 27 16:19:47 marge inetd[94728]: amanda/udp server failing (looping), service terminated

Then, of course, the new server could no longer connect.
The solution was to remove the crontab entries on the old machine.
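
For anyone retiring an old server the same way, here is a minimal sketch of that cleanup step. It assumes the jobs live in the amanda user's own crontab and that every line to drop invokes amcheck or amdump; the helper name drop_amanda_jobs is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a crontab listing on stdin and print it back
# without any line that invokes amcheck or amdump.
drop_amanda_jobs() {
    # grep -v exits non-zero when nothing is left, which is fine here.
    grep -v -E '(^|[[:space:]/])(amcheck|amdump)([[:space:]]|$)' || true
}

# On the old machine, as the amanda user, review the result first:
#   crontab -l | drop_amanda_jobs
# and only then install it:
#   crontab -l | drop_amanda_jobs | crontab -
```

Reviewing the filtered listing before piping it back into crontab avoids wiping unrelated jobs by accident.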
--
Steven Schoch



Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-24 Thread Steven Schoch
Now I suspect this is not an amanda problem.
As a test, I added two more lines to amanda's crontab file.  Now it says:
0 16 * * 1-5    /usr/sbin/amcheck -m KBack
47 14 * * 1-5   /usr/sbin/amcheck -m KBack
49 14 * * 1-5   /usr/sbin/amcheck -m KBack
45 0 * * 2-6    /usr/sbin/amdump KBack
I get three "amcheck" email messages.  The third (at 4 p.m.) says:
WARNING: marge: selfcheck request timed out.  Host down?
But the other two do not.  On marge, the latest amandad.*.debug and 
selfcheck.*.debug are dated 14:49.  So it appears that something is 
happening at about 4 p.m. that is interfering with the communication with
marge.  Or maybe it's happening at the top of every hour?  I'll try changing
it to 4:05 p.m. and see if that makes a difference.

--
Steven Schoch



Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Gene Heskett
On Thursday 20 May 2004 17:17, Paul Bijnens wrote:
>Gene Heskett wrote:
>> On Thursday 20 May 2004 14:10, Steven Schoch wrote:
>>>./configure --without-server --with-group=amanda
>>> --with-user=amanda --with-amandahosts
>>
>> This is the wrong group, the group may be bin, backup or disk, but
>> it should have essentially root rights to the system. A gid <=10
>> is desirable.
>
>But it does work if you use gnutar instead of dump, or if
>you add amanda to group disk as additional group (and maybe add
>"groups = yes" in xinetd.conf)
>
>Maybe that is indeed the cause of the problems, but I don't
>understand how running it manually on the server is different from
>crontab. In both cases the client is invoked through xinetd.

One thing not spec'd here Paul, is who ran it, in both cases.  Any 
test runs should be done as the user amanda, and the crontab entry 
should be in the crontab for the user amanda.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.22% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.


Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Jon LaBadie
On Thu, May 20, 2004 at 01:58:34PM -0700, Steven Schoch wrote:
> 
> 
> Here's some new information:  I ran amdump on the command line, after "su
> amanda".  It successfully connected to marge (although it failed with "can't
> switch to incremental dump" as expected).  The exact same command fails 
> when run from cron.  This leads me to believe that either:
> 
> 1.  It's a timing issue; or
> 2.  There is something different in the environment when run from cron.


Indeed, there are very likely significant environment differences.
Cron jobs are run without .profile/.*rc file processing, with
/bin/sh even if that is not the user's login shell, and with a
minimal PATH setting.

There has been at least one instance on the list of multiple
installations of amanda.  The login environment executed one and,
unexpectedly, cron jobs executed another.
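
One way to see those differences is to dump the environment from both contexts and compare. This is only a sketch: the helper name env_only_in and the /tmp file names are placeholders, not anything Amanda provides.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of env dump $1 that are missing from
# (or different in) env dump $2.  Both files are plain `env` output.
env_only_in() {
    tmp="${TMPDIR:-/tmp}/env_only_in.$$"
    sort "$2" > "$tmp"
    sort "$1" | comm -23 - "$tmp"
    rm -f "$tmp"
}

# Capture the two environments (file names are placeholders):
#   from a login shell as amanda:   env > /tmp/env.login
#   from a temporary crontab line:  * * * * * env > /tmp/env.cron
# Then compare, and check which binary each environment would pick up:
#   env_only_in /tmp/env.login /tmp/env.cron
#   command -v amcheck
```

If the crontab entries already use absolute paths, a PATH difference alone would not explain a failure, but a second installation found first on the login PATH still could.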

-- 
Jon H. LaBadie  [EMAIL PROTECTED]
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)


Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Eric Siegerman
On Thu, May 20, 2004 at 04:47:32PM -0400, Gene Heskett wrote:
> On Thursday 20 May 2004 14:10, Steven Schoch wrote:
> > The client (named "marge") is
> > running FreeBSD 4.7, running amanda 2.4.4p2 compiled from source
> >
> >./configure --without-server --with-group=amanda --with-user=amanda
> >--with-amandahosts
> 
> This is the wrong group

Agreed.  For our FreeBSD systems it's group "operator".


> the group may be bin, backup or disk, but it 
> should have essentially root rights to the system.

I strongly disagree!  Maybe "backup" or "disk", depending on the
box, but NOT "bin" (I posted on this a couple of days ago), and
NOT "essentially root rights".

The only special privilege the Amanda client needs is to read
data off the disks.  Precisely what privilege that consists of
depends on the circumstances:
  - For gtar, you need root.  More specifically, gtar itself
    needs root, but the rest of the Amanda process tree doesn't.
    Amanda provides the runtar wrapper program, which is
    installed setuid root, to give gtar the privilege it needs.
    Thus, Amanda can run under any user+group combination at all.

    Safety suggests creating a user and group specifically for
    Amanda, especially because that's the best way to keep
    ordinary users from being able to use runtar for their own
    nefarious purposes (this works because runtar is installed
    as:
        -rwsr-x---   1 root XXX  36615 Jun 25  2003 runtar
    where XXX is the group specified to "--with-group", i.e.
    without world execute permission).

    On the system where we use gtar, I made the obvious choice:
    "--with-group=amanda".

  - On some systems, dump (or the local equivalent -- ufsdump,
    vfxdump, or whatever) also needs root.  But again, Amanda
    provides a setuid-root wrapper, rundump, to accomplish that,
    so all of the above applies to this case too.  (rundump is
    only installed on systems whose dump requires it, or you can
    force the issue by passing "--with-rundump" to configure.)

  - On other systems, including FreeBSD, all dump needs is read
    (but NOT write) permission on the disk-device special files.
    One of these, chosen at random from one of our FreeBSD boxen,
    looks like this:
        crw-r-----  2 root  operator  116,  27 Apr 28  2003 /dev/rad3d
    Thus, building Amanda with "--with-group=operator" is all
    that's needed.  On this box, the "operator" group doesn't
    have access to very much else at all.  If it had, I'd
    probably have created a new group for Amanda and chgrp'ed the
    special files in question.

    On our Solaris boxes, the group with read-only access to the
    disk special files is "sys".  So that's what I built Amanda
    with for Solaris.  To be honest, I no longer remember whether
    either Solaris or FreeBSD sets up disk-device special files
    this way "out of the box", or whether I did it myself.
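
The permission check described in the cases above can be sketched as a small test on the mode string that ls -l prints. The helper names group_perms and group_read_only are invented for this example, and the "brw-rw----" string in the comments is a made-up counter-example rather than output from a real system.

```shell
#!/bin/sh
# Hypothetical helper: print the three group permission characters of an
# ls -l style mode string such as "crw-r-----".
group_perms() {
    printf '%s\n' "$1" | cut -c5-7
}

# Hypothetical helper: succeed if the group may read but not write.
group_read_only() {
    case "$(group_perms "$1")" in
        r-?) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example against a real device (path taken from the listing above):
#   group_read_only "$(ls -l /dev/rad3d | cut -c1-10)" && echo "read-only for group"
# A device with mode "brw-rw----" would fail this check.
```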


> A gid <=10 is
> desirable.

This doesn't seem relevant one way or the other.  No gids have
any special privilege as far as the kernel is concerned.  (And
even for uids, <=10 has no significance -- only ==0 is magic.)

--

|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.[EMAIL PROTECTED]
|  |  /
It must be said that they would have sounded better if the singer
wouldn't throw his fellow band members to the ground and toss the
drum kit around during songs.
- Patrick Lenneau


Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Steven Schoch
Paul Bijnens wrote:
> Gene Heskett wrote:
>> This is the wrong group, the group may be bin, backup or disk, but it
>> should have essentially root rights to the system. A gid <=10 is
>> desirable.
>
> But it does work if you use gnutar instead of dump, or if
> you add amanda to group disk as additional group (and maybe add
> "groups = yes" in xinetd.conf)

Actually, on a FreeBSD system there usually isn't a group called "disk".
There is, however, a group called "operator" that has read permission on the
disks.  In this case, inetd (FreeBSD does not normally use xinetd) was adding
the operator group as specified in /etc/group, so it worked.
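
A quick way to confirm the group membership itself is sketched below; in_group is a made-up helper name. It only reports what the system's group database says, not whether inetd actually passes the supplementary groups on to amandad.

```shell
#!/bin/sh
# Hypothetical helper: succeed if user $1 is a member of group $2,
# counting both the primary group and the supplementary ones.
in_group() {
    id -Gn "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

# Example for the FreeBSD client discussed here:
#   in_group amanda operator && echo "amanda is in operator"
```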

> Maybe that is indeed the cause of the problems, but I don't
> understand how running it manually on the server is different from
> crontab. In both cases the client is invoked through xinetd.

You're right.  It makes no difference.  When it's run from the tape server's
crontab, it appears that amandad doesn't even get invoked.  I'm still
looking for the difference.

By the way, on RedHat systems, the group disk has read AND WRITE permissions 
on the disk devices.  This is not only unnecessary, but a bad idea as well.  
Read permissions are sufficient to perform a dump.

--
Steven Schoch



Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Paul Bijnens
Steven Schoch wrote:
> Paul Bijnens wrote:
>> But there should be an amcheck.*.debug file on the server.
>> What's in there?
>
> There is no such file on the server.  All I have are amdump.1 (and .2
> and .3) and log.20040520.0 (and other dates).  The only error I see
> there is:
>
> ERROR planner Request to marge timed out.

amdump.* and log.* files are in a subdir of the home dir of amanda,
but amcheck.*.debug should be in /tmp/amanda on the server too.

The fact that on the client you don't find any debug files,
when run from crontab, seems to indicate that the request packets
never reached the client, or got stuck in some firewall or so.
Could you find out with tcpdump, ethereal or similar programs
if any packets do arrive at the client?

> In looking closer at the amcheck message, I notice its timestamp is
> 16:00:30, exactly 30 seconds after amcheck is started by cron.  Looking
> at the source, 30 seconds is ACK_WAIT * ACK_TRIES.  So amcheck is
> (trying to) send the REQ packet 3 times, but getting no reply.  The logs
> on marge say the packet was not received.

Indeed, except that I would say the *absence* of logs on marge says
the packet was not received. And that's a weak indication only.
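
A rough sketch of the packet capture suggested above, to be run on the client. It assumes amandad listens on UDP port 10080, the registered "amanda" port; the helper name amanda_filter, the interface em0 and the host name tapeserver are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: build a tcpdump filter for Amanda request packets.
# With a host argument only traffic to or from that host is matched;
# without one, requests from any sender are shown.
amanda_filter() {
    if [ -n "${1:-}" ]; then
        printf 'udp port 10080 and host %s\n' "$1"
    else
        printf 'udp port 10080\n'
    fi
}

# On marge, as root, while cron fires amcheck on the server:
#   tcpdump -n -i em0 "$(amanda_filter tapeserver)"
# Or, to see every machine that talks to amandad:
#   tcpdump -n -i em0 "$(amanda_filter)"
```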
--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/                          email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* kill -9 1,  Alt-F4,  Ctrl-Alt-Del,  AltGr-NumLock,  Stop-A,  ...*
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out  *
***


Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Paul Bijnens
Gene Heskett wrote:
> On Thursday 20 May 2004 14:10, Steven Schoch wrote:
>> ./configure --without-server --with-group=amanda --with-user=amanda
>> --with-amandahosts
>
> This is the wrong group, the group may be bin, backup or disk, but it
> should have essentially root rights to the system. A gid <=10 is
> desirable.

But it does work if you use gnutar instead of dump, or if
you add amanda to group disk as additional group (and maybe add
"groups = yes" in xinetd.conf)

Maybe that is indeed the cause of the problems, but I don't
understand how running it manually on the server is different from
crontab. In both cases the client is invoked through xinetd.
--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/                          email:  [EMAIL PROTECTED]


Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Steven Schoch
Paul Bijnens wrote:
> Steven Schoch wrote:
>>  marge  /var RESULTS MISSING
>
> Is /var the only entry in the disklist for host marge?

No, I only pasted that as an example.  There are actually three: /, /var,
and /usr.  Each fails in the same way.

>> In these cron cases, no files are created in /tmp/amanda on marge.
>
> But there should be an amcheck.*.debug file on the server.
> What's in there?

There is no such file on the server.  All I have are amdump.1 (and .2 and
.3) and log.20040520.0 (and other dates).  The only error I see there is:

ERROR planner Request to marge timed out.

Here's some new information:  I ran amdump on the command line, after "su
amanda".  It successfully connected to marge (although it failed with "can't
switch to incremental dump" as expected).  The exact same command fails when
run from cron.  This leads me to believe that either:

1.  It's a timing issue; or
2.  There is something different in the environment when run from cron.

I would think that the timing is not it, since cron runs amcheck at 4 p.m.
and amdump at 12:45 a.m., but they both fail to connect to marge.

In looking closer at the amcheck message, I notice its timestamp is
16:00:30, exactly 30 seconds after amcheck is started by cron.  Looking at
the source, 30 seconds is ACK_WAIT * ACK_TRIES.  So amcheck is (trying to)
send the REQ packet 3 times, but getting no reply.  The logs on marge say
the packet was not received.
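
The arithmetic behind that observation, as a small sketch. The individual values are inferred from the message (three tries adding up to 30 seconds) and were not read from the Amanda source; ack_timeout is a made-up helper name.

```shell
#!/bin/sh
# Values inferred from the message, not read from the Amanda source:
# three tries adding up to 30 seconds implies a 10-second wait per try.
ACK_WAIT=10
ACK_TRIES=3

# Total seconds amcheck waits for an ACK before giving up.
ack_timeout() {
    echo $(( $1 * $2 ))
}

ack_timeout "$ACK_WAIT" "$ACK_TRIES"   # prints 30
```

A report stamped 16:00:30 for a job started at 16:00:00 is therefore consistent with every try going unanswered.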

--
Steven Schoch



Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Gene Heskett
On Thursday 20 May 2004 14:10, Steven Schoch wrote:
>The machines in question are both on a 10Mb hub.  The tape server is
> a new RedHat Enterprise Linux running amanda-2.4.4p1-0.3E direct
> from the RedHat distribution.  The client (named "marge") is
> running FreeBSD 4.7, running amanda 2.4.4p2 compiled from source
> that I downloaded yesterday in an attempt to solve the problem.  It
> is compiled with:
>
>./configure --without-server --with-group=amanda --with-user=amanda
>--with-amandahosts

This is the wrong group, the group may be bin, backup or disk, but it 
should have essentially root rights to the system. A gid <=10 is 
desirable.
>
>When I run amcheck (as amanda) manually on the RedHat machine, it
> reports no problems with the FreeBSD machine.  The FreeBSD machine
> (marge) creates debug files in /tmp/amanda and everything seems to
> be working fine.
>
>However, when amcheck is run from cron, as amanda, with the
> following line:
>
>0 16 * * 1-5    /usr/sbin/amcheck -m KBack
>
>... it reports, every day:
>
>WARNING: marge: selfcheck request timed out.  Host down?
>
>Also, the amdump command, run from crontab as this:
>
>45 0 * * 2-6    /usr/sbin/amdump KBack
>
>also fails with
>
>  marge  /var RESULTS MISSING
>
>In these cron cases, no files are created in /tmp/amanda on marge.
>
>
>So why does it work in a terminal but fail when run from cron?
>
>By the way, the RedHat tape server is new, but amanda on marge had
> been working fine with an older RedHat system earlier.
>
>--
>Steven Schoch
>

-- 
Cheers, Gene


Re: amcheck/amdump fails (sometimes) to amanda client

2004-05-20 Thread Paul Bijnens
Steven Schoch wrote:
>  marge  /var RESULTS MISSING

Is /var the only entry in the disklist for host marge?

> In these cron cases, no files are created in /tmp/amanda on marge.

But there should be an amcheck.*.debug file on the server.
What's in there?
--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/                          email:  [EMAIL PROTECTED]