Re: NAK: amandad busy

2003-02-19 Thread Dietmar Goldbeck
On Tue, Feb 18, 2003 at 04:24:24PM -0500, Shawn Sanders wrote:
 I am adding a new system to our amanda backup system.  The system has a
 LARGE filesystem so I have
 to use tar to divide it up some.  amcheck runs fine, but when amdump runs I
 get the following error
 in the email report : Request to hostname timed out
 
 The problematic client (Intel Solaris 8) gets this in the amandad.debug
 
 Amanda 2.4 NAK HANDLE 000-D0A00708 SEQ 1045598401
 ERROR amandad busy
 
 The server is running Redhat 7.2.
 
 The two directories I am backing up are 17/18GB.  The tar seems to be taking
 several hours
 to complete.  Do I need to adjust a wait state or perhaps I am missing
 something else.
 

Please check the sendsize debug files and find out how long estimates
are taking. If you are using an older version of GNUtar, estimates are
extremely slow. I recommend upgrading to 1.13.25

I would increase etimeout and dtimeout. From man amanda

  etimeout int
  Default: 300 seconds.  Amount of time per disk on a
  given client that the planner step of amdump will wait
  to get the dump size estimates.  For instance, with the
  default of 300 seconds and four disks on client A,
  planner will wait up to 20 minutes for that machine.  A
  negative value will be interpreted as a total
  amount of time, instead of a per-disk value.

  dtimeout int
  Default: 1800 seconds.  Amount of idle time per disk on
  a given client that a dumper running from within
  amdump will wait before it fails with a data timeout error.
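
The etimeout arithmetic in the man page text above can be sketched in a few lines of Python. This only illustrates the rule as the excerpt states it and is not Amanda's own code; the function name planner_wait_seconds is made up for the example.

```python
def planner_wait_seconds(etimeout: int, disks: int) -> int:
    """Upper bound on how long planner waits for one client's estimates.

    Follows the man page excerpt: a positive etimeout is a per-disk
    value, a negative one is a total for the whole client.
    """
    if disks < 0:
        raise ValueError("disks must not be negative")
    if etimeout < 0:
        # Negative value: total amount of time, whatever the disk count.
        return -etimeout
    # Positive value: per-disk time multiplied by the number of disks.
    return etimeout * disks


# Default of 300 seconds with four disks on one client (20 minutes).
print(planner_wait_seconds(300, 4))
# A negative value is taken as a total for the client.
print(planner_wait_seconds(-9000, 4))
```

With the default the first call gives 1200 seconds, matching the 20 minutes in the excerpt; the second gives 9000 seconds however many disks the client has.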


   Ciao
 Dietmar

-- 
 Alles Gute / best wishes  
 Dietmar Goldbeck E-Mail: [EMAIL PROTECTED]
Reporter (to Mahatma Gandhi): Mr Gandhi, what do you think of Western
Civilization?  Gandhi: I think it would be a good idea.



RE: NAK: amandad busy

2003-02-19 Thread Shawn Sanders
Thanks for the direction. The sendsize.debug shows it starting at

=
sendsize: debug 1 pid 25350 ruid 1026 euid 1026 start time Tue Feb 18 23:11:08 2003
/usr/local/libexec/sendsize: version 2.4.2p1
calculating for amname '/directory', dirname '/directory'
sendsize: getting size via gnutar for /directory level 0
sendsize: missing exclude list file /usr/local/lib/amanda/exclude.gtar discarded
sendsize: running /usr/local/libexec/runtar --create --directory /cmroot/base/rcs --listed-incremental /usr/local/var/amanda/gnutar-lists/pitbullcm-lab.tcs-sec.com_cmroot_base_rcs_0.new --sparse --one-file-system --ignore-failed-read --totals --file /dev/null
sendsize: spawning /usr/local/libexec/runtar in pipeline
sendsize: argument list: /usr/local/bin/tar --create --directory /directory --listed-incremental /usr/local/var/amanda/gnutar-lists/hostname-lab.domainname_directory_0.new --sparse --one-file-system --ignore-failed-read --totals --file /dev/null .
Total bytes written: 19815997440 (18GB, 2.0MB/s)
.
sendsize: pid 25350 finish time Wed Feb 19 01:49:25 2003
==

So that took about 2 hours 40 minutes.  Would that require the etimeout to
be upped to ~9000?
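
The elapsed time can be worked out exactly from the start and finish stamps in the sendsize debug output above. A small Python sketch, assuming the stamps use the English ctime-style layout shown in the log; the helper name elapsed_seconds is invented for this example.

```python
from datetime import datetime

# Layout of the "start time" / "finish time" stamps in the log above,
# e.g. "Tue Feb 18 23:11:08 2003". %a and %b assume English day and
# month names (C locale).
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def elapsed_seconds(start: str, finish: str) -> int:
    """Seconds between two sendsize time stamps."""
    began = datetime.strptime(start, STAMP_FORMAT)
    ended = datetime.strptime(finish, STAMP_FORMAT)
    return int((ended - began).total_seconds())


seconds = elapsed_seconds("Tue Feb 18 23:11:08 2003", "Wed Feb 19 01:49:25 2003")
print(seconds, "seconds, about", round(seconds / 60), "minutes")
```

For this run the difference is 9497 seconds, roughly 2 hours 38 minutes.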

Currently the etimeout and the dtimeout are at the defaults, 300 and 1800.

You mentioned I should look into a newer tar version.  Currently I have
1.13.19.  I have not found a Solaris source for 1.13.25 as of yet.  Maybe
that would speed things up.  The directory does have more than 6000 files
in it.

Thanks for the feedback,
Shawn



Re: NAK: amandad busy

2003-02-19 Thread Dietmar Goldbeck
On Wed, Feb 19, 2003 at 12:58:04PM -0500, Shawn Sanders wrote:
 
 So that took about 2 hours 40 minutes.  Would that require the etimeout to
 be upped to ~9000?
 

The timeout gets multiplied by the number of filesystems and levels.
I would set it to an hour for the next run.

 Currently the etimeout and the dtimeout are at the defaults, 300 and 1800.
 

You should also increase dtimeout, since tar might take more than half an hour
before it starts writing output data. An hour might be a good start too.

 You mentioned I should look into a newer tar version.  Currently I have
 1.13.19.  I have not found
 a Solaris source for 1.13.25 as of yet.  Maybe that would speed things up.

IIRC the main speedup was between 1.12 and 1.13.
You can find the new one under http://alpha.gnu.org/gnu/tar/

   Ciao
  Dietmar

-- 
 Alles Gute / best wishes  
 Dietmar Goldbeck E-Mail: [EMAIL PROTECTED]
Reporter (to Mahatma Gandhi): Mr Gandhi, what do you think of Western
Civilization?  Gandhi: I think it would be a good idea.



NAK: amandad busy

2003-02-18 Thread Shawn Sanders
I am adding a new system to our amanda backup system.  The system has a
LARGE filesystem so I have to use tar to divide it up some.  amcheck runs
fine, but when amdump runs I get the following error in the email report:
Request to hostname timed out

The problematic client (Intel Solaris 8) gets this in the amandad.debug

Amanda 2.4 NAK HANDLE 000-D0A00708 SEQ 1045598401
ERROR amandad busy

The server is running Redhat 7.2.

The two directories I am backing up are 17/18GB.  The tar seems to be taking
several hours to complete.  Do I need to adjust a wait state, or perhaps I am
missing something else?

Any help would be greatly appreciated.

Shawn




Re: NAK: amandad busy

2003-02-11 Thread Joshua Baker-LePain
On Mon, 10 Feb 2003 at 5:06pm, justin m. clayton wrote

 I've been receiving this error on some, but not always all, of my hosts
 during amcheck. What could be causing this issue? Which logs are most
 likely to be housing the magic info I need to solve this?

If there's already an amandad running (i.e. one from a previous run that 
never died), you'll get this error.  Look in /tmp/amanda/amandad*debug.  
Also look at the output of 'ps' when you get the error.
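
Both checks suggested above (the amandad debug files and the 'ps' output) can be scripted. The following Python sketch is only an illustration: the program names are the ones mentioned in these threads, the sample 'ps' text is made up, and the matching is deliberately loose (any field whose basename is a known Amanda program counts).

```python
import glob
import os

# Client-side program names seen in these threads; extend for your install.
AMANDA_PROGRAMS = {"amandad", "sendsize", "selfcheck", "killpgrp"}


def find_amanda_processes(ps_output: str) -> list:
    """Return (pid, command) pairs for Amanda processes in `ps -ef` text."""
    found = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Rows look like: UID PID PPID C STIME TTY TIME CMD...
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # header line or something unparseable
        # STIME may be one field ("00:45:00") or two ("Feb 10"), so scan
        # the remaining fields instead of relying on a fixed column.
        command = next(
            (f for f in fields[4:] if os.path.basename(f) in AMANDA_PROGRAMS),
            None,
        )
        if command is not None:
            found.append((int(fields[1]), command))
    return found


def debug_files(directory: str = "/tmp/amanda") -> list:
    """amandad debug files in the given directory, oldest first."""
    paths = glob.glob(os.path.join(directory, "amandad*debug"))
    return sorted(paths, key=os.path.getmtime)


# Made-up sample output, for illustration only.
sample = """\
     UID   PID  PPID  C    STIME TTY      TIME CMD
  amanda 12346   168  0 00:45:00 ?        0:00 /usr/local/libexec/amandad
  amanda 12347 12346  0 00:45:02 ?        0:00 /usr/local/libexec/sendsize
    root   301     1  0   Feb 10 ?        0:03 /usr/sbin/inetd -s
"""
print(find_amanda_processes(sample))
print(debug_files())
```

Any amandad that shows up here between runs is a candidate for the stale process described above.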

You don't mention your OS, but I saw this problem with some of the recent 
RedHat errata kernels and amanda 2.4.2p2.  Upgrading to 2.4.3 solved it.

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University




Re: NAK: amandad busy

2003-02-11 Thread Mitch Collinsworth

On Tue, 11 Feb 2003, Joshua Baker-LePain wrote:

 On Mon, 10 Feb 2003 at 5:06pm, justin m. clayton wrote

  I've been receiving this error on some, but not always all, of my hosts
  during amcheck. What could be causing this issue? Which logs are most
  likely to be housing the magic info I need to solve this?

 If there's already an amandad running (i.e. one from a previous run that
 never died), you'll get this error.  Look in /tmp/amanda/amandad*debug.
 Also look at the output of 'ps' when you get the error.

The other way this can happen is if you have multiple names in your
disklist that refer to the same client machine.  Amanda will consider
them unique due to their names being different and try to contact
each name separately, but the amanda client is not multi-threaded and
will NAK the 2nd connection attempt if the 1st is still running.
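
A quick way to look for this in a disklist is to resolve every host name and see which ones land on the same address. The Python sketch below assumes simple "host disk dumptype" lines with # comments; the function names, sample disklist, dumptype name and addresses are all hypothetical. A multi-homed client reached through different addresses would not be caught by this check.

```python
import socket
from collections import defaultdict


def disklist_hosts(disklist_text: str) -> list:
    """Host names from simple 'host disk dumptype' disklist lines."""
    hosts = []
    for line in disklist_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        host = line.split()[0]
        if host not in hosts:
            hosts.append(host)
    return hosts


def aliased_clients(hosts, resolve=socket.gethostbyname) -> dict:
    """Map each address to the names that resolve to it, keeping only
    addresses that are reached under more than one name."""
    by_address = defaultdict(list)
    for host in hosts:
        try:
            by_address[resolve(host)].append(host)
        except OSError:
            continue  # unresolvable names are a different problem
    return {addr: names for addr, names in by_address.items() if len(names) > 1}


# Hypothetical disklist and name table, for illustration only.
sample = """\
# host           disk      dumptype
fileserver       /home     comp-user-tar
fs1.example.com  /export   comp-user-tar
webhost          /var/www  comp-user-tar
"""
fake_dns = {
    "fileserver": "192.0.2.10",
    "fs1.example.com": "192.0.2.10",
    "webhost": "192.0.2.20",
}
print(aliased_clients(disklist_hosts(sample), resolve=fake_dns.__getitem__))
```

Against a real disklist, leave out the resolve argument so the names go through the system resolver.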

This should be a FAQ if it isn't already.

-Mitch



NAK: amandad busy

2003-02-10 Thread justin m. clayton
I've been receiving this error on some, but not always all, of my hosts
during amcheck. What could be causing this issue? Which logs are most
likely to be housing the magic info I need to solve this?

Thanks,

Justin Clayton
VLSI Research System Administrator
University of Washington
Electrical Engineering Dept
[EMAIL PROTECTED]
206/543.2523  EE/CSE 307E




obsul40 /export/diskA1 lev 0 FAILED [obsul40 NAK: amandad busy]

2001-10-23 Thread BRINER Cedric

hi,

I have started having this problem on only one machine, which used to work
for a while without any problems.
I noticed that some daemons were still running on this machine (ps -ef |
grep am):

root 12346   168  0 00:45:00 ?        0:00 amandad
root 12347 12346  0 00:45:02 ?        0:00 /unige/amanda/libexec/sendsize
root 12348 12347  0 00:45:02 ?        0:00 /unige/amanda/libexec/killpgrp

So I gently killed them and relaunched a backup, but still the same
problem appears again.

Any ideas?

thanks in advance



Re: NAK: amandad busy from unfinishing selfchecks

2001-10-12 Thread Chris Marble

peanut butter wrote:
 
 Hi, I'm using version 2.4.2p2 of Amanda.  For a particular client on
 which I use tar with Amanda to back up a single directory, this entry
 has worked fine until I aborted an amdump several days ago (at least I
 think these are connected).  The next amdump (or one very soon
 afterward) would show a no estimate with an amstatus for this machine
 and finally a FAILED for the email report.  Amchecks started coming
 back with NAK:  amandad busy for this machine.  Investigating this, I
 noticed that two processes from the amanda user were running on the
 client, an amandad and (likely, in retro) a
 /usr/local/libexec/amanda/2.4.2p2/sendsize.  I killed these but an
 amcheck only started two new ones--amandad and
 /usr/local/libexec/amanda/2.4.2p2/selfcheck--which would continue to
 run until I would kill them.  One time I let them run over a
 night or two to see if they would ever finish or be cleaned up.  Alas,
 it would seem they would run forever if I let them.  If I kill them, an
 amcheck will start them again and subsequent amchecks will give me the
 dreaded NAK:  amandad busy message.

I've got a mix of 2.4.2 and 2.4.2p2 on Linux, Solaris, HP-UX and
IRIX systems.  I'm using variants of dump everywhere.  I've never had
a problem with amanda processes restarting once I got them killed.
Have you run an amcleanup on your server?  Is there anything left
around that you need to amflush?
-- 
  [EMAIL PROTECTED] - HMC UNIX Systems Manager



Re: NAK: amandad busy from unfinishing selfchecks

2001-10-12 Thread peanut butter

-In response to your message-
  --received from Chris Marble--
 
 I've got a mix of 2.4.2 and 2.4.2p2 on Linux, Solaris, HP-UX and
 IRIX systems.  I'm using variants of dump everywhere.  I've never had
 a problem with amanda processes restarting once I got them killed.
 Have you run an amcleanup on your server?  Is there anything left
 around that you need to amflush?
 -- 
   [EMAIL PROTECTED] - HMC UNIX Systems Manager


Hi, and thanks for the response.  It's not that the processes would
restart on their own but that, once started, they wouldn't ever
complete and go away (I would eventually have to kill them).

It now appears that the amdump I killed several days ago, though
any Amanda processes on the client were killed, somehow left the port
for that client in use.  The system messages were as follows:

Oct 11 10:01:01 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:11:18 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:21:40 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:31:59 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:36:39 mamacass inetd[26156]: /usr/local/libexec/amanda/2.4.2p2/amandad : Killed
Oct 11 10:42:17 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:52:38 mamacass inetd[26156]: fs/tcp: bind: Address already in use

The amandad being killed was a manual kill on my part but you can
see that it was trying to run on the same port that earlier (and
later) messages claimed was already in use.  The particular port
number was unexpected as I have Amanda set up to run on the usual port
numbers.
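
Whether a port is still held can be tested before resorting to a reboot by trying to bind to it. A minimal Python sketch follows; the helper name port_is_free is invented here, and 10080/udp is the conventional amandad service port, so check the services file on your own system. Run it as a user allowed to bind the port, otherwise a permission error also shows up as "not free".

```python
import socket


def port_is_free(port: int, kind: int = socket.SOCK_STREAM, host: str = "") -> bool:
    """True if a fresh socket can bind to the port, False if bind fails
    (for example with 'Address already in use')."""
    with socket.socket(socket.AF_INET, kind) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


# Conventional amandad port; adjust to match your own services entry.
print("10080/udp free:", port_is_free(10080, socket.SOCK_DGRAM))
```

If the port reports busy while no Amanda process is visible in 'ps', something else is holding it, which would fit the inetd messages above.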

One thing leading to another, to make a long story short, an
eventual reboot (making a bunch of people unhappy) ended up
resolving the problem.  Thanks, again.

Paul



-- 
Paul Yeatman   (858) 534-9896[EMAIL PROTECTED]
 ==
 ==Proudly brought to you by Mutt==
 ==



Re: NAK amandad busy

2001-05-08 Thread John R. Jackson

   I upgraded my backup server today from Amanda 2.4.1p1 to 2.4.2p2.
Now when I run amcheck daily it reports the following error on the
amanda client which is on the backup server: 
ERROR: berkeley NAK: amandad busy 

I looked through the E-mail archive and there were a couple of
possibilities.  One was that you have two configurations trying to run at
the same time to the same client.  The other was that the same physical
client was listed more than once in the disklist but by different names
so Amanda thinks they are different machines when in fact they are not.

If that doesn't help, please post the amandad*debug file from the client
showing the problem.

   Paolo

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]



NAK amandad busy

2001-05-06 Thread Paolo Supino


Hi 


   I upgraded my backup server today from Amanda 2.4.1p1 to 2.4.2p2.
Now when I run amcheck daily it reports the following error on the
amanda client which is on the backup server: 
ERROR: berkeley NAK: amandad busy 

 Running amcheck daily just prior to the upgrade didn't give any error.
What did the upgrade change that caused this problem? How do I fix it? 



TIA 


Paolo



NAK: amandad busy

2001-01-09 Thread Michael Lea

I'm getting this error from my nightly backups:

FAILURE AND STRANGE DUMP SUMMARY:
  tarsier/tarsier.1/www lev 0 FAILED [tarsier NAK: amandad busy]
  

The host is a Solaris machine with the Mbs/duplex hardset, plugged into
a switch that has them hardset.  This dir is the only dir we use gtar to
back up; it's 15+GB, and I don't think I've ever had a successful level 0
backup of it happen.  What am I doing wrong?

Thanks.

-- 
.michael lea   _/_/  "when in danger or in 
  Associate System Engineer   _/_/doubt, run in circles, 
[EMAIL PROTECTED]   _/_/ scream and shout"