Re: NAK: amandad busy
On Tue, Feb 18, 2003 at 04:24:24PM -0500, Shawn Sanders wrote:

> I am adding a new system to our amanda backup system. The system has a LARGE filesystem so I have to use tar to divide it up some.
>
> amcheck runs fine, but when amdump runs I get the following error in the email report:
>
>     Request to hostname timed out
>
> The problematic client (Intel Solaris 8) gets this in the amandad.debug:
>
>     Amanda 2.4 NAK HANDLE 000-D0A00708 SEQ 1045598401 ERROR amandad busy
>
> The server is running Redhat 7.2. The two directories I am backing up are 17/18GB. The tar seems to be taking several hours to complete. Do I need to adjust a wait state or perhaps I am missing something else.

Please check the sendsize debug files and find out how long estimates are taking. If you are using an older version of GNUtar, estimates are extremely slow. I recommend upgrading to 1.13.25.

I would increase etimeout and dtimeout. From man amanda:

etimeout int
    Default: 300 seconds. Amount of time per disk on a given client that the planner step of amdump will wait to get the dump size estimates. For instance, with the default of 300 seconds and four disks on client A, planner will wait up to 20 minutes for that machine. A negative value will be interpreted as a total amount of time, instead of a per-disk value.

dtimeout int
    Default: 1800 seconds. Amount of idle time per disk on a given client that a dumper running from within amdump will wait before it fails with a data timeout error.

Ciao
Dietmar
-- 
Alles Gute / best wishes
Dietmar Goldbeck E-Mail: [EMAIL PROTECTED]
Reporter (to Mahatma Gandhi): Mr Gandhi, what do you think of Western Civilization?
Gandhi: I think it would be a good idea.
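To make the etimeout arithmetic from the man page excerpt concrete, here is a minimal Python sketch. It is not part of Amanda; the function name `planner_wait_seconds` is made up for illustration, and it models only the two rules quoted above (per-disk when positive, total when negative).

```python
def planner_wait_seconds(etimeout: int, disks_on_client: int) -> int:
    """Upper bound on how long planner waits for one client's estimates.

    Models only the man page text quoted above:
    - a positive etimeout is a per-disk value, so it scales with the
      number of disks listed for that client;
    - a negative etimeout is taken as a total for the whole client.
    """
    if etimeout < 0:
        return -etimeout
    return etimeout * disks_on_client


# The man page example: default etimeout, four disks on client A.
print(planner_wait_seconds(300, 4) // 60)  # 20 (minutes)
```

With a per-disk value the budget grows as disks are added to a client; a negative value pins the total regardless of how many disks the client has.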
RE: NAK: amandad busy
Thanks for the direction. The sendsize.debug shows it starting at:

sendsize: debug 1 pid 25350 ruid 1026 euid 1026 start time Tue Feb 18 23:11:08 2003
/usr/local/libexec/sendsize: version 2.4.2p1
calculating for amname '/directory', dirname '/directory'
sendsize: getting size via gnutar for /directory level 0
sendsize: missing exclude list file /usr/local/lib/amanda/exclude.gtar discarded
sendsize: running /usr/local/libexec/runtar --create --directory /cmroot/base/rcs --listed-incremental /usr/local/var/amanda/gnutar-lists/pitbullcm-lab.tcs-sec.com_cmroot_base_rcs_0.new --sparse --one-file-system --ignore-failed-read --totals --file /dev/null
sendsize: spawning /usr/local/libexec/runtar in pipeline
sendsize: argument list: /usr/local/bin/tar --create --directory /directory --listed-incremental /usr/local/var/amanda/gnutar-lists/hostname-lab.domainname_directory_0.new --sparse --one-file-system --ignore-failed-read --totals --file /dev/null .
Total bytes written: 19815997440 (18GB, 2.0MB/s)
.
sendsize: pid 25350 finish time Wed Feb 19 01:49:25 2003

So that took about 2 hours 40 minutes. Would that require the etimeout to be upped to ~9000? Currently the etimeout and the dtimeout are at the defaults, 300 and 1800.

You mentioned I should look into a newer tar version. Currently I have 1.13.19. I have not found a Solaris source for 1.13.25 as of yet. Maybe that would speed things up. The directory does have about 6000+ files in it.

Thanks for the feedback,
Shawn

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Dietmar Goldbeck
Sent: Wednesday, February 19, 2003 3:43 AM
To: Shawn Sanders
Cc: [EMAIL PROTECTED]
Subject: Re: NAK: amandad busy
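The estimate duration can be read off the start and finish stamps in the sendsize debug log from Shawn's message. A minimal Python sketch follows, assuming the stamp layout seen in that log ("start time" / "finish time" followed by a ctime-style date); the function name `estimate_seconds` is hypothetical and the regular expression is only as general as that one sample.

```python
import re
from datetime import datetime

# Matches e.g. "start time Tue Feb 18 23:11:08 2003" in a sendsize debug file.
STAMP = re.compile(
    r"(start|finish) time (\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4})"
)


def estimate_seconds(debug_text: str) -> int:
    """Return seconds between the start and finish stamps of one log."""
    times = {}
    for kind, stamp in STAMP.findall(debug_text):
        times[kind] = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return int((times["finish"] - times["start"]).total_seconds())


sample = (
    "sendsize: debug 1 pid 25350 ruid 1026 euid 1026 "
    "start time Tue Feb 18 23:11:08 2003\n"
    "Total bytes written: 19815997440 (18GB, 2.0MB/s)\n"
    "sendsize: pid 25350 finish time Wed Feb 19 01:49:25 2003\n"
)
print(estimate_seconds(sample))  # 9497
```

That is 2 hours 38 minutes 17 seconds for this one level 0 estimate, which matches the "about 2 hours 40 minutes" above and is far beyond the default etimeout of 300 seconds.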
Re: NAK: amandad busy
On Wed, Feb 19, 2003 at 12:58:04PM -0500, Shawn Sanders wrote:

> So that took about 2 hours 40 minutes. Would that require the etimeout to be upped to ~9000?

The timeout gets multiplied by the number of filesystems and levels. I would set it to an hour for the next run.

> Currently the etimeout and the dtimeout are at the defaults, 300 and 1800.

You should also increase dtimeout, since tar might take more than half an hour before it starts writing output data. An hour might be a good start too.

> You mentioned I should look into a newer tar version. Currently I have 1.13.19. I have not found a Solaris source for 1.13.25 as of yet. Maybe that would speed things up.

IIRC the main speedup was between 1.12 and 1.13. You can find the new one under http://alpha.gnu.org/gnu/tar/

Ciao
Dietmar
-- 
Alles Gute / best wishes
Dietmar Goldbeck E-Mail: [EMAIL PROTECTED]
NAK: amandad busy
I am adding a new system to our amanda backup system. The system has a LARGE filesystem so I have to use tar to divide it up some.

amcheck runs fine, but when amdump runs I get the following error in the email report:

    Request to hostname timed out

The problematic client (Intel Solaris 8) gets this in the amandad.debug:

    Amanda 2.4 NAK HANDLE 000-D0A00708 SEQ 1045598401 ERROR amandad busy

The server is running Redhat 7.2. The two directories I am backing up are 17/18GB. The tar seems to be taking several hours to complete. Do I need to adjust a wait state or perhaps I am missing something else.

Any help would be greatly appreciated.

Shawn
Re: NAK: amandad busy
On Mon, 10 Feb 2003 at 5:06pm, justin m. clayton wrote

> I've been receiving this error on some, but not always all, of my hosts during amcheck. What could be causing this issue? Which logs are most likely to be housing the magic info I need to solve this?

If there's already an amandad running (i.e. one from a previous run that never died), you'll get this error. Look in /tmp/amanda/amandad*debug. Also look at the output of 'ps' when you get the error.

You don't mention your OS, but I saw this problem with some of the recent RedHat errata kernels and amanda 2.4.2p2. Upgrading to 2.4.3 solved it.

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University
Re: NAK: amandad busy
On Tue, 11 Feb 2003, Joshua Baker-LePain wrote:

> On Mon, 10 Feb 2003 at 5:06pm, justin m. clayton wrote
>
> > I've been receiving this error on some, but not always all, of my hosts during amcheck. What could be causing this issue? Which logs are most likely to be housing the magic info I need to solve this?
>
> If there's already an amandad running (i.e. one from a previous run that never died), you'll get this error. Look in /tmp/amanda/amandad*debug. Also look at the output of 'ps' when you get the error.

The other way this can happen is if you have multiple names in your disklist that refer to the same client machine. Amanda will consider them unique due to their names being different and try to contact each name separately, but the amanda client is not multi-threaded and will NAK the 2nd connection attempt if the 1st is still running.

This should be a FAQ if it isn't already.

-Mitch
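One way to check for the situation Mitch describes is to resolve every host name in the disklist and group the names by address. The Python sketch below is only an illustration: it assumes simple one-line disklist entries whose first field is the host name, the function names and the sample host names, addresses and dumptypes are all hypothetical, and name resolution is injectable so the logic can be tried without touching DNS.

```python
import socket
from collections import defaultdict


def disklist_hosts(disklist_text):
    """Return the distinct host names (first field) of a disklist, in order."""
    hosts = []
    for line in disklist_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        host = line.split()[0]
        if host not in hosts:
            hosts.append(host)
    return hosts


def find_aliases(hosts, resolve=socket.gethostbyname):
    """Group host names by resolved address; keep groups with 2+ names."""
    by_addr = defaultdict(list)
    for host in hosts:
        try:
            by_addr[resolve(host)].append(host)
        except OSError:
            continue  # unresolvable name: a different problem, skip it here
    return {addr: names for addr, names in by_addr.items() if len(names) > 1}


# Hypothetical disklist and a fake resolver standing in for DNS.
sample_disklist = """\
# hostname            diskname        dumptype
berkeley              /export/home    comp-user
berkeley.example.com  /var            comp-user
tarsier               /tarsier.1/www  comp-user-tar
"""
fake_dns = {
    "berkeley": "10.0.0.5",
    "berkeley.example.com": "10.0.0.5",
    "tarsier": "10.0.0.9",
}
print(find_aliases(disklist_hosts(sample_disklist), resolve=fake_dns.__getitem__))
```

Any group this reports is a candidate for the "amandad busy" NAK, since the server would contact the same client once per name. A client with several network interfaces would still slip past an address comparison like this one.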
NAK: amandad busy
I've been receiving this error on some, but not always all, of my hosts during amcheck. What could be causing this issue? Which logs are most likely to be housing the magic info I need to solve this?

Thanks,

Justin Clayton
VLSI Research System Administrator
University of Washington Electrical Engineering Dept
[EMAIL PROTECTED] 206/543.2523 EE/CSE 307E
obsul40 /export/diskA1 lev 0 FAILED [obsul40 NAK: amandad busy]
Hi,

I started having this problem on only one machine, which had been working for a while without any problems. I noticed that some daemons were still running on this machine:

ps -ef | grep am
    root 12346   168  0 00:45:00 ?  0:00 amandad
    root 12347 12346  0 00:45:02 ?  0:00 /unige/amanda/libexec/sendsize
    root 12348 12347  0 00:45:02 ?  0:00 /unige/amanda/libexec/killpgrp

So I gently killed them and relaunched a backup, but the same problem still appears. Any idea?

Thanks in advance
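Spotting leftover client processes like the ones in this ps listing can be scripted. Below is a minimal Python sketch that scans `ps -ef` style text for the program names mentioned in these threads (amandad, sendsize, selfcheck, killpgrp). The function name is hypothetical, the column layout is assumed to be the usual UID PID PPID ... form, and the sample is the listing from the message above.

```python
import os

# Client-side program names that appear in these threads.
AMANDA_PROGRAMS = {"amandad", "sendsize", "selfcheck", "killpgrp"}


def leftover_amanda_processes(ps_output):
    """Return (pid, ppid, program) for each Amanda client process listed."""
    found = []
    for line in ps_output.splitlines():
        tokens = line.split()
        # Skip the header and anything that does not look like a ps -ef row.
        if len(tokens) < 5 or not (tokens[1].isdigit() and tokens[2].isdigit()):
            continue
        # Scan the remaining tokens so a two-word start time does not matter.
        for token in tokens[4:]:
            name = os.path.basename(token)
            if name in AMANDA_PROGRAMS:
                found.append((int(tokens[1]), int(tokens[2]), name))
                break
    return found


sample_ps = """\
root 12346   168  0 00:45:00 ?  0:00 amandad
root 12347 12346  0 00:45:02 ?  0:00 /unige/amanda/libexec/sendsize
root 12348 12347  0 00:45:02 ?  0:00 /unige/amanda/libexec/killpgrp
"""
for pid, ppid, program in leftover_amanda_processes(sample_ps):
    print(pid, ppid, program)
```

The parent PIDs show the chain amandad -> sendsize -> killpgrp, so all three belong to one estimate request that never finished. The sketch only reports processes; whether to kill them is left to the operator.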
Re: NAK: amandad busy from unfinishing selfchecks
peanut butter wrote:

> Hi, I'm using version 2.4.2p2 of Amanda. For a particular client on which I use tar with Amanda to back up a single directory, this entry has worked fine until I aborted an amdump several days ago (at least I think these are connected). The next amdump (or one very soon afterward) would show a no estimate with an amstatus for this machine and finally a FAILED for the email report. Amchecks started coming back with NAK: amandad busy for this machine.
>
> Investigating this, I noticed that two processes from the amanda user were running on the client, an amandad and (likely, in retrospect) a /usr/local/libexec/amanda/2.4.2p2/sendsize. I killed these but an amcheck only started two new ones--amandad and /usr/local/libexec/amanda/2.4.2p2/selfcheck--which would continue to run until I would kill them. One time I let them run over a night or two to see if they would ever finish or be cleaned up. Alas, it would seem they would run forever if I let them. If I kill them, an amcheck will start them again and subsequent amchecks will give me the dreaded NAK: amandad busy message.

I've got a mix of 2.4.2 and 2.4.2p2 on Linux, Solaris, HP-UX and IRIX systems. I'm using variants of dump everywhere. I've never had a problem with amanda processes restarting once I got them killed.

Have you run an amcleanup on your server? Is there anything left around that you need to amflush?

-- 
[EMAIL PROTECTED] - HMC UNIX Systems Manager
Re: NAK: amandad busy from unfinishing selfchecks
-In response to your message-
--received from Chris Marble--

> I've got a mix of 2.4.2 and 2.4.2p2 on Linux, Solaris, HP-UX and IRIX systems. I'm using varients of dump everywhere. I've never had a problem with amanda processes restarting once I got them killed.
>
> Have you run an amcleanup on your server? Is there anything left around that you need to amflush?
>
> -- 
> [EMAIL PROTECTED] - HMC UNIX Systems Manager

Hi, and thanks for the response. It's not that the processes would restart on their own but that, once started, they wouldn't ever complete and go away (I would eventually have to kill them).

It now appears that the amdump I killed several days ago, though any Amanda processes on the client were killed, somehow left the port for that client used. System messages were as such:

Oct 11 10:01:01 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:11:18 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:21:40 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:31:59 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:36:39 mamacass inetd[26156]: /usr/local/libexec/amanda/2.4.2p2/amandad: Killed
Oct 11 10:42:17 mamacass inetd[26156]: fs/tcp: bind: Address already in use
Oct 11 10:52:38 mamacass inetd[26156]: fs/tcp: bind: Address already in use

The amandad being killed was a manual kill on my part but you can see that it was trying to run in the same port that earlier (and later) messages claimed was already in use. The particular port number was unexpected as I have Amanda set up to run on the usual port numbers. One thing leading to another, to make a long story short, an eventual reboot (making a bunch of people unhappy) ended up resolving the problem.

Thanks, again.

Paul

-- 
Paul Yeatman (858) 534-9896 [EMAIL PROTECTED]
==Proudly brought to you by Mutt==
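The "bind: Address already in use" messages mean some socket was still holding the port inetd wanted. A rough way to test a given port before resorting to a reboot is to try binding it yourself. The Python sketch below is an illustration under assumptions, not an Amanda tool: the function name `port_in_use` is made up, and which port to probe depends on the local services and inetd configuration.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0", kind=socket.SOCK_STREAM):
    """Return True if binding host:port fails with EADDRINUSE.

    kind is socket.SOCK_STREAM for TCP or socket.SOCK_DGRAM for UDP.
    Other errors (for example lacking permission for a low port) are
    re-raised rather than being reported as "in use".
    """
    with socket.socket(socket.AF_INET, kind) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
    return False
```

For example, `port_in_use(10080, kind=socket.SOCK_DGRAM)` probes the UDP port conventionally registered for the amanda service. While inetd is running normally it holds that port itself, so a True result is only suspicious when inetd is the one complaining, as in the log above. Once a port is known to be busy, a tool such as lsof or netstat can show which process owns it.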
Re: NAK amandad busy
> I upgraded my backup server tpday from Amanda 2.4.1p1 to 2.4.2p2. Now when I run amcheck daily it reports the following error on the amanda client which is on the backup server:
>
>     ERROR: berkeley NAK: amandad busy

I looked through the E-mail archive and there were a couple of possibilities. One was that you have two configurations trying to run at the same time to the same client. The other was that the same physical client was listed more than once in the disklist but by different names so Amanda thinks they are different machines when in fact they are not.

If that doesn't help, please post the amandad*debug file from the client showing the problem.

> Paolo

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
NAK amandad busy
Hi

I upgraded my backup server tpday from Amanda 2.4.1p1 to 2.4.2p2. Now when I run amcheck daily it reports the following error on the amanda client which is on the backup server:

    ERROR: berkeley NAK: amandad busy

Running amcheck daily just prior to the upgrade didn't give any error. What did the upgrade change that caused this problem? How do I fix it?

TIA
Paolo
NAK: amandad busy
I'm getting this error from my nightly backups:

FAILURE AND STRANGE DUMP SUMMARY:
  tarsier/tarsier.1/www lev 0 FAILED [tarsier NAK: amandad busy]

The host is a solaris machine with the Mbs/duplex hardset, plugged into a switch that has them hardset. This dir is the only dir we use gtar to back up, it's 15+GB, and I don't think I've ever had a successful level 0 backup of it happen.

What am I doing wrong?

Thanks.

-- 
.michael lea                _/_/  "when in danger or in
Associate System Engineer   _/_/   doubt, run in circles,
[EMAIL PROTECTED]           _/_/   scream and shout"