Re: FW: selfcheck hangs

2003-06-19 Thread Paul Bijnens
Steven M. Wilson wrote:
That could very well be the problem I'm having since I just tried a df 
on the client system and it ground to a halt trying to locate NFS 
mounts.  We rely heavily on NFS here so I'll need to figure out how to 
get around this problem in the future.  Thanks for the info.



Mount your non-critical NFS filesystems with the mount options soft,intr
instead of hard,nointr (or possibly hard,intr) to avoid the blocking.
Yes, the man page says intr just causes a lot of trouble, but hard
mounts result in non-killable processes (unless you specify intr, in
which case you can kill the process, but it will not time out by itself).
And be very careful what you categorize as critical: a disk with software
packages, even one including amanda, is not critical in this sense; an
NFS-mounted root partition is critical.
Be careful to put your critical NFS shares on stable hardware and software
that does not need frequent reboots.
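
A minimal sketch of how one might audit a client for NFS mounts that can block in the way described above. It assumes mount data in the Linux /proc/mounts layout (device, mount point, type, options) and treats any NFS mount without `soft` as hard, which is the usual default. The function name `risky_nfs_mounts` and the sample entries are hypothetical, for illustration only; deciding which mounts are critical enough to stay hard is left to the administrator.

```python
def risky_nfs_mounts(mounts_text):
    """Return (mount_point, reason) pairs for NFS mounts that never time out.

    Expects text in the /proc/mounts format:
    device mount_point fstype options dump pass
    """
    findings = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fstype, options = fields[:4]
        if fstype not in ("nfs", "nfs4"):
            continue
        opts = set(options.split(","))
        if "soft" in opts:
            continue  # soft mounts eventually return an I/O error
        if "intr" in opts:
            findings.append((mount_point, "hard,intr: killable, but never times out"))
        else:
            findings.append((mount_point, "hard,nointr: processes cannot be killed"))
    return findings


# Hypothetical sample data, not taken from the systems in this thread.
SAMPLE = """\
/dev/hda3 / ext3 rw 0 0
server:/export/pkg /usr/local/pkg nfs rw,hard,nointr 0 0
server:/export/home /home nfs rw,hard,intr 0 0
server:/export/scratch /scratch nfs rw,soft,intr 0 0
"""

if __name__ == "__main__":
    for mount_point, reason in risky_nfs_mounts(SAMPLE):
        print(mount_point, "->", reason)
```

On a live client the same function could be fed the contents of /proc/mounts instead of the sample string.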

my 0.02 

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/  email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* kill -9 1,  Alt-F4,  Ctrl-Alt-Del,  AltGr-NumLock,  Stop-A,  ...*
* ...  Are you sure?  ...   YES   ...   Phew ...   I'm out  *
***



RE: FW: selfcheck hangs

2003-06-18 Thread Jeremy L. Mordkoff

No, the list was no help. The problem was that the client had NFS-mounted a disk that 
was no longer on the net, so anything that iterated over mounts (like df) was hanging. 
That is probably why a reboot solved it. I don't allow key machines to be NFS clients 
anymore.
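
A rough sketch of how one could test a mount point without risking the hang that df ran into. The idea is to run the filesystem call in a child process and give up after a timeout, so a dead NFS server blocks the child rather than the caller. The name `probe_path` and the default timeout are assumptions made for this example, not part of Amanda or of any tool mentioned in this thread.

```python
import subprocess
import sys


def probe_path(path, timeout=5.0):
    """Return True if statvfs(path) completes successfully within `timeout` seconds.

    Returns False if the call fails (for example, the path does not exist)
    or does not finish in time (for example, a stale NFS mount).
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.statvfs(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Ask the child to stop, but do not wait for it: on a hard,nointr
        # mount it may stay unkillable until the server comes back.
        child.kill()
        return False


if __name__ == "__main__":
    for mount_point in sys.argv[1:]:
        state = "ok" if probe_path(mount_point) else "not responding"
        print(mount_point, state)
```

Running it with a list of mount points as arguments reports which ones answer; a client-side check could skip the ones that do not.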

JLM



-----Original Message-----
From:    Steven M. Wilson [mailto:[EMAIL PROTECTED]]
Sent:    Wed 6/18/2003 2:32 PM
To:      Jeremy L. Mordkoff
Cc:
Subject: Re: FW: selfcheck hangs
Jeremy,

Did anyone respond off-list to your posting?  I have the same problem 
here from time to time and the only way I've been able to correct it is by 
rebooting the offending client system.

Steve

Jeremy L. Mordkoff wrote:

One system has started refusing to run backups. amcheck reports a timeout. A ps on 
the client shows several orphaned selfcheck processes. I tried killing all amandad 
processes and hitting xinetd with a SIGHUP, and then tried amcheck again, to no avail. 
I then reinstalled amanda and repeated. Still no luck. Here's the debug log. 

Any ideas would be appreciated.

JLM

-----Original Message-----
From:    root [mailto:[EMAIL PROTECTED]]
Sent:    Fri 6/13/2003 9:20 AM
To:      [EMAIL PROTECTED]
Cc:
Subject:
amandad: debug 1 pid 23823 ruid 527 euid 527: start at Fri Jun 13 09:16:52 2003
amandad: version 2.4.3
amandad: build: VERSION=Amanda-2.4.3
amandad:        BUILT_DATE=Fri Apr 4 10:37:17 EST 2003
amandad:        BUILT_MACH=Linux lux1 2.4.18-18.7.xsmp #1 SMP Wed Nov 13 19:01:42 EST 2002 i686 unknown
amandad:        CC=gcc
amandad:        CONFIGURE_COMMAND='./configure' '--with-user=amanda' '--with-group=disk'
amandad: paths: bindir=/usr/local/bin sbindir=/usr/local/sbin
amandad:        libexecdir=/usr/local/libexec mandir=/usr/local/man
amandad:        AMANDA_TMPDIR=/tmp/amanda AMANDA_DBGDIR=/tmp/amanda
amandad:        CONFIG_DIR=/usr/local/etc/amanda DEV_PREFIX=/dev/
amandad:        RDEV_PREFIX=/dev/ DUMP=/sbin/dump
amandad:        RESTORE=/sbin/restore SAMBA_CLIENT=/usr/bin/smbclient
amandad:        GNUTAR=/bin/gtar COMPRESS_PATH=/bin/gzip
amandad:        UNCOMPRESS_PATH=/bin/gzip MAILER=/usr/bin/Mail
amandad:        listed_incr_dir=/usr/local/var/amanda/gnutar-lists
amandad: defs:  DEFAULT_SERVER=lux1 DEFAULT_CONFIG=DailySet1
amandad:        DEFAULT_TAPE_SERVER=lux1 DEFAULT_TAPE_DEVICE=/dev/null
amandad:        HAVE_MMAP HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID
amandad:        DEBUG_CODE AMANDA_DEBUG_DAYS=4 BSD_SECURITY USE_AMANDAHOSTS
amandad:        CLIENT_LOGIN=amanda FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=.gz COMPRESS_FAST_OPT=--fast
amandad:        COMPRESS_BEST_OPT=--best UNCOMPRESS_OPT=-dc
amandad: time 0.000: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 0.000: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212


amandad: time 0.001: bsd security: remote host lux1 user amanda local user amanda
amandad: time 0.001: amandahosts security check passed
amandad: time 0.001: running service /usr/local/libexec/selfcheck
amandad: time 30.526: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 31.146: received dup P_REQ packet, ACKing it
amandad: time 31.146: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212


amandad: time 61.141: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 61.141: received dup P_REQ packet, ACKing it
amandad: time 61.141: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212


-- 
Steven M. Wilson, Systems and Network Manager
Markey Center for Structural Biology
Purdue University
[EMAIL PROTECTED]    765.496.1946



Re: FW: selfcheck hangs

2003-06-18 Thread Steven M. Wilson




Jeremy,

That could very well be the problem I'm having since I just tried a df on
the client system and it ground to a halt trying to locate NFS mounts. We
rely heavily on NFS here so I'll need to figure out how to get around this
problem in the future. Thanks for the info.

Steve

Jeremy L. Mordkoff wrote:

No, the list was no help. The problem was that the client had NFS-mounted a disk that was no longer on the net, so anything that iterated over mounts (like df) was hanging. That is probably why a reboot solved it. I don't allow key machines to be NFS clients anymore.

JLM



-----Original Message-----
From:	Steven M. Wilson [mailto:[EMAIL PROTECTED]]
Sent:	Wed 6/18/2003 2:32 PM
To:	Jeremy L. Mordkoff
Cc:	
Subject:	Re: FW: selfcheck hangs
Jeremy,

Did anyone respond off-list to your posting?  I have the same problem 
here from time to time and the only way I've been able to correct it is by 
rebooting the offending client system.

Steve

Jeremy L. Mordkoff wrote:

One system has started refusing to run backups. amcheck reports a timeout. A ps on the client shows several orphaned selfcheck processes. I tried killing all amandad processes and hitting xinetd with a SIGHUP, and then tried amcheck again, to no avail. I then reinstalled amanda and repeated. Still no luck. Here's the debug log. 

Any ideas would be appreciated.

JLM

-----Original Message-----
From:	root [mailto:[EMAIL PROTECTED]]
Sent:	Fri 6/13/2003 9:20 AM
To:	[EMAIL PROTECTED]
Cc:	
Subject:	
amandad: debug 1 pid 23823 ruid 527 euid 527: start at Fri Jun 13 09:16:52 2003
amandad: version 2.4.3
amandad: build: VERSION="Amanda-2.4.3"
amandad:        BUILT_DATE="Fri Apr 4 10:37:17 EST 2003"
amandad:        BUILT_MACH="Linux lux1 2.4.18-18.7.xsmp #1 SMP Wed Nov 13 19:01:42 EST 2002 i686 unknown"
amandad:        CC="gcc"
amandad:        CONFIGURE_COMMAND="'./configure' '--with-user=amanda' '--with-group=disk'"
amandad: paths: bindir="/usr/local/bin" sbindir="/usr/local/sbin"
amandad:        libexecdir="/usr/local/libexec" mandir="/usr/local/man"
amandad:        AMANDA_TMPDIR="/tmp/amanda" AMANDA_DBGDIR="/tmp/amanda"
amandad:        CONFIG_DIR="/usr/local/etc/amanda" DEV_PREFIX="/dev/"
amandad:        RDEV_PREFIX="/dev/" DUMP="/sbin/dump"
amandad:        RESTORE="/sbin/restore" SAMBA_CLIENT="/usr/bin/smbclient"
amandad:        GNUTAR="/bin/gtar" COMPRESS_PATH="/bin/gzip"
amandad:        UNCOMPRESS_PATH="/bin/gzip" MAILER="/usr/bin/Mail"
amandad:        listed_incr_dir="/usr/local/var/amanda/gnutar-lists"
amandad: defs:  DEFAULT_SERVER="lux1" DEFAULT_CONFIG="DailySet1"
amandad:        DEFAULT_TAPE_SERVER="lux1" DEFAULT_TAPE_DEVICE="/dev/null"
amandad:        HAVE_MMAP HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID
amandad:        DEBUG_CODE AMANDA_DEBUG_DAYS=4 BSD_SECURITY USE_AMANDAHOSTS
amandad:        CLIENT_LOGIN="amanda" FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
amandad:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
amandad: time 0.000: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 0.000: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212


amandad: time 0.001: bsd security: remote host lux1 user amanda local user amanda
amandad: time 0.001: amandahosts security check passed
amandad: time 0.001: running service "/usr/local/libexec/selfcheck"
amandad: time 30.526: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 31.146: received dup P_REQ packet, ACKing it
amandad: time 31.146: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212


amandad: time 61.141: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 61.141: received dup P_REQ packet, ACKing it
amandad: time 61.141: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212

-- 
Steven M. Wilson, Systems and Network Manager
Markey Center for Structural Biology
Purdue University
[EMAIL PROTECTED]    765.496.1946



FW: selfcheck hangs

2003-06-13 Thread Jeremy L. Mordkoff

One system has started refusing to run backups. amcheck reports a timeout. A ps on the 
client shows several orphaned selfcheck processes. I tried killing all amandad processes 
and hitting xinetd with a SIGHUP, and then tried amcheck again, to no avail. I then 
reinstalled amanda and repeated. Still no luck. Here's the debug log. 

Any ideas would be appreciated.

JLM

-----Original Message-----
From:    root [mailto:[EMAIL PROTECTED]]
Sent:    Fri 6/13/2003 9:20 AM
To:      [EMAIL PROTECTED]
Cc:
Subject:
amandad: debug 1 pid 23823 ruid 527 euid 527: start at Fri Jun 13 09:16:52 2003
amandad: version 2.4.3
amandad: build: VERSION=Amanda-2.4.3
amandad:        BUILT_DATE=Fri Apr 4 10:37:17 EST 2003
amandad:        BUILT_MACH=Linux lux1 2.4.18-18.7.xsmp #1 SMP Wed Nov 13 19:01:42 EST 2002 i686 unknown
amandad:        CC=gcc
amandad:        CONFIGURE_COMMAND='./configure' '--with-user=amanda' '--with-group=disk'
amandad: paths: bindir=/usr/local/bin sbindir=/usr/local/sbin
amandad:        libexecdir=/usr/local/libexec mandir=/usr/local/man
amandad:        AMANDA_TMPDIR=/tmp/amanda AMANDA_DBGDIR=/tmp/amanda
amandad:        CONFIG_DIR=/usr/local/etc/amanda DEV_PREFIX=/dev/
amandad:        RDEV_PREFIX=/dev/ DUMP=/sbin/dump
amandad:        RESTORE=/sbin/restore SAMBA_CLIENT=/usr/bin/smbclient
amandad:        GNUTAR=/bin/gtar COMPRESS_PATH=/bin/gzip
amandad:        UNCOMPRESS_PATH=/bin/gzip MAILER=/usr/bin/Mail
amandad:        listed_incr_dir=/usr/local/var/amanda/gnutar-lists
amandad: defs:  DEFAULT_SERVER=lux1 DEFAULT_CONFIG=DailySet1
amandad:        DEFAULT_TAPE_SERVER=lux1 DEFAULT_TAPE_DEVICE=/dev/null
amandad:        HAVE_MMAP HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID
amandad:        DEBUG_CODE AMANDA_DEBUG_DAYS=4 BSD_SECURITY USE_AMANDAHOSTS
amandad:        CLIENT_LOGIN=amanda FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=.gz COMPRESS_FAST_OPT=--fast
amandad:        COMPRESS_BEST_OPT=--best UNCOMPRESS_OPT=-dc
amandad: time 0.000: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 0.000: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212


amandad: time 0.001: bsd security: remote host lux1 user amanda local user amanda
amandad: time 0.001: amandahosts security check passed
amandad: time 0.001: running service /usr/local/libexec/selfcheck
amandad: time 30.526: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 31.146: received dup P_REQ packet, ACKing it
amandad: time 31.146: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212


amandad: time 61.141: got packet:

Amanda 2.4 REQ HANDLE 000-58790808 SEQ 1055510212
SECURITY USER amanda
SERVICE selfcheck
OPTIONS features=feff9f00;maxdumps=1;hostname=rel2;
DUMP hda3  0 OPTIONS |;auth=bsd;compress-fast;
DUMP vg01/lv_data  0 OPTIONS |;auth=bsd;compress-fast;


amandad: time 61.141: received dup P_REQ packet, ACKing it
amandad: time 61.141: sending ack:

Amanda 2.4 ACK HANDLE 000-58790808 SEQ 1055510212
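

In the log above, amandad starts selfcheck at time 0.001, and the same REQ then shows up again around 30 and 61 seconds, each time acknowledged as a duplicate, with no reply from selfcheck logged in the excerpt. A small sketch of how that pattern could be pulled out of a debug file follows. The regular expressions are written against the lines shown in this log only and are an assumption about the format; other Amanda versions may word these lines differently. The function name `summarize_debug_log` is hypothetical.

```python
import re

# Matches e.g. "amandad: time 31.146: received dup P_REQ packet, ACKing it"
DUP_RE = re.compile(r"^amandad: time (\d+\.\d+): received dup P_REQ packet")
# Matches e.g. "amandad: time 0.001: running service /usr/local/libexec/selfcheck"
SERVICE_RE = re.compile(r"^amandad: time (\d+\.\d+): running service (\S+)")


def summarize_debug_log(text):
    """Return (service, dup_times): the service started and when duplicate requests arrived."""
    service = None
    dup_times = []
    for line in text.splitlines():
        match = SERVICE_RE.match(line)
        if match:
            service = match.group(2).strip('"')  # some copies quote the path
            continue
        match = DUP_RE.match(line)
        if match:
            dup_times.append(float(match.group(1)))
    return service, dup_times


# Lines copied from the debug log in this message.
LOG_EXCERPT = """\
amandad: time 0.001: running service /usr/local/libexec/selfcheck
amandad: time 30.526: got packet:
amandad: time 31.146: received dup P_REQ packet, ACKing it
amandad: time 61.141: got packet:
amandad: time 61.141: received dup P_REQ packet, ACKing it
"""

if __name__ == "__main__":
    service, dup_times = summarize_debug_log(LOG_EXCERPT)
    print("service:", service)
    print("duplicate requests at:", dup_times)
```

A service that was started but only ever followed by duplicate requests suggests the service itself is stuck, which matches the orphaned selfcheck processes seen in ps.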