Package: kernel-image-2.6.8-2-686
Severity: normal

Hi,
halting my cluster (~160 machines) usually leaves about 5-6 of them hung.
The hang happens at the bottom of rc0.d/S31umountnfs, when it issues the
umount command. I modified the script to produce some more output:

/etc/rc0.d/S31umountnfs:
[...]
        exec </dev/null
        if [ -n "$DIRS" ]
        then
                fuser -mv $DIRS                 # added by me
                echo umount $FLAGS $DIRS        # added by me
                umount $FLAGS $DIRS
        fi
) </etc/mtab
[...]

On one occasion, I also got a "kernel BUG at fs/nfs/inode.c:151!";
some screenshots are at http://tac.ki.iif.hu/kernelbug.

It's not a hard lockup: magic SysRq can reboot the machines, and they
even emit IP traffic, as the above URL shows (that is from a silently
hung machine, not the one the screenshots were taken from).

The machines are pure Sarge, NFS rooted, also mounting /home (rw) and
/usr/local (ro) over NFS. If you need any further detail, don't
hesitate to ask.

The problem seems to happen only when lots of clients are trying to
halt simultaneously on the same LAN. Also, their name in the RPC calls
is (none), as it gets set only after the root is mounted. I'm changing
this and will report whether that makes any difference.

Thanks,
Feri.

-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (50, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.12-1-k7
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)