Hello,

I'm having quite severe problems with NFS on Linux/Alpha. We're running
a cluster of 5 Alphas: one is the master node, the other 4 are slaves. All
slaves mount their root fs via NFS from the master.

Sometimes the slaves just stop functioning; they apparently lose their root
filesystem. This happens at random intervals ranging from a few hours
to a few weeks. The remedy is to run the command

   /usr/sbin/exportfs -r

on the master node. The slaves function normally afterwards.

While the slaves can't access their NFS-mounted volumes, I get
tons of messages on the master, reading like this:

   Mar 27 06:07:05 cluster kernel: nfsd Security: /// bad export.

Not all exported volumes are inaccessible, only the volume that is the
root fs for the slaves.
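
As a stopgap, the manual remedy could be automated on the master. The
following is only a sketch: it assumes kernel messages are written to
/var/log/messages (the log path is an assumption), and the function names
count_bad_exports and needs_reexport are made up for illustration.

```shell
#!/bin/sh
# Sketch of a workaround: detect new "bad export" messages on the master.
# LOG and both function names are assumptions made for illustration.
LOG="${LOG:-/var/log/messages}"

# Print the number of log lines that look like the nfsd "bad export" message.
# Prints nothing if the file cannot be read.
count_bad_exports() {
    grep -c 'nfsd Security: .* bad export' "$1" 2>/dev/null || true
}

# Succeed (exit status 0) when the log holds more matching lines than
# were seen at the previous check.
#   $1 = log file, $2 = count seen at the previous check
needs_reexport() {
    current=$(count_bad_exports "$1")
    [ "${current:-0}" -gt "$2" ]
}

# Hypothetical use from a periodic job, with $last holding the previous count:
#   needs_reexport "$LOG" "$last" && /usr/sbin/exportfs -r
```

This only hides the symptom; it does not explain why the export becomes
invalid in the first place.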

My setup:

master:
  AlphaPC164 533MHz with 256MB RAM
  Adaptec 2940UW + 2 SCSI HDs (the same problems occurred with
    a QLogic ISP1040B SCSI card)
  2xTulip NIC (DEC DC21140)
  RedHat 6.1 with modular kernel 2.2.14

slaves:
  AlphaPC164 533MHz with 128MB RAM
  IDE HD (kernel image, swap and /var only)
  Tulip NIC (DEC DC21140)
  RedHat 6.1 with modular kernel 2.2.13


A side note: from time to time I get the following message on the console
and in the log files:

Feb 22 03:55:23 slave1 kernel: PYXIS machine check: vector=0x670 pc=0xfffffc000040ffc0 code=0x98
Feb 22 03:55:23 slave1 kernel: machine check type: processor detected hard error

I get those messages every few days on all machines (but not at the same
moment). Should I worry about them?

My question is: are the NFS problems due to a kernel problem, a
hardware problem, or something else?

Regards,
  Metod

Metod Kozelj

mailto:[EMAIL PROTECTED]            /\  Ne posiljajte mi smeti ker grizem!
http://www.rzs-hm.si/                   /  \  Don't spam me for I bite!
_______________________________________/    \__________________________________

---- perl -e 'print $i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);'
