Can anyone help lead us to a solution?
 
 
> >Subject: NFS problem on [machine]
> >
> >> On the new cluster [x], we just found that on a few
> >> occasions the master node was not responding [to] NFS requests.
> >> Attached are the lines matching 'nfs' from all the 'messages'
> >> files.
> >>
> >> This is a serious problem for us.  When the NFS server
> >> stops responding, many running jobs are restarted from
> >> scratch.  Is this a problem with the NFS configuration,
> >> OSCAR, or the hardware?  What can we do to make NFS
> >> stable?  Please advise.
> >>
> >> Thanks,

[MORE INFORMATION]
 
> We noticed this when our big jobs (which need 3 days) were
> restarted. Judging from the logs, it has been happening more and
> more frequently. Any suggestions for identifying the source of
> the problem?
>
>
> [EMAIL PROTECTED] log]# grep nfs messages | grep not | wc -l
>      339
> [EMAIL PROTECTED] log]# grep nfs messages.1 | grep not | wc -l
>      381
> [EMAIL PROTECTED] log]# grep nfs messages.2 | grep not | wc -l
>        9
> [EMAIL PROTECTED] log]# grep nfs messages.3 | grep not | wc -l
>        0
> [EMAIL PROTECTED] log]# grep nfs messages.4 | grep not | wc -l
>        0
> [EMAIL PROTECTED] log]# ls -l messages*
> -rw-------    1 root     root       130926 Dec 29 15:45 messages
> -rw-------    1 root     root       509784 Dec 28 04:02 messages.1
> -rw-------    1 root     root       416508 Dec 21 04:02 messages.2
> -rw-------    1 root     root       586158 Dec 14 04:02 messages.3
> -rw-------    1 root     root       413372 Dec  7 04:02 messages.4
[ ... more messages ... ]
messages:Dec 28 04:05:06 node7.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:05:30 node3.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:05:46 node2.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:05:57 node2.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:06:09 node3.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:06:34 node7.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:06:58 node7.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:07:48 node3.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:08:20 node2.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:09:01 node2.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:09:29 node7.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:09:53 node3.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:10:10 node3.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:10:38 node2.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:10:47 node3.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:11:06 node2.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:11:25 node3.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:11:35 node2.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:11:42 node3.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:11:47 node2.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:11:51 node3.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:12:08 node2.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:12:35 node2.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:13:14 node3.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:14:27 node3.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:15:36 node7.metis kernel: nfs: server nfs_oscar not responding, still trying
messages:Dec 28 04:16:30 node3.metis kernel: nfs: server nfs_oscar OK
messages:Dec 28 04:17:27 node7.metis kernel: nfs: server nfs_oscar OK
[snip]
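
One way to start narrowing this down might be to break the "not responding" counts out by hour and by client node, rather than only by log file. The excerpt above begins at 04:05 on Dec 28, a few minutes after the 04:02 timestamp on messages.1, and a per-hour count would show whether the timeouts cluster around particular times or are spread evenly. The following is only a sketch: the function name nfs_timeouts_by_hour is made up for illustration, and it assumes the syslog line layout shown in the excerpt (month, day, time, host), with or without the "messages:" prefix that grep adds.

```shell
# Count NFS "not responding" events per date, hour, and client node.
# Reads the named files, or standard input if no files are given.
# Assumes syslog lines shaped like the excerpt above:
#   [file:]Mon DD HH:MM:SS host kernel: nfs: server NAME not responding, ...
nfs_timeouts_by_hour() {
    grep 'not responding' "$@" |
    awk '{
        split($3, t, ":")        # t[1] is the hour of the event
        sub(/^.*:/, "", $1)      # drop a leading "file:" prefix, keep the month
        count[$1 " " $2 " " t[1] ":00 " $4]++
    }
    END { for (k in count) print count[k], k }' |
    # Order by month name, day, hour, host. Month names sort alphabetically,
    # not chronologically, which is good enough within a single month.
    sort -k2,2 -k3,3n -k4,4 -k5,5
}
```

Run from the log directory it could be called as `nfs_timeouts_by_hour messages*`, or the already-grepped lines can be piped into it. Each output line is a count followed by the date, the hour, and the node that logged the timeouts.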
