Re: nfs4 client hangs

2012-08-24 Thread Orion Poplawski

On 08/22/2012 02:07 AM, Barbara Krasovec wrote:

> Hello!
>
> Nfs4 client randomly hangs on a file or directory (for instance when
> performing df or find) and after a few seconds continues. When running
> strace, it hangs on the getdents system call.
>
> Errors on the server side:
> Aug 19 21:29:08 host kernel: nfs: server host2 not responding, still trying
> Aug 19 21:31:17 host kernel: nfs: server host2 OK
>
> After a while the load on both client and server rises sharply, then the
> system on the client side crashes. Reboot required.
>
> We use SL6.3; the same problem occurred on the following kernels:
> 2.6.32-220.17.1.el6.x86_64
> 2.6.32-279.2.1.el6.x86_64
>
> and also on:
>
> 3.5.2-1.el6.elrepo.x86_64
> 3.3.2-1.el6.elrepo.x86_64
>
> The only difference is that with the 3.x kernels from elrepo the system
> never crashes; the client just hangs.
>
> Mount options:
> Flags:
> rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,noresvport,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=IP,local_lock=none,addr=IP
>
> Nfs utils version:
> nfs-utils-1.2.3-15.el6_2.1.x86_64
>
> Any ideas?
> Thanks,
> Barbara


This doesn't look quite the same, but we have had problems with hung tasks
since updating to 6.3 on a couple of machines.  I filed a bug here:


https://bugzilla.redhat.com/show_bug.cgi?id=851673

--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder Office  FAX: 303-415-9702
3380 Mitchell Lane   or...@nwra.com
Boulder, CO 80301   http://www.nwra.com


Re: nfs4 client hangs

2012-08-23 Thread Barbara Krasovec

On 08/23/2012 01:44 PM, Klaus Steinberger wrote:

> Hi,
>
> > Nfs4 client randomly hangs on a file or directory (for instance when
> > performing df or find) and after a few seconds continues. When
> > running strace, it hangs on the getdents system call.
> >
> > Errors on the server side:
> > Aug 19 21:29:08 host kernel: nfs: server host2 not responding, still trying
> > Aug 19 21:31:17 host kernel: nfs: server host2 OK
> >
> > After a while the load on both client and server rises sharply,
> > then the system on the client side crashes. Reboot required.
> > Mount options:
> > Flags:
> > rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,noresvport,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=IP,local_lock=none,addr=IP
>
> SL5 or SL6 on Server side?
>
> We had many hangs with SL5 as server but with kerberos auth, never tried NFS4
> with sec=sys.
>
> Sincerely,
> Klaus

We have SL6 as the server and both SL6 and SL5 clients. The NFS hangs occur
on the SL6 clients (at least I haven't noticed any on the SL5 clients).

Soft mount isn't an option because of the nature of the service running
on the machine.

Maybe I could try automount to see whether it makes any difference.

Thanks,
Barbara
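
To compare setups (for example a static mount against automount), it may help to put numbers on the stalls instead of watching df or find by eye. The following is only a minimal sketch in Python: it times repeated directory listings, on the assumption that listing a directory exercises the same read path that strace shows stuck in getdents. The function names, the one-second threshold, and the number of rounds are all made up for illustration, and a listing served from the client's cache may not show a stall at all.

```python
import os
import time


def time_listing(path):
    """Count the entries in `path` and return (count, elapsed_seconds)."""
    start = time.monotonic()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return count, time.monotonic() - start


def watch(path, rounds=5, pause=1.0, slow=1.0):
    """List `path` repeatedly; return (round, seconds) for each slow listing.

    `slow` is an arbitrary threshold in seconds, not a value from the thread.
    """
    slow_runs = []
    for i in range(rounds):
        _, elapsed = time_listing(path)
        if elapsed >= slow:
            slow_runs.append((i, elapsed))
        time.sleep(pause)
    return slow_runs
```

Calling `watch()` on a directory under the NFS mount before and after a configuration change gives a rough basis for comparison; an empty list means no listing crossed the threshold during the run.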


nfs4 client hangs

2012-08-22 Thread Barbara Krasovec

Hello!

Nfs4 client randomly hangs on a file or directory (for instance when
performing df or find) and after a few seconds continues. When
running strace, it hangs on the getdents system call.


Errors on the server side:
Aug 19 21:29:08 host kernel: nfs: server host2 not responding, still trying
Aug 19 21:31:17 host kernel: nfs: server host2 OK

After a while the load on both client and server rises sharply,
then the system on the client side crashes. Reboot required.


We use SL6.3; the same problem occurred on the following kernels:
2.6.32-220.17.1.el6.x86_64
2.6.32-279.2.1.el6.x86_64

and also on:

3.5.2-1.el6.elrepo.x86_64
3.3.2-1.el6.elrepo.x86_64

The only difference is that with the 3.x kernels from elrepo the system
never crashes; the client just hangs.


Mount options:
Flags: 
rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,noresvport,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=IP,local_lock=none,addr=IP


Nfs utils version:
nfs-utils-1.2.3-15.el6_2.1.x86_64

Any ideas?
Thanks,
Barbara
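
The two kernel messages quoted in this report can be turned into an outage length, which is useful when comparing many such pairs across a log. This is a rough Python sketch, not a general syslog parser: it assumes the exact message wording shown above, and because syslog timestamps carry no year it assumes 2012 (taken from the date of this thread). The function and variable names are illustrative.

```python
import re
from datetime import datetime

# Log lines copied from the report above.
LOG = """\
Aug 19 21:29:08 host kernel: nfs: server host2 not responding, still trying
Aug 19 21:31:17 host kernel: nfs: server host2 OK
"""

# Assumption: the log has no year, so 2012 is taken from the thread date.
ASSUMED_YEAR = 2012

LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: "
    r"nfs: server (?P<server>\S+) (?P<state>not responding|OK)"
)


def outages(log_text, year=ASSUMED_YEAR):
    """Return (server, start, end, seconds) for each not-responding/OK pair."""
    pending = {}   # server name -> time of first "not responding" message
    result = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        server = m["server"]
        if m["state"] == "not responding":
            pending.setdefault(server, when)
        elif server in pending:
            start = pending.pop(server)
            result.append((server, start, when, (when - start).total_seconds()))
    return result


for server, start, end, secs in outages(LOG):
    print(f"{server}: unresponsive for {secs:.0f} s "
          f"({start:%H:%M:%S} to {end:%H:%M:%S})")
```

For the pair quoted above this reports 129 seconds between the "not responding" and "OK" messages.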


Re: nfs4 client hangs

2012-08-22 Thread zxq9

On 08/22/2012 05:07 PM, Barbara Krasovec wrote:

> Nfs4 client randomly hangs on a file or directory (for instance when
> performing df or find) and after a few seconds continues. When
> running strace, it hangs on the getdents system call.
>
> Errors on the server side:
> Aug 19 21:29:08 host kernel: nfs: server host2 not responding, still trying
> Aug 19 21:31:17 host kernel: nfs: server host2 OK
>
> After a while the load on both client and server rises sharply,
> then the system on the client side crashes. Reboot required.
>
> Mount options:
> Flags:
> rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,noresvport,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=IP,local_lock=none,addr=IP


I haven't had this problem with several systems mounting /home, but 
sometime before 2.6.32-220 we switched to using automount instead of fstab.


Have you tried playing with the options to see if one of them is 
poisonous (hard, retrans, local_lock, etc.)? Do you have the same 
problem using automount?
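
One way to try the options systematically is to change a single option per trial, so that any difference can be pinned on that option. The Python sketch below only builds option strings from the flags quoted in this thread; it does not run mount. The trial values are hypothetical and chosen purely for illustration, and the quoted string is what the kernel reports, so not every entry is necessarily something you would pass on the command line. The hard option is left alone because a soft mount was ruled out earlier in the thread. Per nfs(5), timeo is given in tenths of a second, so timeo=600 corresponds to 60 seconds.

```python
# Flags exactly as quoted in the thread (IP is a placeholder in the original).
FLAGS = ("rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,"
         "noresvport,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,"
         "clientaddr=IP,local_lock=none,addr=IP")

# Hypothetical trial values, one per option; none come from the thread.
TRIALS = {
    "rsize": "32768",
    "wsize": "32768",
    "timeo": "150",
    "retrans": "5",
}


def parse_flags(flags):
    """Split a comma-separated option string into a dict, keeping order."""
    opts = {}
    for item in flags.split(","):
        key, sep, value = item.partition("=")
        opts[key] = value if sep else True   # bare flags such as "hard"
    return opts


def format_flags(opts):
    """Inverse of parse_flags."""
    return ",".join(k if v is True else f"{k}={v}" for k, v in opts.items())


def timeo_seconds(opts):
    """timeo is expressed in deciseconds (nfs(5)); convert to seconds."""
    return int(opts["timeo"]) / 10


def variants(flags, trials=TRIALS):
    """Yield (changed_option, option_string) with one option changed each time."""
    base = parse_flags(flags)
    for key, value in trials.items():
        changed = dict(base)
        changed[key] = value
        yield key, format_flags(changed)


for name, option_string in variants(FLAGS):
    print(f"{name}: {option_string}")
```

Each printed line differs from the original flags in one option only, which keeps the trials comparable.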