This kind of issue appeared for me with Ubuntu 23.10 on the server, which
mostly uses an HDD for bulk storage and has a not exactly powerful CPU that
is also kept busy by WireGuard securing the NFS connection.

I mention the performance details because I have a feeling they matter. A
similarly low-performance client connecting over 1 Gb/s caused this problem
only very occasionally; with a 10 Gb/s connection, however, the issue
appeared significantly more often. A higher-performance setup using a
2.5 Gb/s connection triggered this bug within a couple of days of setup.
The lockup always seems to occur under heavy NFS usage, suspiciously mostly
when both reading and writing are going on. At least I don't recall it
happening with reading only, but I'm not confident in stating that it didn't
happen with a write-only load.

I found this bug report by searching for the client error message; the
server-side message differs because of the different version:
```
[300146.046666] INFO: task nfsd:1426 blocked for more than 241 seconds.
[300146.046732]       Not tainted 6.5.0-27-generic #28~22.04.1-Ubuntu
[300146.046770] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[300146.046813] task:nfsd            state:D stack:0     pid:1426  ppid:2      flags:0x00004000
[300146.046827] Call Trace:
[300146.046832]  <TASK>
[300146.046839]  __schedule+0x2cb/0x750
[300146.046860]  schedule+0x63/0x110
[300146.046870]  schedule_timeout+0x157/0x170
[300146.046881]  wait_for_completion+0x88/0x150
[300146.046894]  __flush_workqueue+0x140/0x3e0
[300146.046908]  nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
[300146.047074]  nfsd4_destroy_session+0x193/0x260 [nfsd]
[300146.047219]  nfsd4_proc_compound+0x3b7/0x770 [nfsd]
[300146.047365]  nfsd_dispatch+0xbf/0x1d0 [nfsd]
[300146.047497]  svc_process_common+0x420/0x6e0 [sunrpc]
[300146.047695]  ? __pfx_read_tsc+0x10/0x10
[300146.047706]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[300146.047848]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[300146.047977]  svc_process+0x132/0x1b0 [sunrpc]
[300146.048157]  nfsd+0xdc/0x1c0 [nfsd]
[300146.048287]  kthread+0xf2/0x120
[300146.048299]  ? __pfx_kthread+0x10/0x10
[300146.048310]  ret_from_fork+0x47/0x70
[300146.048321]  ? __pfx_kthread+0x10/0x10
[300146.048331]  ret_from_fork_asm+0x1b/0x30
[300146.048341]  </TASK>
```
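
For checking a saved kernel log for the same symptom, a minimal sketch along
these lines may help. It relies only on the hung-task line format visible in
the trace above; the helper name `find_hung_tasks` and the `sample` string are
made up for illustration, and the script does not talk to the kernel or to
nfsd in any way.

```python
import re

# Pattern for the hung-task report line, based solely on the format shown
# in the trace above ("INFO: task NAME:PID blocked for more than N seconds.").
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text, task_name="nfsd"):
    """Return (pid, seconds) pairs for hung-task reports of task_name.

    Hypothetical helper: it only scans text that was already collected,
    for example the saved output of dmesg.
    """
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match and match.group("name") == task_name:
            hits.append((int(match.group("pid")), int(match.group("secs"))))
    return hits


# Sample line copied from the trace above.
sample = "[300146.046666] INFO: task nfsd:1426 blocked for more than 241 seconds."
print(find_hung_tasks(sample))
```

Passing a different `task_name` would look for other blocked tasks in the same
log text.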

This seems to match, but the lockups I experienced previously may have been
somewhat different.
I mostly remember the client complaining about the server not responding
rather than showing the message presented here, and the server call trace
used to have btrfs in it, which made me suspect the problem may be exclusive
to btrfs. However, the issue was always with NFS; nothing else locked up
despite some other sources of heavy I/O.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2062568

Title:
  nfsd gets unresponsive after some hours of operation
