[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-10-07 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: nfs-utils (Ubuntu Noble)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2062568

Title:
  nfsd gets unresponsive after some hours of operation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2062568/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-10-04 Thread Kleber Sacilotto de Souza
** Also affects: nfs-utils (Ubuntu Noble)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Noble)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Noble)
   Status: New => In Progress

** Changed in: linux (Ubuntu Noble)
 Assignee: (unassigned) => Mehmet Basaran (mehmetbasaran)

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-30 Thread Peter Schubert
We installed the unofficial kernel 6.8.0-46-generic-nfs on several NFS
client servers on Saturday and have been testing it under high I/O load
since then. Unfortunately the server crashed again after about 40 hours
with "rcu: INFO: rcu_sched self-detected stall on CPU".
The 6.8.0-46-generic-nfs kernel prevents the error message "RPC: Could
not send backchannel reply error: -110", but not the crashes, which we
have been struggling with since August 19th, when we switched the kernel
from 6.5.0-44-generic to 6.8.0-40-generic.

Our experience with the NFS server crashes:
- We were able to reproduce the crashes in our production and test
  environments: rarely after minutes, sometimes after hours or days, and
  sometimes not at all, as we often stopped the experiments after 12 to
  24 hours.
- We have not yet been able to reproduce a crash between a bare-metal NFS
  server and a bare-metal NFS client, but we have between a bare-metal
  NFS server and a virtualized client.
- We could not reproduce a crash with NFS vers=4.0.
- The crashes happen with or without GSSPROXY.

Our setup:
- virtualized NFS 4.2 server with Ubuntu 22.04.5 LTS and kernel 5.15.0-122-generic
- virtualized NFS client with Ubuntu 22.04.5 LTS and kernel 6.8.0-40-generic or kernel 6.8.0-45-generic
- /etc/exports: /mnt/home nfsclient(sec=krb5,rw,root_squash,sync,no_subtree_check)
- /etc/fstab: nfsserver:/mnt/home /home nfs vers=4.2,rw,soft,sec=krb5,proto=tcp 0 0
- apt info nfs-common: Version: 1:2.6.1-1ubuntu1.2
syslog of NFS server after crash:
Sep 30 01:15:51 nfs-server.domain.de kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Sep 30 01:15:51 nfs-server.domain.de kernel: rcu: 54-: (14998 ticks this GP) idle=2db/1/0x4000 softirq=32173387/32173387 fqs=7449
Sep 30 01:15:51 nfs-server.domain.de kernel: (t=15000 jiffies g=144775177 q=49782)
Sep 30 01:15:51 nfs-server.domain.de kernel: NMI backtrace for cpu 54
Sep 30 01:15:51 nfs-server.domain.de kernel: CPU: 54 PID: 153154 Comm: kworker/u480:36 Not tainted 5.15.0-122-generic #132-Ubuntu
Sep 30 01:15:51 nfs-server.domain.de kernel: Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 12/17/2019
Sep 30 01:15:51 nfs-server.domain.de kernel: Workqueue: rpciod rpc_async_schedule [sunrpc]
Sep 30 01:15:51 nfs-server.domain.de kernel: Call Trace:
Sep 30 01:15:51 nfs-server.domain.de kernel:  
Sep 30 01:15:51 nfs-server.domain.de kernel:  show_stack+0x52/0x5c
Sep 30 01:15:51 nfs-server.domain.de kernel:  dump_stack_lvl+0x4a/0x63
Sep 30 01:15:51 nfs-server.domain.de kernel:  dump_stack+0x10/0x16
Sep 30 01:15:51 nfs-server.domain.de kernel:  nmi_cpu_backtrace.cold+0x4d/0x93
Sep 30 01:15:51 nfs-server.domain.de kernel:  ? lapic_can_unplug_cpu+0x90/0x90
Sep 30 01:15:51 nfs-server.domain.de kernel:  nmi_trigger_cpumask_backtrace+0xec/0x100
Sep 30 01:15:51 nfs-server.domain.de kernel:  arch_trigger_cpumask_backtrace+0x19/0x20
Sep 30 01:15:51 nfs-server.domain.de kernel:  trigger_single_cpu_backtrace+0x44/0x4f
Sep 30 01:15:51 nfs-server.domain.de kernel:  rcu_dump_cpu_stacks+0x102/0x149
Sep 30 01:15:51 nfs-server.domain.de kernel:  print_cpu_stall.cold+0x2f/0xe2
Sep 30 01:15:51 nfs-server.domain.de kernel:  check_cpu_stall+0x1d8/0x270
Sep 30 01:15:51 nfs-server.domain.de kernel:  rcu_sched_clock_irq+0x9a/0x250
Sep 30 01:15:51 nfs-server.domain.de kernel:  update_process_times+0x94/0xd0
Sep 30 01:15:51 nfs-server.domain.de kernel:  tick_sched_handle+0x29/0x70
Sep 30 01:15:51 nfs-server.domain.de kernel:  tick_sched_timer+0x6f/0x90
Sep 30 01:15:51 nfs-server.domain.de kernel:  ? tick_sched_do_timer+0xa0/0xa0
Sep 30 01:15:51 nfs-server.domain.de kernel:  __hrtimer_run_queues+0x104/0x230
Sep 30 01:15:51 nfs-server.domain.de kernel:  ? read_hv_clock_tsc_cs+0x9/0x30
Sep 30 01:15:51 nfs-server.domain.de kernel:  hrtimer_interrupt+0x101/0x220
Sep 30 01:15:51 nfs-server.domain.de kernel:  hv_stimer0_isr+0x1d/0x30
Sep 30 01:15:51 nfs-server.domain.de kernel:  __sysvec_hyperv_stimer0+0x2f/0x70
Sep 30 01:15:51 nfs-server.domain.de kernel:  sysvec_hyperv_stimer0+0x7b/0x90
Sep 30 01:15:51 nfs-server.domain.de kernel:  
Sep 30 01:15:51 nfs-server.domain.de kernel:  
Sep 30 01:15:51 nfs-server.domain.de kernel:  asm_sysvec_hyperv_stimer0+0x1b/0x20
Sep 30 01:15:51 nfs-server.domain.de kernel: RIP: 0010:read_hv_clock_tsc+0x1b/0x60
Sep 30 01:15:51 nfs-server.domain.de kernel: Code: eb bc 66 66 2e 0f 1f 84 00 
00 00 00 00 66 90 8b 35 2a 89 97 02 85 f6 74 38 4c 8b 05 27 89 97 02 48 8b 3d 
28 89 97 02 0f 01 f9 <66> 90 8b 0d 0d 89 97 02 39 ce 75 d9 48 c1 e2 20 48 09 d0 
49 f7 e0
Sep 30 01:15:51 nfs-server.domain.de kernel: RSP: 0018:ada44ab33dc8 EFLAGS: 0202
Sep 30 01:15:51 nfs-server.domain.de kernel: RAX: 5d52dc50 RBX: 0002197146e8f7ec RCX: 0036
Sep 30 01:15:51 nfs-server.domain.de kernel: RDX: 000571f0 RSI: 0002 RDI: 000a
Sep 30 01:15:51 nfs-server.domain.de kernel: RBP: 

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-29 Thread Mehmet Basaran
If this patch proves to be a fix, we plan to release it in the next
update. In that case, the update will install the official kernel (which
will also include this patch) and change the GRUB settings to boot this
kernel automatically.

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-27 Thread Kleber Sacilotto de Souza
** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu)
   Status: Confirmed => In Progress

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-26 Thread Robert Williams
Performed two upgrades from 22.04 yesterday, both have locked up
overnight with the below. It feels like this is the same issue - can
anyone confirm for me?

kernel: INFO: task nfsd:2029 blocked for more than 122 seconds.
kernel:   Tainted: G   OE  6.8.0-45-generic #45-Ubuntu
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: task:nfsd state:D stack:0 pid:2029 tgid:2029 ppid:2 flags:0x4000
kernel: Call Trace:
kernel:  
kernel:  __schedule+0x27c/0x6b0
kernel:  ? __smp_call_single_queue+0xe0/0x180
kernel:  schedule+0x33/0x110
kernel:  schedule_timeout+0x157/0x170
kernel:  wait_for_completion+0x88/0x150
kernel:  __flush_workqueue+0x140/0x3e0
kernel:  ? nfsd4_run_cb+0x30/0x70 [nfsd]
kernel:  nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
kernel:  nfsd4_destroy_session+0x186/0x260 [nfsd]
kernel:  nfsd4_proc_compound+0x3b7/0x780 [nfsd]
kernel:  nfsd_dispatch+0xd7/0x220 [nfsd]
kernel:  svc_process_common+0x450/0x710 [sunrpc]
kernel:  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
kernel:  svc_process+0x132/0x1b0 [sunrpc]
kernel:  svc_handle_xprt+0x4d3/0x5d0 [sunrpc]
kernel:  svc_recv+0x18b/0x2e0 [sunrpc]
kernel:  ? __pfx_nfsd+0x10/0x10 [nfsd]
kernel:  nfsd+0x8b/0xe0 [nfsd]
kernel:  kthread+0xf2/0x120
kernel:  ? __pfx_kthread+0x10/0x10
kernel:  ret_from_fork+0x47/0x70
kernel:  ? __pfx_kthread+0x10/0x10
kernel:  ret_from_fork_asm+0x1b/0x30
kernel:  


The most annoying element is that nothing seems to allow recovery
without a reload, unless someone knows some trick for getting it back up?

These host VMs and the NFS share is purely for some bulk data backups. I
will shift them to the kernel mentioned above later today. If this
proves a fix, how soon may it roll out? I've got a large number of hosts
to move to 24.04 and will be holding off until this is fixed as it's
quite a showstopper.

Cheers!

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-25 Thread Mehmet Basaran
Hi all,

I am from Canonical's kernel team and am currently investigating this
issue. jammy-hwe, mantic-hwe, and noble all use the 6.8 kernel by
default (a generic jammy or mantic install uses the hwe version by
default). So the issue is with the 6.8 kernel rather than with a
particular series.

I was not able to reproduce the error with the generic 6.8.0-45.45
kernel after 1 hour of stress testing. I am still working on this. I
really appreciate all the feedback you have provided.


Meanwhile, for those who are having the problem, I have created an
unofficial version of the 6.8.0-45.45 kernel which includes the upstream
fix from "6ddc9deacc1312762c2edd9de00ce76b00f69f7c":
 - for jammy:
   https://launchpad.net/~mehmetbasaran/+archive/ubuntu/linux-hwe-6.8-6.8.0-45.45-nfs-patch
 - for noble:
   https://launchpad.net/~mehmetbasaran/+archive/ubuntu/linux-6.8.0-45.45-nfs-patch


Installation instructions:

Note that, if you are using secure boot, you will not be able to boot
into these kernels. You will need to disable it first.

# Add the unofficial ppa. Pick the correct one depending on your series
# For jammy: sudo add-apt-repository ppa:mehmetbasaran/linux-hwe-6.8-6.8.0-45.45-nfs-patch
# For noble: sudo add-apt-repository ppa:mehmetbasaran/linux-6.8.0-45.45-nfs-patch

$ sudo add-apt-repository ppa:mehmetbasaran/linux-6.8.0-45.45-nfs-patch
$ sudo apt update

$ sudo apt install linux-buildinfo-6.8.0-46-generic-nfs \
  linux-cloud-tools-6.8.0-46-generic-nfs \
  linux-cloud-tools-common \
  linux-headers-6.8.0-46-generic-nfs \
  linux-image-unsigned-6.8.0-46-generic-nfs \
  linux-modules-6.8.0-46-generic-nfs \
  linux-modules-extra-6.8.0-46-generic-nfs \
  linux-modules-ipu6-6.8.0-46-generic-nfs \
  linux-modules-iwlwifi-6.8.0-46-generic-nfs \
  linux-modules-usbio-6.8.0-46-generic-nfs \
  linux-nfs-6.8-cloud-tools-6.8.0-46 \
  linux-nfs-6.8-headers-6.8.0-46 \
  linux-nfs-6.8-tools-6.8.0-46 \
  linux-tools-6.8.0-46-generic-nfs

Next time you boot, you will be using the patched 6.8.0-45.45
$ uname -r
# 6.8.0-46-generic-nfs

To return to the previous kernel (official 6.8.0-45.45), you just need to
update grub:
# Print the kernels available on your machine, in my case:
$ grep 'menuentry \|submenu ' /boot/grub/grub.cfg | cut -f2 -d "'"
  Ubuntu
  Advanced options for Ubuntu
  Ubuntu, with Linux 6.8.0-46-generic-nfs
  Ubuntu, with Linux 6.8.0-46-generic-nfs (recovery mode)
  Ubuntu, with Linux 6.8.0-45-generic
  Ubuntu, with Linux 6.8.0-45-generic (recovery mode)
  Ubuntu, with Linux 6.5.0-18-generic
  Ubuntu, with Linux 6.5.0-18-generic (recovery mode)


# Change GRUB_DEFAULT in /etc/default/grub
# from GRUB_DEFAULT=0
# to GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 6.8.0-45-generic"

$ sudo update-grub
$ reboot

$ uname -r
# 6.8.0-45-generic


After switching back to the previous kernel version, you can completely
remove the unofficial kernel:
  
# Now these packages will be safe to be removed
$ sudo apt remove linux-buildinfo-6.8.0-46-generic-nfs \
  linux-cloud-tools-6.8.0-46-generic-nfs \
  linux-headers-6.8.0-46-generic-nfs \
  linux-image-unsigned-6.8.0-46-generic-nfs \
  linux-modules-6.8.0-46-generic-nfs \
  linux-modules-extra-6.8.0-46-generic-nfs \
  linux-modules-ipu6-6.8.0-46-generic-nfs \
  linux-modules-iwlwifi-6.8.0-46-generic-nfs \
  linux-modules-usbio-6.8.0-46-generic-nfs \
  linux-nfs-6.8-cloud-tools-6.8.0-46 \
  linux-nfs-6.8-headers-6.8.0-46 \
  linux-nfs-6.8-tools-6.8.0-46 \
  linux-tools-6.8.0-46-generic-nfs

# Remove unofficial ppa from update list
$ sudo add-apt-repository --remove ppa:mehmetbasaran/linux-6.8.0-45.45-nfs-patch

# Restore grub settings
# Change GRUB_DEFAULT in /etc/default/grub to GRUB_DEFAULT=0
$ sudo update-grub

$ reboot
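
After each reboot it can help to confirm which kernel the host actually booted. The following is only a minimal sketch: the helper name `is_patched_kernel` is made up for illustration, and it assumes the only distinguishing feature is the `-generic-nfs` suffix carried by the unofficial packages listed above.

```bash
# Hypothetical helper: succeeds when the given kernel release string ends in
# "-generic-nfs", the suffix used by the unofficial test kernel.
is_patched_kernel() {
    case "$1" in
        *-generic-nfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   if is_patched_kernel "$(uname -r)"; then
#       echo "running the patched test kernel"
#   else
#       echo "running an official kernel"
#   fi
```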

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-25 Thread Mehmet Basaran
Additionally, for those who prefer to migrate to a newer kernel, you can
try our mainline builds here: https://kernel.ubuntu.com/mainline/

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-20 Thread Mehmet Basaran
** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Mehmet Basaran (mehmetbasaran)

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-19 Thread DrewM
For those who can't switch to NFSv3 and don't want to run the .0 release
6.11.0 kernel from Oriole, there are 6.10 kernels from xanmod for Ubuntu
https://xanmod.org/#apt_repository

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-18 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: linux (Ubuntu)
   Status: New => Confirmed

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-18 Thread gbonazz...@bonaz.it
I also have this issue on Ubuntu 22.04.5 LTS with linux kernel tagged as

6.8.0-40-generic #40~22.04.3-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 30
17:30:19 UTC 2

The NFSv3 workaround has serious consequences for all the clients that
refused to downgrade the protocol.

My only option, until there is a patch, is to downgrade to
vmlinuz-6.5.0-44-generic.

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-18 Thread Andreas Hasenack
Is anybody in a position to try out the kernel from the upcoming Ubuntu
24.10 release? A beta of Ubuntu Oracular 24.10 will be out this week.

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-18 Thread Andreas Hasenack
Thanks all for your input. I'll add a kernel task to this bug, but keep
the userspace one open for now.

** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-09-18 Thread DrewM
I also have this issue and can't go more than about 8 hours without this
breaking. This was not an issue in 20.04.

Currently attempting the NFSv3 workaround

There are mentions that this bug is fixed in kernel 6.9.8
https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-08-31 Thread Stefan
Confirmed on 24.04.1 and previously on 23.10 (both server and client),
also using large files (1-100GB) and 10G networking to large/fast disk
arrays, which others have suggested to be a key factor. All mountpoints
are running BTRFS (in some cases a brand new filesystem) without any
LUKS.

My observations with throughput also match, e.g.:
- Host B as client connects to host A's NFS server with high traffic;
  this fails after ~12 hours, requiring server A to reboot to recover.
- Host A as client connects to host B's NFS server with low traffic, and
  the mount has not failed, even though neither server has been rebooted
  for days.

I have amended both my nfs.conf and fstab on all devices to force NFSv3
only as a workaround until there's a more permanent fix, or we migrate
to Debian.

 
 __schedule+0x27c/0x6b0
 ? __smp_call_single_queue+0xfd/0x180
 schedule+0x33/0x110
 schedule_timeout+0x157/0x170
 wait_for_completion+0x88/0x150
 __flush_workqueue+0x140/0x3e0
 ? nfsd4_run_cb+0x30/0x70 [nfsd]
 nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
 nfsd4_destroy_session+0x186/0x260 [nfsd]
 nfsd4_proc_compound+0x3b7/0x780 [nfsd]
 nfsd_dispatch+0xd7/0x220 [nfsd]
 svc_process_common+0x450/0x710 [sunrpc]
 ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
 svc_process+0x132/0x1b0 [sunrpc]
 svc_handle_xprt+0x4d3/0x5d0 [sunrpc]
 svc_recv+0x18b/0x2e0 [sunrpc]
 ? __pfx_nfsd+0x10/0x10 [nfsd]
 nfsd+0x8b/0xe0 [nfsd]
 kthread+0xf2/0x120
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x47/0x70
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-08-25 Thread Olle Liljenzin
I get the same on 22.04 now that 6.8 has been rolled out as the HWE kernel.

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-07-31 Thread Derek Harter
This is a me too. We encountered the same problem, with the exact same
message of a tainted and hung nfsd task. Like yuhldr (yuh), we
investigated and the problem happens regularly.

In our case, we have a smallish cluster (100 machines) with a gigabit
Ethernet switched network. The nfsd machine serves the cluster as a
storage pool. We had just installed netdata (a Nagios-like server
monitoring tool) the day before and suspect that it was our root cause,
as we had not experienced this hang for the few weeks before that, after
bringing up a new CloudStack configuration on this cluster of machines.
We disabled netdata and have not seen a recurrence for 24 hours, though
we are still monitoring. We did not try the suggestion to set the NFS
version to 3, but we may do that if we decide we would like to get
netdata back on this machine.

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-06-27 Thread Ricardo Cruz
For the benefit of others... Like #10, we also have a cluster with this
issue. As a workaround, we are using version 3 of the NFS protocol
(`nfsvers=3` in `/etc/fstab`), which so far seems to have eliminated the
problem for us.
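
To check which protocol version each mount is really using after such a change, something like the following could be used. It is only a sketch: `nfs_mount_versions` is a made-up helper name, and it assumes the option list in /proc/mounts carries a `vers=` (or `nfsvers=`) entry for NFS mounts.

```bash
# Hypothetical helper: print each NFS mount point together with the protocol
# version found in its mount options. Reads /proc/mounts-style lines on stdin
# (device, mount point, fstype, options, ...).
nfs_mount_versions() {
    awk '$3 ~ /^nfs4?$/ {
        ver = "unknown"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] ~ /^(vers|nfsvers)=/) {
                sub(/^[^=]*=/, "", opts[i])   # keep only the value after "="
                ver = opts[i]
            }
        }
        print $2, ver
    }'
}

# Example:
#   nfs_mount_versions < /proc/mounts
```

Mounts that still report a 4.x version after the fstab change have not been remounted with the new options yet.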

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-06-23 Thread yuhldr
I encountered the same problem. After several days of testing, the
problem can be reproduced 100% of the time. The setup is Ubuntu 24.04,
with a 10 Gb/s optical fiber connection between the login node and the
computing node. The computing node uses NFS to mount the /home of the
login node. The entire system is managed using Slurm.

The login node submits jobs that read and write a large number of files
on /home. Each program reads a local txt file of about 10 GB with
python-numpy, splits it into multiple small files of 100 MB, and saves
them as npy files.

I submit 252 similar programs at once and run them at the same time.
Within one hour, the NFS service of the login node is stuck. At this
point, the nfs-server service of the login node cannot be restarted, and
the login node cannot ssh to the computing node. Restarting the computing
node does not resolve the problem. Restarting just the login node,
however, makes the problem disappear: ssh is restored, and the computing
node automatically reconnects to NFS successfully.

```bash
root        1548  0.0  0.0   5632  1792 ?        Ss   18:19   0:00 /usr/sbin/nfsdcld
root        2347  4.6  0.0      0     0 ?        D    18:19   8:04 [nfsd]
root       53326  0.0  0.0      0     0 ?        D    20:00   0:00 [kworker/u112:2+nfsd4_callbacks]
root       68918  0.0  0.0   2704  1792 ?        Is   20:47   0:00 /usr/sbin/rpc.nfsd 0
root       74448  0.0  0.0   9436  2240 pts/6    S+   21:11   0:00 grep --color=auto --ex
```

```log
Jun 23 20:48:52 icpcs systemd[1]: nfs-server.service: Stopping timed out. Terminating.
Jun 23 20:49:10 icpcs sudo[69464]: root : TTY=pts/6 ; PWD=/root ; USER=root ; COMMAND=/usr/bin/systemctl status nfs-server.service
Jun 23 20:50:23 icpcs systemd[1]: nfs-server.service: State 'stop-sigterm' timed out. Killing.
Jun 23 20:50:23 icpcs systemd[1]: nfs-server.service: Killing process 68918 (rpc.nfsd) with signal SIGKILL.
Jun 23 20:50:27 icpcs kernel: INFO: task nfsd:2347 blocked for more than 1105 seconds.
Jun 23 20:50:27 icpcs kernel: task:nfsd state:D stack:0 pid:2347 tgid:2347 ppid:2 flags:0x4000
Jun 23 20:50:27 icpcs kernel:  nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
Jun 23 20:50:27 icpcs kernel:  nfsd4_destroy_session+0x186/0x260 [nfsd]
Jun 23 20:50:27 icpcs kernel:  nfsd4_proc_compound+0x3af/0x770 [nfsd]
Jun 23 20:50:27 icpcs kernel:  nfsd_dispatch+0xd4/0x220 [nfsd]
Jun 23 20:50:27 icpcs kernel:  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
Jun 23 20:50:27 icpcs kernel:  ? __pfx_nfsd+0x10/0x10 [nfsd]
Jun 23 20:50:27 icpcs kernel:  nfsd+0x8b/0xe0 [nfsd]
```
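
The stuck threads in a listing like the one above can be picked out by their process state. A rough sketch follows; `find_blocked_nfsd` is a hypothetical helper name, it expects `PID STAT COMMAND` rows on stdin, and it treats any state starting with `D` (uninterruptible sleep) as blocked.

```bash
# Hypothetical helper: list tasks whose command name mentions nfsd and whose
# state starts with D (uninterruptible sleep).
# Input: one "PID STAT COMMAND" row per line on stdin.
find_blocked_nfsd() {
    awk '$2 ~ /^D/ && $3 ~ /nfsd/ { print $1, $3 }'
}

# Example:
#   ps -eo pid=,stat=,comm= | find_blocked_nfsd
```

An empty result only means no nfsd-related task was in the D state at that instant, so it is worth repeating the check a few times before drawing conclusions.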

[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-06-17 Thread Jeff
The tricky part for me is that the client was changing regularly, so I
can't confidently say when the errors started appearing. It's just very
suspicious that it needs high load (a new host and higher network
bandwidth made the issue more frequent) and uploading to the server, as
pure downloading doesn't seem to be a problem even if cached data is
sent at full bandwidth for minutes.

Moved the server to 24.04, but I've also moved some I/O heavy tasks to
it so there would be less need of uploading. Client was on 23.10 and I'm
still holding back on upgrading for some more weeks.

I can't say a whole lot about the current situation, as I'm not uploading
much anymore to avoid the issue. I did run into a hang a few days ago but
didn't have time to debug it; the server didn't want to restart
gracefully, so I ended up hard rebooting.
I believe it was the first time since moving the I/O-heavy tasks: I
wanted to upload a few hundred GiB of data back to the server, which had
been downloaded from there a while ago without problems. Otherwise, light
I/O doesn't seem to run into this problem. The occasional backup to the
server is fine, but that rarely saturates the network and likely fits
completely into the page cache almost every time.

A few hopefully helpful points for reproducing the problem:
- As mentioned multiple times, download alone seems to be unaffected;
  uploading is what should be stressed. I suspect that either there's no
  need to download at the same time, or just casual filesystem browsing
  is a good enough load.
- A fast client with high bandwidth is key. I ran into this issue a
  couple of times with an older host on 1 Gb/s, but a new fast host with
  2.5 Gb/s made the issue appear significantly more frequently.
- It likely doesn't matter how the link gets saturated, but I either
  processed files cached on the server (mixed R/W) or uploaded cached
  files (a fast SSD should be fine too), meaning that the bottleneck was
  always the network, at least while the caches were large enough.
- Files were large, so there wasn't any stopping to fiddle with metadata
  as would happen with small files, and the page cache was often
  exhausted. The target was a single HDD the majority of the time, which
  often meant that writes started blocking (a 100-ish MiB/s HDD catching
  up with close to 250 MiB/s of data), occasionally making the hosts
  freeze, as the kernel's background I/O handling is still bad; we just
  pretend the issue is gone because SSDs are fast enough not to run into
  this. The page cache draining freezes may be good at exposing race
  conditions.

It may be more efficient to start by looking for what's causing the
"RPC: Could not send backchannel reply error: -110" log spam, which might
be related. The lockup may take significant time to catch, while that
kernel message showed up quite frequently.
Even now I have plenty of those lines without experiencing issues and
without uploading much, mostly just downloading large files.
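
To get a rough idea of how often that message appears, the kernel log can be filtered for it. This is a minimal sketch under stated assumptions: `count_backchannel_errors` is a made-up helper name, and it matches the message text exactly as quoted above, so a differently worded variant would not be counted.

```bash
# Hypothetical helper: count log lines containing the backchannel timeout
# message. Reads log text on stdin, so it works with journalctl, dmesg, or
# a saved log file.
count_backchannel_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the exit status clean for callers using "set -e".
    grep -cF -- 'RPC: Could not send backchannel reply error: -110' || true
}

# Example (needs permission to read the kernel log):
#   journalctl -k --since "1 hour ago" | count_backchannel_errors
```

Comparing the count between a mostly-download period and an upload-heavy one would show whether the message tracks the load pattern described here.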

Some extra info which may or may not matter:
- The server hardware is quite weak, with an old 4-core Broadwell CPU,
  possibly helping to expose race conditions
- All file systems are Btrfs with noatime,discard=async,compress-force=zstd,
  the latter surely adding more load
- LUKS is used everywhere, also adding some extra load
- There's a Btrfs (on LUKS) image mounted over NFS (without a whole lot
  of usage though)


[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-06-17 Thread GuoqingJiang
IIUC, on the NFS server side, both 23.10 (6.5 series) and 24.04 (6.8
series) have a similar issue, but 22.04 (probably a 5.15.x kernel) was
OK.

And what is the kernel version on the NFS client? Has it changed, or has
it stayed on one version?

Anyway, I guess the most efficient way forward is to bisect which commit
caused the issue, thanks!


[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-04-25 Thread Jeff
** Attachment added: "rpc_tasks.txt"
   
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568/+attachment/5770302/+files/rpc_tasks.txt


[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-04-25 Thread Jeff
Oh fun, while trying to add multiple attachments, I just ended up finding
a bug report from 2007 complaining that apport is able to do this while
the web interface is limited, so I guess I'll get a bit spammy.

Ran into the usual lockup again, even while avoiding heavy utilization.
I'm not seeing anything too interesting in the gathered info, and
apparently the kernel log buffer is not large enough to hold a complete
task list dump, but it seems like most of the information is at least
there.

Will try to use RPCNFSDCOUNT=1 for some time in case this is a silly
deadlock; at least it's definitely easier to give it a try than to
downgrade to Ubuntu 22.04, which worked well.


** Attachment added: "dmesg.txt"
   
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568/+attachment/5770300/+files/dmesg.txt


[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-04-25 Thread Jeff
** Attachment added: "nfs_threads.txt"
   
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568/+attachment/5770301/+files/nfs_threads.txt


[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-04-22 Thread Jeff
Ran into this again just hours after commenting, while attempting to unpack a
large archive file. Apparently the new setup, a higher performance host with
more network bandwidth, is just too overwhelming for NFS to be usable with this
issue.
Reading alone still seems to be fine though.

Torturing 50% of the memory with memtester didn't reveal any problems,
and getting the server to do the unpacking of files does the same mixed
I/O with no issues so far, progressing way beyond where I could get over
NFS, so it doesn't look like an HDD issue either.

The info dumping described here seems to be potentially useful for the next 
catch, so linking it here:
https://www.spinics.net/lists/linux-nfs/msg97486.html
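
The linked message is not reproduced here, so the following is only a sketch of one kind of dump that could be gathered at the next catch: the kernel stacks of all nfsd threads. It assumes the nfsd kernel threads show up under /proc with a `comm` of `nfsd` and that `/proc/<pid>/stack` is available and readable (this normally needs root). The function name and the `proc_root` parameter are made up for illustration.

```python
import os


def nfsd_thread_stacks(proc_root="/proc"):
    """Map pid -> kernel stack text for every task whose comm is 'nfsd'."""
    stacks = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "comm")) as f:
                comm = f.read().strip()
            if comm != "nfsd":
                continue
            with open(os.path.join(base, "stack")) as f:
                stacks[int(entry)] = f.read()
        except OSError:
            # The task exited in the meantime, or the stack is not readable.
            continue
    return stacks


# Hypothetical usage, as root on the server while nfsd is hung:
# for pid, stack in sorted(nfsd_thread_stacks().items()):
#     print(f"nfsd pid {pid}:\n{stack}")
```

Taking `proc_root` as a parameter only serves to make the function easy to try against a copied or fake directory tree.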

At some point I may try to reproduce the issue with just writing to see
if the mixed workload is really required for the freeze, but I'm not sure
when I will get to it: NFSd can't just be restarted, and the mandatory
reboot to get it going again is only feasible when all other tasks
running on the host can be interrupted.


[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-04-21 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: nfs-utils (Ubuntu)
   Status: New => Confirmed


[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-04-21 Thread Jeff
This kind of issue appeared for me with Ubuntu 23.10 on a server that
mostly uses an HDD for bulk storage and has a not exactly powerful CPU,
which is also kept busy by WireGuard securing the NFS connection.

I'm mentioning the performance details because I have a feeling they matter. A
client that is also not exactly high performance, connecting over 1 Gb/s, only
very occasionally caused this problem; however, given a 10 Gb/s connection, the
issue appeared significantly more commonly. A higher performance setup utilizing
a 2.5 Gb/s connection triggered this bug within a couple of days of setup.
The lockup always seems to occur with heavy NFS usage, suspiciously mostly when
there's both reading and writing going on. At least I don't recall it happening
with reading only, but I'm not confident in stating it didn't happen with a
write-only load.

Found this bug report via the client error message; the server side differs
slightly due to the different version:
```
[300146.04] INFO: task nfsd:1426 blocked for more than 241 seconds.
[300146.046732]   Not tainted 6.5.0-27-generic #28~22.04.1-Ubuntu
[300146.046770] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[300146.046813] task:nfsd state:D stack:0 pid:1426  ppid:2  flags:0x4000
[300146.046827] Call Trace:
[300146.046832]  <TASK>
[300146.046839]  __schedule+0x2cb/0x750
[300146.046860]  schedule+0x63/0x110
[300146.046870]  schedule_timeout+0x157/0x170
[300146.046881]  wait_for_completion+0x88/0x150
[300146.046894]  __flush_workqueue+0x140/0x3e0
[300146.046908]  nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
[300146.047074]  nfsd4_destroy_session+0x193/0x260 [nfsd]
[300146.047219]  nfsd4_proc_compound+0x3b7/0x770 [nfsd]
[300146.047365]  nfsd_dispatch+0xbf/0x1d0 [nfsd]
[300146.047497]  svc_process_common+0x420/0x6e0 [sunrpc]
[300146.047695]  ? __pfx_read_tsc+0x10/0x10
[300146.047706]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[300146.047848]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[300146.047977]  svc_process+0x132/0x1b0 [sunrpc]
[300146.048157]  nfsd+0xdc/0x1c0 [nfsd]
[300146.048287]  kthread+0xf2/0x120
[300146.048299]  ? __pfx_kthread+0x10/0x10
[300146.048310]  ret_from_fork+0x47/0x70
[300146.048321]  ? __pfx_kthread+0x10/0x10
[300146.048331]  ret_from_fork_asm+0x1b/0x30
[300146.048341]  </TASK>
```
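
For comparing traces like the one above across kernel versions, the hung-task reports can be reduced to the task, the blocked time and the list of reliable frames. The following is a minimal sketch with made-up names, assuming the dmesg layout shown in this report; frames prefixed with `?` are skipped because the kernel marks them as unreliable.

```python
import re

HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
# e.g. "[300146.046908]  nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]"
FRAME_RE = re.compile(
    r"^\[[^\]]*\]\s+(?P<unreliable>\? )?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_hung_tasks(text):
    """Return one dict per 'blocked for more than N seconds' report."""
    reports = []
    for line in text.splitlines():
        hung = HUNG_RE.search(line)
        if hung:
            reports.append({
                "comm": hung.group("comm"),
                "pid": int(hung.group("pid")),
                "blocked_secs": int(hung.group("secs")),
                "frames": [],  # (symbol, module or None), innermost first
            })
            continue
        if not reports:
            continue  # ignore anything before the first report
        frame = FRAME_RE.match(line.strip())
        if frame and not frame.group("unreliable"):
            reports[-1]["frames"].append((frame.group("sym"), frame.group("mod")))
    return reports
```

Read this way, the trace above shows an nfsd thread inside nfsd4_destroy_session waiting in nfsd4_probe_callback_sync for a workqueue flush to complete.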

This seems to match, but the previous lockups I experienced may have been
somewhat different.
I mostly remember the client whining about the server not responding instead of
the message presented here, and the server call trace used to have btrfs in it,
which made me suspect the problem may be exclusive to Btrfs. However, the issue
was always with NFS; nothing else locked up despite there being some other
sources of heavy I/O.


[Bug 2062568] Re: nfsd gets unresponsive after some hours of operation

2024-04-19 Thread Olle Liljenzin
** Description changed:

- I installed the 22.04 Beta on two test machines that were running 22.04
+ I installed the 24.04 Beta on two test machines that were running 22.04
  without issues before. One of them exports two volumes that are mounted
  by the other machine, which primarily uses them as a secondary storage
  for ccache.
  
  After being up for a couple of hours (happened twice since yesterday
  evening) it seems that nfsd on the machine exporting the volumes hangs
  on something.
  
  From dmesg on the server (repeated a few times):
  
  [11183.290548] INFO: task nfsd:1419 blocked for more than 1228 seconds.
  [11183.290558]   Not tainted 6.8.0-22-generic #22-Ubuntu
  [11183.290563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [11183.290582] task:nfsd state:D stack:0 pid:1419  tgid:1419  ppid:2  flags:0x4000
  [11183.290587] Call Trace:
  [11183.290602]  <TASK>
  [11183.290606]  __schedule+0x27c/0x6b0
  [11183.290612]  schedule+0x33/0x110
  [11183.290615]  schedule_timeout+0x157/0x170
  [11183.290619]  wait_for_completion+0x88/0x150
  [11183.290623]  __flush_workqueue+0x140/0x3e0
  [11183.290629]  nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
  [11183.290689]  nfsd4_destroy_session+0x186/0x260 [nfsd]
  [11183.290744]  nfsd4_proc_compound+0x3af/0x770 [nfsd]
  [11183.290798]  nfsd_dispatch+0xd4/0x220 [nfsd]
  [11183.290851]  svc_process_common+0x44d/0x710 [sunrpc]
  [11183.290924]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
  [11183.290976]  svc_process+0x132/0x1b0 [sunrpc]
  [11183.291041]  svc_handle_xprt+0x4d3/0x5d0 [sunrpc]
  [11183.291105]  svc_recv+0x18b/0x2e0 [sunrpc]
  [11183.291168]  ? __pfx_nfsd+0x10/0x10 [nfsd]
  [11183.291220]  nfsd+0x8b/0xe0 [nfsd]
  [11183.291270]  kthread+0xef/0x120
  [11183.291274]  ? __pfx_kthread+0x10/0x10
  [11183.291276]  ret_from_fork+0x44/0x70
  [11183.291279]  ? __pfx_kthread+0x10/0x10
  [11183.291281]  ret_from_fork_asm+0x1b/0x30
  [11183.291286]  </TASK>
  
  From dmesg on the client (repeated a number of times):
  [ 6596.911785] RPC: Could not send backchannel reply error: -110
  [ 6596.972490] RPC: Could not send backchannel reply error: -110
  [ 6837.281307] RPC: Could not send backchannel reply error: -110
  
  ProblemType: Bug
  DistroRelease: Ubuntu 24.04
  Package: nfs-kernel-server 1:2.6.4-3ubuntu5
  ProcVersionSignature: Ubuntu 6.8.0-22.22-generic 6.8.1
  Uname: Linux 6.8.0-22-generic x86_64
  .etc.request-key.d.id_resolver.conf: create   id_resolver *   *   /usr/sbin/nfsidmap -t 600 %k %d
  ApportVersion: 2.28.1-0ubuntu1
  Architecture: amd64
  CasperMD5CheckResult: pass
  Date: Fri Apr 19 14:10:25 2024
  InstallationDate: Installed on 2024-04-16 (3 days ago)
  InstallationMedia: Ubuntu-Server 24.04 LTS "Noble Numbat" - Beta amd64 (20240410.1)
  NFSMounts:
-  
+ 
  NFSv4Mounts:
-  
+ 
  ProcEnviron:
-  LANG=en_US.UTF-8
-  PATH=(custom, no user)
-  SHELL=/bin/bash
-  TERM=xterm-256color
-  XDG_RUNTIME_DIR=
+  LANG=en_US.UTF-8
+  PATH=(custom, no user)
+  SHELL=/bin/bash
+  TERM=xterm-256color
+  XDG_RUNTIME_DIR=
  SourcePackage: nfs-utils
  UpgradeStatus: No upgrade log present (probably fresh install)
