Public bug reported:

After running a fairly busy NFSv4 server with ZFS as the backend filesystem
for some time, we get a system crash roughly once a week. Clients are
mounted with delegation propagation enabled, and the client mount options
are as follows:

type nfs4
(rw,nosuid,nodev,noexec,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,acregmin=600,acregmax=600,acdirmin=600,acdirmax=600,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=y.y.y.y,local_lock=none,addr=x.x.x.x)

Server side configuration is

PIPEFS_MOUNTPOINT=/run/rpc_pipefs
RPCNFSDARGS="--grace-time 10 32"
RPCMOUNTDARGS="--manage-gids --num-threads=8"
STATDARGS=""
RPCSVCGSSDARGS=""
SVCGSSDARGS=""
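
For readers unfamiliar with the RPCNFSDARGS line above: rpc.nfsd takes
the thread count as its trailing positional argument and the grace
period in seconds via --grace-time, so this server runs 32 nfsd threads
with a 10 second grace time. A minimal sketch of that reading follows;
the argparse parser is only a hypothetical stand-in for rpc.nfsd's own
option handling and models just these two arguments.

```python
import argparse
import shlex

# Value copied verbatim from the server configuration in this report.
RPCNFSDARGS = "--grace-time 10 32"

# Hypothetical stand-in for the rpc.nfsd command line: only the two
# arguments that appear in this report are modelled here.
parser = argparse.ArgumentParser(prog="rpc.nfsd")
parser.add_argument("-G", "--grace-time", type=int, default=None)
parser.add_argument("nproc", type=int, nargs="?")

args = parser.parse_args(shlex.split(RPCNFSDARGS))
print(f"grace time: {args.grace_time} s, nfsd threads: {args.nproc}")
```

On a running server the effective values can normally be cross-checked
in /proc/fs/nfsd/threads and /proc/fs/nfsd/nfsv4gracetime.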

The error occurs while executing the unhash_delegation_locked function,
called from laundromat_main. The console output before the reboot is below:

[2768169.862683] BUG: unable to handle page fault for address: ffffffffc09451a9
[2768169.863924] #PF: supervisor write access in kernel mode
[2768169.864790] #PF: error_code(0x0003) - permissions violation
[2768169.865695] PGD 3fe20e067 P4D 3fe20e067 PUD 3fe210067 PMD bf9c25067 PTE bf9f81161
[2768169.866895] Oops: 0003 [#1] SMP NOPTI
[2768169.867493] CPU: 8 PID: 4105769 Comm: kworker/u24:1 Tainted: P        W  OE     5.3.0-46-generic #38~18.04.1-Ubuntu
[2768169.869154] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.12.0-1 04/01/2014
[2768169.870447] Workqueue: nfsd4 laundromat_main [nfsd]
[2768169.871235] RIP: 0010:_raw_spin_lock+0x10/0x30
[2768169.871959] Code: 01 00 00 75 06 48 89 d8 5b 5d c3 e8 0a 13 66 ff 48 89 d8 5b 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 02 5d c3 89 c6 e8 c1 fa 65 ff 66 90 5d c3 0f 1f 00
[2768169.874528] RSP: 0018:ffffbe5ed12f7de0 EFLAGS: 00010246
[2768169.875177] RAX: 0000000000000000 RBX: ffffbe5ed12f7de8 RCX: 0000000000000000
[2768169.876084] RDX: 0000000000000001 RSI: ffff9508089084e0 RDI: ffffffffc09451a9
[2768169.876993] RBP: ffffbe5ed12f7de0 R08: 000000000000077e R09: 0000000000000004
[2768169.877942] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc09451a9
[2768169.878793] R13: ffffbe5ed12f7e20 R14: ffffbe5ed12f7e40 R15: ffff9508089084e0
[2768169.879627] FS:  0000000000000000(0000) GS:ffff950d8fa00000(0000) knlGS:0000000000000000
[2768169.880624] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2768169.881359] CR2: ffffffffc09451a9 CR3: 0000000bd237c000 CR4: 0000000000340ee0
[2768169.882241] Call Trace:
[2768169.882571]  unhash_delegation_locked+0x39/0xa0 [nfsd]
[2768169.883201]  laundromat_main+0x235/0x5a0 [nfsd]
[2768169.883756]  process_one_work+0x1fd/0x3f0
[2768169.884272]  worker_thread+0x34/0x410
[2768169.884725]  kthread+0x121/0x140
[2768169.885165]  ? process_one_work+0x3f0/0x3f0
[2768169.885730]  ? kthread_park+0xb0/0xb0
[2768169.886302]  ret_from_fork+0x22/0x40
[2768169.886837] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs cpuid rpcsec_gss_krb5 rbd libceph ipt_REJECT nf_reject_ipv4 xt_set ip_set_hash_ipport xt_ipvs ip_set_hash_ip ip_set_hash_net ip_set dummy xt_tcpudp iptable_raw xt_CT veth xt_MASQUERADE xt_comment xt_mark iptable_nat iptable_filter bpfilter xt_conntrack nf_nat nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo aufs overlay zfs(POE) zunicode(PO) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlua(POE) nls_iso8859_1 kvm_amd ccp kvm joydev input_leds irqbypass mac_hid serio_raw qemu_fw_cfg sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi br_netfilter bridge stp llc ip_vs_sh nfsd ip_vs_wrr ip_vs_rr auth_rpcgss ip_vs nfs_acl lockd nf_conntrack grace nf_defrag_ipv6 nf_defrag_ipv4 sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
[2768169.886885]  hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cirrus drm_kms_helper aesni_intel syscopyarea aes_x86_64 crypto_simd sysfillrect sysimgblt cryptd fb_sys_fops glue_helper psmouse virtio_scsi virtio_net drm net_failover i2c_piix4 failover pata_acpi floppy
[2768169.902274] CR2: ffffffffc09451a9
[2768169.902777] ---[ end trace dcbbef50958ba3f7 ]---
[2768169.903440] RIP: 0010:_raw_spin_lock+0x10/0x30
[2768169.904064] Code: 01 00 00 75 06 48 89 d8 5b 5d c3 e8 0a 13 66 ff 48 89 d8 5b 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 02 5d c3 89 c6 e8 c1 fa 65 ff 66 90 5d c3 0f 1f 00
[2768169.907606] RSP: 0018:ffffbe5ed12f7de0 EFLAGS: 00010246
[2768169.908641] RAX: 0000000000000000 RBX: ffffbe5ed12f7de8 RCX: 0000000000000000
[2768169.910010] RDX: 0000000000000001 RSI: ffff9508089084e0 RDI: ffffffffc09451a9
[2768169.911399] RBP: ffffbe5ed12f7de0 R08: 000000000000077e R09: 0000000000000004
[2768169.912648] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc09451a9
[2768169.913952] R13: ffffbe5ed12f7e20 R14: ffffbe5ed12f7e40 R15: ffff9508089084e0
[2768169.915217] FS:  0000000000000000(0000) GS:ffff950d8fa00000(0000) knlGS:0000000000000000
[2768169.916626] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2768169.917774] CR2: ffffffffc09451a9 CR3: 0000000bd237c000 CR4: 0000000000340ee0
[2768169.919025] Kernel panic - not syncing: Fatal exception
[2768169.920317] Kernel Offset: 0x12400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[2768169.922007] Rebooting in 10 seconds..
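
As a reading aid for the oops above, the sketch below decodes the x86
page-fault error code (0x0003) and the low flag bits of the reported
PTE (bf9f81161) using the architectural bit layout. The function names
are illustrative only. The decoded values match the "supervisor write
access" and "permissions violation" lines: the page is present but
read-only. The faulting address also lies above the kernel relocation
range printed in the log, which on x86_64 is normally the module area,
and the trapping bytes "f0 0f b1 17" decode to lock cmpxchg %edx,(%rdi)
with RDI equal to CR2. That is consistent with the spinlock pointer
being stale or corrupted, although the trace alone does not prove it.

```python
def decode_pf_error_code(code: int) -> dict:
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "protection_violation": bool(code & 0x01),  # bit 0: page was present
        "write": bool(code & 0x02),                 # bit 1: write access
        "user_mode": bool(code & 0x04),             # bit 2: fault in user mode
        "reserved_bit": bool(code & 0x08),          # bit 3: reserved bit set
        "instruction_fetch": bool(code & 0x10),     # bit 4: instruction fetch
    }


def decode_pte_flags(pte: int) -> dict:
    """Decode selected flag bits of an x86_64 page-table entry."""
    return {
        "present": bool(pte & (1 << 0)),
        "writable": bool(pte & (1 << 1)),
        "user": bool(pte & (1 << 2)),
        "accessed": bool(pte & (1 << 5)),
        "dirty": bool(pte & (1 << 6)),
        "global": bool(pte & (1 << 8)),
        "no_execute": bool(pte & (1 << 63)),
    }


# Values copied from the console output in this report.
error_code = 0x0003
pte = 0xBF9F81161
fault_addr = 0xFFFFFFFFC09451A9
relocation_end = 0xFFFFFFFFBFFFFFFF  # upper end of the printed relocation range

fault = decode_pf_error_code(error_code)
flags = decode_pte_flags(pte)
above_kernel_image = fault_addr > relocation_end

print(fault)
print(flags)
print("fault address above kernel relocation range:", above_kernel_image)
```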

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-5.3.0-46-generic 5.3.0-46.38~18.04.1
ProcVersionSignature: Ubuntu 5.3.0-53.47~18.04.1-generic 5.3.18
Uname: Linux 5.3.0-53-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.9-0ubuntu7.11
Architecture: amd64
Date: Fri Jun 26 10:36:00 2020
Ec2AMI: ami-00000005
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: phx-c107
Ec2InstanceType: test-c4.4xlarge
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
ProcEnviron:
 LC_CTYPE=C.UTF-8
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-signed-hwe
UpgradeStatus: No upgrade log present (probably fresh install)

** Affects: linux-signed-hwe (Ubuntu)
     Importance: Undecided
         Status: Confirmed


** Tags: amd64 apport-bug bionic ec2-images

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1885265

Title:
  NFSd4 crashes system in unhash_delegation_locked

