Re: Kernel panics with vfs.nfsd.enable_locallocks=1 and nfs clients doing hdf5 file operations
Hi Rick,

Done - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280978

Thanks!
-Matt

On 8/21/24 10:45 AM, Rick Macklem wrote:
> Please create a PR for this and include at least
> one backtrace. I will try and figure out how
> locallocks could cause it.
>
> I suspect few use locallocks=1.
>
> rick
>
> On Wed, Aug 21, 2024 at 7:29 AM Matthew L. Dailey
> <matthew.l.dai...@dartmouth.edu> wrote:
>
> Hi all,
>
> I posted messages to this list back in February and March
> (https://lists.freebsd.org/archives/freebsd-current/2024-February/005546.html)
> regarding kernel panics we were having with nfs clients doing hdf5 file
> operations. After a hiatus in troubleshooting, I had more time this
> summer and have found the cause: the vfs.nfsd.enable_locallocks sysctl.
>
> When this is set to 1, we can induce either a panic or (more rarely) a hung
> nfs server, usually within a few hours but sometimes only after several
> days to a week. We have replicated this on 13.0 through 15.0-CURRENT
> (20240725-82283cad12a4-271360). With it set to 0 (the default), we are
> unable to replicate the issue, even after several weeks of 24/7 hdf5
> file operations.
>
> One other side effect of these panics is that on a few occasions they have
> corrupted the root zpool beyond repair. This makes sense, since kernel
> memory is getting corrupted, but it obviously makes the issue more
> impactful.
>
> I'm hoping this is enough information to start narrowing down the issue.
> We are specifically using this sysctl because we are also serving files
> via samba and want to ensure consistent locking.
>
> I have provided some core dumps and backtraces previously, but am happy
> to provide more as needed. I also have a writeup of exactly how to
> reproduce this that I can send directly to anyone who is interested.
>
> Thanks so much for any and all help with this tricky problem. I'm happy
> to do whatever I can to help get this squashed.
>
> Best,
> Matt
Kernel panics with vfs.nfsd.enable_locallocks=1 and nfs clients doing hdf5 file operations
Hi all,

I posted messages to this list back in February and March
(https://lists.freebsd.org/archives/freebsd-current/2024-February/005546.html)
regarding kernel panics we were having with nfs clients doing hdf5 file
operations. After a hiatus in troubleshooting, I had more time this summer and
have found the cause: the vfs.nfsd.enable_locallocks sysctl.

When this is set to 1, we can induce either a panic or (more rarely) a hung nfs
server, usually within a few hours but sometimes only after several days to a
week. We have replicated this on 13.0 through 15.0-CURRENT
(20240725-82283cad12a4-271360). With it set to 0 (the default), we are unable
to replicate the issue, even after several weeks of 24/7 hdf5 file operations.

One other side effect of these panics is that on a few occasions they have
corrupted the root zpool beyond repair. This makes sense, since kernel memory
is getting corrupted, but it obviously makes the issue more impactful.

I'm hoping this is enough information to start narrowing down the issue. We are
specifically using this sysctl because we are also serving files via samba and
want to ensure consistent locking.

I have provided some core dumps and backtraces previously, but am happy to
provide more as needed. I also have a writeup of exactly how to reproduce this
that I can send directly to anyone who is interested.

Thanks so much for any and all help with this tricky problem. I'm happy to do
whatever I can to help get this squashed.

Best,
Matt
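For reference, vfs.nfsd.enable_locallocks is an ordinary sysctl; a minimal sketch of how it can be checked and set, using only standard FreeBSD tooling and the knob named above (nothing here is specific to this report):

sysctl vfs.nfsd.enable_locallocks        # show the current value (0 is the default)
sysctl vfs.nfsd.enable_locallocks=1      # make nfsd locks visible to local lockers such as samba
echo 'vfs.nfsd.enable_locallocks=1' >> /etc/sysctl.conf   # persist across reboots
# nfsd most likely needs a restart (service nfsd restart) or a reboot
# for a change to take full effect

Setting it back to 0 (the default) is the workaround described in the report above, at the cost of locking consistency between nfsd and samba.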
Re: FreeBSD panics possibly caused by nfs clients
Posting a few updates on this issue.

I was able to induce a panic on a CURRENT kernel (20240215), built with
GENERIC-KASAN and running kern.kstack_pages=6 (default) after ~189 hours. The
panic message and backtrace are below - please reach out directly if you'd like
to have a look at the core. I'm specifically not increasing kstack_pages to see
what effect this has on the panics.

I have another system running CURRENT (20240215) without any debugging. I'm
able to regularly panic this (7 panics over two weeks with average uptime of
~42 hours) with kstack_pages=4. I set kstack_pages=6, and have also induced
several panics. Oddly, this seems to happen more quickly (4 panics over 2 days
with average uptime of ~10.5 hours).

Another system running CURRENT (20240208), built with GENERIC-KASAN and running
kern.kstack_pages=8 has now been running with our hdf5 workload non-stop since
February 10th with no panics or issues.

From all this, it seems like increasing kstack_pages to 8 eliminates the
panics, but obviously doesn't fix the underlying cause of the issues. So,
although this is an obvious workaround for our production system, it would be
better to find and fix the underlying cause of the panics.

I'm going to continue testing to try to generate more cores with kstack_pages<8
on the system with the debug kernel. Any other ideas to try to narrow down the
cause are welcome.

Thanks,
Matt

[680940] Kernel page fault with the following non-sleepable locks held:
[680940] exclusive sleep mutex nfs_state_mutex (nfs_state_mutex) r = 0 (0x830498e0) locked @ /usr/src/sys/fs/nfsserver/nfs_nfsdstate.c:6652
[680940] stack backtrace:
[680940] #0 0x8127958f at witness_debugger+0x13f
[680940] #1 0x8127b114 at witness_warn+0x674
[680940] #2 0x81aba0a6 at trap_pfault+0x116
[680940] #3 0x81ab901c at trap+0x54c
[680940] #4 0x81a75988 at calltrap+0x8
[680940] #5 0x80fb4bfa at nfsrv_freestateid+0x23a
[680940] #6 0x80fd5e3f at nfsrvd_freestateid+0x1df
[680940] #7 0x80f98d35 at nfsrvd_dorpc+0x2585
[680940] #8 0x80fbf588 at nfssvc_program+0x1078
[680940] #9 0x8173fce6 at svc_run_internal+0x1706
[680940] #10 0x8174094b at svc_thread_start+0xb
[680940] #11 0x811137a3 at fork_exit+0xa3
[680940] #12 0x81a769ee at fork_trampoline+0xe
[680940]
[680940] Fatal trap 12: page fault while in kernel mode
[680940] cpuid = 3; apic id = 06
[680940] fault virtual address = 0x7
[680940] fault code = supervisor read data, page not present
[680940] instruction pointer = 0x20:0x80fafd67
[680940] stack pointer = 0x28:0xfe0153ba2de0
[680940] frame pointer = 0x28:0xfe0153ba2eb0
[680940] code segment = base 0x0, limit 0xf, type 0x1b
[680940] = DPL 0, pres 1, long 1, def32 0, gran 1
[680940] processor eflags = interrupt enabled, resume, IOPL = 0
[680940] current process = 55202 (nfsd: service)
[680940] rdi: 0007 rsi: rdx: d7c0
[680940] rcx: fe001b9ec1e8 r8: 0012c4350002 r9: 0012c4350002
[680940] rax: fe001b9ec1e8 rbx: rbp: fe0153ba2eb0
[680940] r10: 0004 r11: 0006 r12: 0007
[680940] r13: fe019cd75700 r14: 1a1a r15: fe019cd75708
[680940] trap number = 12
[680940] panic: page fault
[680940] cpuid = 3
[680940] time = 1709646178
[680940] KDB: stack backtrace:
[680940] db_trace_self_wrapper() at db_trace_self_wrapper+0xa5/frame 0xfe0153ba2550
[680940] kdb_backtrace() at kdb_backtrace+0xc6/frame 0xfe0153ba26b0
[680940] vpanic() at vpanic+0x210/frame 0xfe0153ba2850
[680940] panic() at panic+0xb5/frame 0xfe0153ba2910
[680940] trap_fatal() at trap_fatal+0x65b/frame 0xfe0153ba2a10
[680940] trap_pfault() at trap_pfault+0x12b/frame 0xfe0153ba2b30
[680940] trap() at trap+0x54c/frame 0xfe0153ba2d10
[680940] calltrap() at calltrap+0x8/frame 0xfe0153ba2d10
[680940] --- trap 0xc, rip = 0x80fafd67, rsp = 0xfe0153ba2de0, rbp = 0xfe0153ba2eb0 ---
[680940] nfsrv_freelockowner() at nfsrv_freelockowner+0x97/frame 0xfe0153ba2eb0
[680940] nfsrv_freestateid() at nfsrv_freestateid+0x23a/frame 0xfe0153ba2f70
[680940] nfsrvd_freestateid() at nfsrvd_freestateid+0x1df/frame 0xfe0153ba3030
[680940] nfsrvd_dorpc() at nfsrvd_dorpc+0x2585/frame 0xfe0153ba3570
[680940] nfssvc_program() at nfssvc_program+0x1078/frame 0xfe0153ba3970
[680940] svc_run_internal() at svc_run_internal+0x1706/frame 0xfe0153ba3ee0
[680940] svc_thread_start() at svc_thread_start+0xb/frame 0xfe0153ba3ef0
[680940] fork_exit() at fork_exit+0xa3/frame 0xfe0153ba3f30
[680940] fork_trampoline() at fork_trampoline+0xe/frame 0xfe0153ba3f30
[680940] --- trap 0xc, rip = 0x3b4ff896f0da, rsp = 0x3b4ff6a500e8, rbp = 0x3b4ff6a50380 ---
[680940] KDB: enter:
Re: FreeBSD panics possibly caused by nfs clients
Hi all,

I induced a panic on my CURRENT (20240215-d79b6b8ec267-268300) VM after about
24 hours. This is the one without any debugging, so it only confirms that the
panics we've been experiencing still exist in CURRENT. There was some disk
issue that prevented the dump, so all I have is the panic, pasted below. The
two test systems with full debugging are still running after a week and a half.

> You might want to set
> kern.kstack_pages=6
> in /boot/loader.conf in these setups.
>
> I would normally expect double faults when a kernel stack is blown,
> but maybe there is a reason that you do now see that for a blown kernel
> stack. (The impact of increasing stack pages from 4->6 should be minimal.)
>
> rick

Rick - I'm a little confused by the kstack_pages tunable and just want to
clarify. Are you proposing that this might solve the panic issues we've been
having, or that it will make the panics/dumps more useful by avoiding false
positives? We've only ever seen that "double fault" once in over 100 observed
panics, and that was only when we enabled just KASAN on a 14.0p4 system.

-Matt

[85751] Fatal trap 12: page fault while in kernel mode
[85751] cpuid = 3; apic id = 06
[85751] fault virtual address = 0x4f0f760
[85751] fault code = supervisor read data, page not present
[85751] instruction pointer = 0x20:0x820022f7
[85751] stack pointer = 0x28:0xfe010bdf8d50
[85751] frame pointer = 0x28:0xfe010bdf8d80
[85751] code segment = base 0x0, limit 0xf, type 0x1b
[85751] = DPL 0, pres 1, long 1, def32 0, gran 1
[85751] processor eflags = interrupt enabled, resume, IOPL = 0
[85751] current process = 0 (z_wr_int_h_3)
[85751] rdi: f802d1036900 rsi: f80416887300 rdx: f80416887380
[85751] rcx: f802d1036908 r8: 0100 r9: 8013070f000700ff
[85751] rax: 04f0f748 rbx: f802d1036900 rbp: fe010bdf8d80
[85751] r10: f80412c4f708 r11: r12: f8000944ed58
[85751] r13: r14: 04f0f748 r15: fe010caa9438
[85751] trap number = 12
[85751] panic: page fault
[85751] cpuid = 3
[85751] time = 1708451091
[85751] KDB: stack backtrace:
[85751] #0 0x80b9803d at kdb_backtrace+0x5d
[85751] #1 0x80b4a8d5 at vpanic+0x135
[85751] #2 0x80b4a793 at panic+0x43
[85751] #3 0x81026b8f at trap_fatal+0x40f
[85751] #4 0x81026bdf at trap_pfault+0x4f
[85751] #5 0x80ffd9f8 at calltrap+0x8
[85751] #6 0x81fea83b at dmu_sync_late_arrival_done+0x6b
[85751] #7 0x8214a78e at zio_done+0xc6e
[85751] #8 0x821442cc at zio_execute+0x3c
[85751] #9 0x80bae402 at taskqueue_run_locked+0x182
[85751] #10 0x80baf692 at taskqueue_thread_loop+0xc2
[85751] #11 0x80b0484f at fork_exit+0x7f
[85751] #12 0x80ffea5e at fork_trampoline+0xe
[85751] Uptime: 23h49m11s
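For context on the suggestion quoted above, kern.kstack_pages is a boot-time tunable rather than a runtime-writable sysctl; a small sketch of how it is checked and changed (the value 6 is simply the one Rick proposes, not a recommendation of this writeup):

sysctl kern.kstack_pages     # read the value the running kernel was booted with (amd64 default is 4)
# it can only be changed at boot, e.g. by adding to /boot/loader.conf:
#   kern.kstack_pages=6
# and rebooting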
Re: FreeBSD panics possibly caused by nfs clients
Hi all,

So I finally induced a panic on a "pure" ufs system - root and exported
filesystem were both ufs. So, I think this definitively rules out zfs as a
source of the issue. This panic was on 14.0p5 without debugging options, so the
core may not be helpful. The panic and backtrace are below in case they're
interesting to anyone.

Next, I'm going to try a CURRENT kernel without debugging options enabled just
to see if I can finally induce a panic here. My other two VMs running CURRENT
with full debugging are still clanking along.

-Matt

[218716] Fatal trap 12: page fault while in kernel mode
[218716] cpuid = 4; apic id = 08
[218716] fault virtual address = 0x10017
[218716] fault code = supervisor read data, page not present
[218716] instruction pointer = 0x20:0x80e9165d
[218716] stack pointer = 0x28:0xfe010b5aa3b0
[218716] frame pointer = 0x28:0xfe010b5aa400
[218716] code segment = base 0x0, limit 0xf, type 0x1b
[218716] = DPL 0, pres 1, long 1, def32 0, gran 1
[218716] processor eflags = interrupt enabled, resume, IOPL = 0
[218716] current process = 49575 (nfsd: service)
[218716] rdi: rsi: f800038ec900 rdx: fe00d9326000
[218716] rcx: 00030eb0 r8: r9: fe010b5aa410
[218716] rax: 008f0eb0 rbx: f8038ac4cd00 rbp: fe010b5aa400
[218716] r10: r11: r12:
[218716] r13: f80003647c00 r14: f802f9dced00 r15: f800038ec900
[218716] trap number = 12
[218716] panic: page fault
[218716] cpuid = 4
[218716] time = 1708319487
[218716] KDB: stack backtrace:
[218716] #0 0x80b9309d at kdb_backtrace+0x5d
[218716] #1 0x80b461a2 at vpanic+0x132
[218716] #2 0x80b46063 at panic+0x43
[218716] #3 0x8101d85c at trap_fatal+0x40c
[218716] #4 0x8101d8af at trap_pfault+0x4f
[218716] #5 0x80ff3fe8 at calltrap+0x8
[218716] #6 0x80e8716e at newdirrem+0x8be
[218716] #7 0x80e866fa at softdep_setup_remove+0x1a
[218716] #8 0x80ea71af at ufs_dirremove+0x21f
[218716] #9 0x80ead4f4 at ufs_remove+0xb4
[218716] #10 0x810f1428 at VOP_REMOVE_APV+0x28
[218716] #11 0x80a60db4 at nfsvno_removesub+0xc4
[218716] #12 0x80a52699 at nfsrvd_remove+0x1b9
[218716] #13 0x80a374d4 at nfsrvd_dorpc+0x1854
[218716] #14 0x80a4e76f at nfssvc_program+0x82f
[218716] #15 0x80e34080 at svc_run_internal+0xb50
[218716] #16 0x80e3475b at svc_thread_start+0xb
[218716] #17 0x80b00b7f at fork_exit+0x7f
[218716] Uptime: 2d12h45m16s
[218716] Dumping 985 out of 16350 MB:..2%..12%..22%..31%..41%..51%..61%..72%..82%..91%

#0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1 doadump (textdump=) at /usr/src/sys/kern/kern_shutdown.c:405
#2 0x80b45d37 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:526
#3 0x80b4620f in vpanic (fmt=0x81147c9c "%s", ap=ap@entry=0xfe010b5aa200) at /usr/src/sys/kern/kern_shutdown.c:970
#4 0x80b46063 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:894
#5 0x8101d85c in trap_fatal (frame=0xfe010b5aa2f0, eva=4294967319) at /usr/src/sys/amd64/amd64/trap.c:952
#6 0x8101d8af in trap_pfault (frame=0xfe010b5aa2f0, usermode=false, signo=, ucode=) at /usr/src/sys/amd64/amd64/trap.c:760
#7
#8 cancel_diradd (dap=0xf8038ac4cd00, dirrem=dirrem@entry=0xf800038ec900, jremref=jremref@entry=0xf802f9dced00, dotremref=dotremref@entry=0x0, dotdotremref=dotdotremref@entry=0x0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:9028
#9 0x80e8716e in newdirrem (bp=, dp=dp@entry=0xf800037fea80, ip=ip@entry=0xf8006b3b9300, isrmdir=isrmdir@entry=0, prevdirremp=prevdirremp@entry=0xfe010b5aa4b0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:9480
#10 0x80e866fa in softdep_setup_remove (bp=0x, dp=0xf800038ec900, dp@entry=0xf800037fea80, ip=0xfe00d9326000, ip@entry=0xf8006b3b9300, isrmdir=200368, isrmdir@entry=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:9176
#11 0x80ea71af in ufs_dirremove (dvp=dvp@entry=0xf801f764be00, ip=ip@entry=0xf8006b3b9300, flags=, isrmdir=isrmdir@entry=0) at /usr/src/sys/ufs/ufs/ufs_lookup.c:1198
#12 0x80ead4f4 in ufs_remove (ap=0xfe010b5aa5d8) at /usr/src/sys/ufs/ufs/ufs_vnops.c:1054
#13 0x810f1428 in VOP_REMOVE_APV (vop=0x8172f2d0 , a=a@entry=0xfe010b5aa5d8) at vnode_if.c:1534
#14 0x80a60db4 in VOP_REMOVE (dvp=0x8f0eb0, vp=0xf800539b7380, cnp=0x30eb0) at ./vnode_if.h:789
#15 nfsvno_removesub (ndp=0xfe010b5aa858, is_v4=, cred=, p=p@entry=0xfe010ae803a0, exp=exp@entry=0xfe010b5aaa88) at /usr/src/sys/fs/nfsserver/nfs_nfsdpor
Re: FreeBSD panics possibly caused by nfs clients
Hi all,

Before the week was out, I wanted to provide an update on this issue.

Last weekend, I installed two VMs with CURRENT (20240208-82bebc793658-268105) -
one on zfs and one on ufs - and built a kernel with this config file:

include GENERIC
ident THAYER-FULLDEBUG
makeoptions DEBUG=-g
options KASAN
options DDB
options INVARIANT_SUPPORT
options INVARIANTS
options QUEUE_MACRO_DEBUG_TRASH
options WITNESS
options WITNESS_SKIPSPIN
options KGSSAPI

I'm also setting these in loader.conf:

debug.witness.watch=1
debug.witness.kdb=1
kern.kstack_pages=8

These two VMs have been running non-stop with our hdf5 workload without a panic
for 146 hours and 122 hours, respectively. This might be good news, but it is
well within the threshold we've seen in our testing over the past 6 months.
Given that all the debug kernel options slow things down significantly, these
could just be taking a long while to panic.

I also have another VM with our "standard" 14.0p5 kernel (GENERIC with KGSSAPI
enabled) running on ufs to try to rule zfs in or out. This failed this morning,
but not with a panic. In this case, nfs stopped responding. This is a failure
mode we have seen in our testing, but it is much rarer than a full panic. I
intend to continue testing this to try to induce a panic, at which point I
think we can rule out zfs as a potential cause.

Just so it's documented: since I started experimenting with kernel debug
options last week, I have so far induced panics with the following:
- 13.2p9 kernel on hardware (only WITNESS enabled)
- 14.0p4 kernel on VM (only KASAN enabled)
- 13.2p9 kernel on hardware (all debug options above except KASAN)

My plan right now is to continue running my two test VMs with CURRENT to see if
it's just taking a long time to panic. Once I have finished my ufs testing on
the third VM, I will build a GENERIC kernel for CURRENT (no debug options, only
KGSSAPI) and test against that to see if the actual debug instrumentation is
interfering with reproducing this issue.

Please reach out if you have ideas or suggestions. I'll provide updates here
when I have them.

Thanks,
Matt
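For completeness, a kernel with the above config would presumably be built the usual handbook way; a rough sketch, assuming the config file is saved under the amd64 conf directory and sources live in /usr/src (the KERNCONF name comes from the listing above):

# save the config above as /usr/src/sys/amd64/conf/THAYER-FULLDEBUG, then:
cd /usr/src
make -j$(sysctl -n hw.ncpu) buildkernel KERNCONF=THAYER-FULLDEBUG
make installkernel KERNCONF=THAYER-FULLDEBUG
shutdown -r now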
Re: FreeBSD panics possibly caused by nfs clients
On 2/9/24 4:18 PM, Mark Johnston wrote:
> On Fri, Feb 09, 2024 at 06:23:08PM +, Matthew L. Dailey wrote:
>> I had my first kernel panic with a KASAN kernel after only 01:27. This
>> first panic was a "double fault," which isn't anything we've seen
>> previously - usually we've seen trap 9 or trap 12, but sometimes others.
>> Based on the backtrace, it definitely looks like KASAN caught something,
>> but I don't have the expertise to know if this points to anything
>> specific. From the backtrace, it looks like this might have originated
>> in ipfw code.
>
> A double fault is rather unexpected. I presume you're running
> releng/14.0? Is it at all possible to test with FreeBSD-CURRENT?

This is just 14.0-RELEASE, updated to p4. Haven't played with CURRENT before,
but I'll give it a shot.

> Did you add INVARIANTS etc. to the kernel configuration used here, or
> just KASAN?

This just had KASAN. Are you suggesting that I only add INVARIANTS (along with
KASAN), or all the various debug bits:

options DDB
options INVARIANT_SUPPORT
options INVARIANTS
options QUEUE_MACRO_DEBUG_TRASH
options WITNESS
options WITNESS_SKIPSPIN

>> Please let me know what other info I can provide or what I can do to dig
>> deeper.
>
> If you could repeat the test several times, I'd be interested in seeing
> if you always get the same result. If you're willing to share the
> vmcore (or several), I'd be willing to take a look at it.

We'll see how things go with testing, and I'll pass along cores directly if it
makes sense. There shouldn't be anything sensitive that we care about.

>> Thanks!!
>>
>> Panic message:
>> [5674] Fatal double fault
>> [5674] rip 0x812f6e32 rsp 0xfe014677afe0 rbp 0xfe014677b430
>> [5674] rax 0x1fc028cef620 rdx 0xf2f2f2f8f2f2f2f2 rbx 0x1
>> [5674] rcx 0xd7c0 rsi 0xfe004086a4a0 rdi 0xf8f8f8f8f2f2f2f8
>> [5674] r8 0xf8f8f8f8f8f8f8f8 r9 0x162a r10 0x835003002d3a64e1
>> [5674] r11 0 r12 0xf78028cef620 r13 0xfe004086a440
>> [5674] r14 0xfe01488c0560 r15 0x26f40 rflags 0x10006
>> [5674] cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
>> [5674] fsbase 0x95d1d81a130 gsbase 0x84a14000 kgsbase 0
>> [5674] cpuid = 4; apic id = 08
>> [5674] panic: double fault
>> [5674] cpuid = 4
>> [5674] time = 1707498420
>> [5674] KDB: stack backtrace:
>> [5674] Uptime: 1h34m34s
>>
>> Backtrace:
>> #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
>> #1 doadump (textdump=) at /usr/src/sys/kern/kern_shutdown.c:405
>> #2 0x8128b7dc in kern_reboot (howto=howto@entry=260) at /usr/src/sys/kern/kern_shutdown.c:526
>> #3 0x8128c000 in vpanic (fmt=fmt@entry=0x82589a00 "double fault", ap=ap@entry=0xfe0040866de0) at /usr/src/sys/kern/kern_shutdown.c:970
>> #4 0x8128bd75 in panic (fmt=0x82589a00 "double fault") at /usr/src/sys/kern/kern_shutdown.c:894
>> #5 0x81c4b335 in dblfault_handler (frame=) at /usr/src/sys/amd64/amd64/trap.c:1012
>> #6
>> #7 0x812f6e32 in sched_clock (td=td@entry=0xfe01488c0560, cnt=cnt@entry=1) at /usr/src/sys/kern/sched_ule.c:2601
>> #8 0x8119e2a7 in statclock (cnt=cnt@entry=1, usermode=usermode@entry=0) at /usr/src/sys/kern/kern_clock.c:760
>> #9 0x8119fb67 in handleevents (now=now@entry=24371855699832, fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:195
>> #10 0x811a10cc in timercb (et=, arg=) at /usr/src/sys/kern/kern_clocksource.c:353
>> #11 0x81dcd280 in lapic_handle_timer (frame=0xfe014677b750) at /usr/src/sys/x86/x86/local_apic.c:1343
>> #12
>> #13 __asan_load8_noabort (addr=18446741880219689232) at /usr/src/sys/kern/subr_asan.c:1113
>> #14 0x851488b8 in ?? () from /boot/thayer/ipfw.ko
>> #15 0xfe01 in ?? ()
>> #16 0x8134dcd5 in pcpu_find (cpuid=1238425856) at /usr/src/sys/kern/subr_pcpu.c:286
>> #17 0x85151f6f in ?? () from /boot/thayer/ipfw.ko
>> #18 0x in ?? ()
Re: FreeBSD panics possibly caused by nfs clients
I had my first kernel panic with a KASAN kernel after only 01:27. This first
panic was a "double fault," which isn't anything we've seen previously -
usually we've seen trap 9 or trap 12, but sometimes others. Based on the
backtrace, it definitely looks like KASAN caught something, but I don't have
the expertise to know if this points to anything specific. From the backtrace,
it looks like this might have originated in ipfw code.

Please let me know what other info I can provide or what I can do to dig
deeper.

Thanks!!

Panic message:
[5674] Fatal double fault
[5674] rip 0x812f6e32 rsp 0xfe014677afe0 rbp 0xfe014677b430
[5674] rax 0x1fc028cef620 rdx 0xf2f2f2f8f2f2f2f2 rbx 0x1
[5674] rcx 0xd7c0 rsi 0xfe004086a4a0 rdi 0xf8f8f8f8f2f2f2f8
[5674] r8 0xf8f8f8f8f8f8f8f8 r9 0x162a r10 0x835003002d3a64e1
[5674] r11 0 r12 0xf78028cef620 r13 0xfe004086a440
[5674] r14 0xfe01488c0560 r15 0x26f40 rflags 0x10006
[5674] cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
[5674] fsbase 0x95d1d81a130 gsbase 0x84a14000 kgsbase 0
[5674] cpuid = 4; apic id = 08
[5674] panic: double fault
[5674] cpuid = 4
[5674] time = 1707498420
[5674] KDB: stack backtrace:
[5674] Uptime: 1h34m34s

Backtrace:
#0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1 doadump (textdump=) at /usr/src/sys/kern/kern_shutdown.c:405
#2 0x8128b7dc in kern_reboot (howto=howto@entry=260) at /usr/src/sys/kern/kern_shutdown.c:526
#3 0x8128c000 in vpanic (fmt=fmt@entry=0x82589a00 "double fault", ap=ap@entry=0xfe0040866de0) at /usr/src/sys/kern/kern_shutdown.c:970
#4 0x8128bd75 in panic (fmt=0x82589a00 "double fault") at /usr/src/sys/kern/kern_shutdown.c:894
#5 0x81c4b335 in dblfault_handler (frame=) at /usr/src/sys/amd64/amd64/trap.c:1012
#6
#7 0x812f6e32 in sched_clock (td=td@entry=0xfe01488c0560, cnt=cnt@entry=1) at /usr/src/sys/kern/sched_ule.c:2601
#8 0x8119e2a7 in statclock (cnt=cnt@entry=1, usermode=usermode@entry=0) at /usr/src/sys/kern/kern_clock.c:760
#9 0x8119fb67 in handleevents (now=now@entry=24371855699832, fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:195
#10 0x811a10cc in timercb (et=, arg=) at /usr/src/sys/kern/kern_clocksource.c:353
#11 0x81dcd280 in lapic_handle_timer (frame=0xfe014677b750) at /usr/src/sys/x86/x86/local_apic.c:1343
#12
#13 __asan_load8_noabort (addr=18446741880219689232) at /usr/src/sys/kern/subr_asan.c:1113
#14 0x851488b8 in ?? () from /boot/thayer/ipfw.ko
#15 0xfe01 in ?? ()
#16 0x8134dcd5 in pcpu_find (cpuid=1238425856) at /usr/src/sys/kern/subr_pcpu.c:286
#17 0x85151f6f in ?? () from /boot/thayer/ipfw.ko
#18 0x in ?? ()
Re: FreeBSD panics possibly caused by nfs clients
On 2/9/24 11:04 AM, Mark Johnston wrote:
> On Thu, Feb 08, 2024 at 03:34:52PM +, Matthew L. Dailey wrote:
>> Good morning all,
>>
>> Per Rick Macklem's suggestion, I'm posting this query here in the hopes
>> that others may have ideas.
>>
>> We did do some minimal testing with ufs around this problem back in
>> August, but hadn't narrowed the issue down to hdf5 workloads yet, so
>> testing was inconclusive. We'll do further testing on this to try to
>> rule in or out zfs as a contributing factor.
>>
>> I'm happy to provide more technical details about this issue, but it is
>> quite complex to explain and reproduce.
>
> It sounds like you've so far only tested with release kernels, is that
> right? To make some progress and narrow this down a bit, we'd want to
> see testing results from a debug kernel. Unfortunately we don't ship
> pre-compiled debug kernels with releases, so you'll have to compile your
> own, or test a snapshot of the development branch. To do the former,
> add the following lines to /usr/src/sys/amd64/conf/GENERIC and follow
> the steps here:
> https://docs.freebsd.org/en/books/handbook/kernelconfig/#kernelconfig-building
>
> options DDB
> options INVARIANT_SUPPORT
> options INVARIANTS
> options QUEUE_MACRO_DEBUG_TRASH
> options WITNESS
> options WITNESS_SKIPSPIN
>
> Since the problem appears to be some random memory corruption, I'd also
> suggest using a KASAN(9) kernel in your test VM. If the root cause of
> the crashes is some kind of use-after-free or buffer overflow, KASAN has
> a good chance of catching it. Note that both debug kernels and KASAN
> kernels are significantly slower than release kernels, so you'll want to
> deploy these in your test VM. Once you do, and a panic occurs, share the
> panic message and backtrace here, and we'll have a better idea of where
> to start.

Hi Mark,

Thanks for your response. This is mostly right - the bulk of our testing has
been on GENERIC, with the KGSSAPI option added for kerberized nfs. I've had
some out-of-band discussions where I learned about KASAN - I built a 14.0p4
kernel this morning with this and started a test a little while ago. Based on
your suggestions, I'll also build a debug kernel so I can do some parallel
testing with this. I also still need to do more testing to rule in or out zfs,
as Rick suggested.

In our experience, these panics can take hours or days to manifest (our high
bar so far is 176 hours!), so it may take a while to get results. :-)

I'll post more here as there's anything to report.

Best,
Matt

>> Thanks in advance for any help!
>>
>> Best,
>> Matt
>>
>> On 2/7/24 6:10 PM, Rick Macklem wrote:
>> >
>> > Well, there is certainly no obvious answer.
>> > One thing I would suggest is setting up a test
>> > server using UFS and see if it panics.
>> >
>> > To be honest, NFS is pretty simple when it comes
>> > to server side reading/writing. It basically translates
>> > NFS RPCs to VOP calls on the underlying file system.
>> > As such, issues are usually either network fabric on one side
>> > (everything from cables to NIC drivers and the TCP stack).
>> > Since you are seeing this with assorted NICs and FreeBSD
>> > versions, I doubt it is network fabric related, but??
>> > This is why I'd suspect the ZFS side and trying UFS would
>> > isolate the problem to ZFS.
>> >
>> > Although I know nothing about hdf5 files, the NFS code does
>> > nothing with the data (it is just a byte stream).
>> > Since ZFS does normally do things like compression, it could be
>> > affected by the data contents.
>> >
>> > Another common (although less common now) problem is TSO
>> > when data of certain sizes (often just less than 64K) is handled.
>> > However, this usually results in hangs and I have never heard
>> > of memory data corruption.
>> >
>> > Bottom line..determining if it is ZFS specific would be the best
>> > first step, I think?
>> >
>> > It would be good to post this to a mailing list like freebsd-current@,
>> > since others might have some insight into this. (Although you are
>> > not using freebsd-current@, that is where most developers read
>> > email.)
>> >
>> > rick
>> >
>> >
>>
FreeBSD panics possibly caused by nfs clients
Good morning all,

Per Rick Macklem's suggestion, I'm posting this query here in the hopes that
others may have ideas.

We did do some minimal testing with ufs around this problem back in August, but
hadn't narrowed the issue down to hdf5 workloads yet, so testing was
inconclusive. We'll do further testing on this to try to rule in or out zfs as
a contributing factor.

I'm happy to provide more technical details about this issue, but it is quite
complex to explain and reproduce.

Thanks in advance for any help!

Best,
Matt

On 2/7/24 6:10 PM, Rick Macklem wrote:
>
> Well, there is certainly no obvious answer.
> One thing I would suggest is setting up a test
> server using UFS and see if it panics.
>
> To be honest, NFS is pretty simple when it comes
> to server side reading/writing. It basically translates
> NFS RPCs to VOP calls on the underlying file system.
> As such, issues are usually either network fabric on one side
> (everything from cables to NIC drivers and the TCP stack).
> Since you are seeing this with assorted NICs and FreeBSD
> versions, I doubt it is network fabric related, but??
> This is why I'd suspect the ZFS side and trying UFS would
> isolate the problem to ZFS.
>
> Although I know nothing about hdf5 files, the NFS code does
> nothing with the data (it is just a byte stream).
> Since ZFS does normally do things like compression, it could be
> affected by the data contents.
>
> Another common (although less common now) problem is TSO
> when data of certain sizes (often just less than 64K) is handled.
> However, this usually results in hangs and I have never heard
> of memory data corruption.
>
> Bottom line..determining if it is ZFS specific would be the best
> first step, I think?
>
> It would be good to post this to a mailing list like freebsd-current@,
> since others might have some insight into this. (Although you are
> not using freebsd-current@, that is where most developers read
> email.)
>
> rick
>
>
> On Wed, Feb 7, 2024 at 10:50 AM Matthew L. Dailey wrote:
>>
>> Hi Rick,
>>
>> My name is Matt Dailey, and (among many other things), I run a FreeBSD
>> file server for the Thayer School of Engineering and the Department of
>> Computer Science here at Dartmouth College. We have run into a very odd
>> issue in which nfs4 Linux clients using hdf5 files are corrupting memory
>> and causing kernel panics on our FreeBSD server. The issue is very
>> complex to describe, and despite our diligent efforts, we have not been
>> able to replicate it in a simple scenario to report to FreeBSD
>> developers. In advance of filing an official bug report, I'm reaching
>> out in the hopes of having a discussion to get your guidance about how
>> best to proceed.
>>
>> The quick background is that we've been running a FreeBSD file server,
>> serving files from a zfs filesystem over kerberized nfs4 and samba, for
>> almost 11 years, through 3 different generations of hardware and from
>> FreeBSD 9.1 up through 13.2. This system has historically been
>> wonderfully stable and robust.
>>
>> Beginning late in 2022, and then more regularly beginning in July of
>> 2023, we started experiencing kernel panics on our current system, then
>> running FreeBSD 13.0. They were seemingly random (mostly trap 12 and
>> trap 9) in random kernel functions, so we initially blamed hardware. We
>> replaced all RAM, moved to backup hardware, and even moved to an older,
>> retired system, and the panics persisted. We have also upgraded from
>> 13.0 to 13.2 and are currently at 13.2p5.
>>
>> After months of investigation, we finally narrowed down that these
>> panics were being caused by software on our Linux clients writing hdf5
>> files over nfs to the FreeBSD server. As near as we can tell from poring
>> through core dumps, something about how this nfs traffic is being
>> processed is corrupting kernel memory, and then eventually a panic
>> happens when some unsuspecting function reads the corrupted memory.
>> Since we have eliminated most known hdf5 workloads on our production
>> system, the panics have mostly ceased, and we suspect that the remaining
>> crashes could be from users still using hdf5.
>>
>> We have reproduced this issue with both krb5 and sys mounts, and up
>> through 13.2p9 and 14.0p4. All our testing has been using nfs 4.1.
>> Depending on conditions, panics sometimes occur within an hour or two,
>> or sometimes can take several days. On 14.0, the panics seem much less
>> prevalent, but still exist. With 13.x, i