Re: Kernel panics with vfs.nfsd.enable_locallocks=1 and nfs clients doing hdf5 file operations

2024-08-21 Thread Matthew L. Dailey
Hi Rick,

Done - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280978

Thanks!

-Matt

On 8/21/24 10:45 AM, Rick Macklem wrote:
> Please create a PR for this and include at least
> one backtrace. I will try and figure out how
> locallocks could cause it.
> 
> I suspect few use locallocks=1.
> 
> rick
> 
> On Wed, Aug 21, 2024 at 7:29 AM Matthew L. Dailey 
> <matthew.l.dai...@dartmouth.edu> wrote:
> 
> Hi all,
> 
> I posted messages to this list back in February and March
> (https://lists.freebsd.org/archives/freebsd-current/2024-February/005546.html)
> regarding kernel panics we were having with nfs clients doing hdf5 file
> operations. After a hiatus in troubleshooting, I had more time this
> summer and have found the cause - the vfs.nfsd.enable_locallocks sysctl.
> 
> When this is set to 1, we can induce either a panic or hung nfs server
> (more rarely) usually within a few hours, but sometimes within several
> days to a week. We have replicated this on 13.0 through 15.0-CURRENT
> (20240725-82283cad12a4-271360). With this set to 0 (default), we are
> unable to replicate the issue, even after several weeks of 24/7 hdf5
> file operations.
> 
> One other side-effect of these panics is that on a few occasions it has
> corrupted the root zpool beyond repair. This makes sense since kernel
> memory is getting corrupted, but obviously makes this issue more
> impactful.
> 
> I'm hoping this is enough information to start narrowing down this
> issue. We are specifically using this sysctl because we are also
> serving
> files via samba and want to ensure consistent locking.
> 
> I have provided some core dumps and backtraces previously, but am happy
> to provide more as needed. I also have a writeup of exactly how to
> reproduce this that I can send directly to anyone who is interested.
> 
> Thanks so much for any and all help with this tricky problem. I'm happy
> to do whatever I can to help get this squashed.
> 
> Best,
> Matt
> 


Kernel panics with vfs.nfsd.enable_locallocks=1 and nfs clients doing hdf5 file operations

2024-08-21 Thread Matthew L. Dailey
Hi all,

I posted messages to this list back in February and March 
(https://lists.freebsd.org/archives/freebsd-current/2024-February/005546.html) 
regarding kernel panics we were having with nfs clients doing hdf5 file 
operations. After a hiatus in troubleshooting, I had more time this 
summer and have found the cause - the vfs.nfsd.enable_locallocks sysctl.

When this is set to 1, we can induce either a panic or hung nfs server 
(more rarely) usually within a few hours, but sometimes within several 
days to a week. We have replicated this on 13.0 through 15.0-CURRENT 
(20240725-82283cad12a4-271360). With this set to 0 (default), we are 
unable to replicate the issue, even after several weeks of 24/7 hdf5 
file operations.

One other side-effect of these panics is that on a few occasions it has 
corrupted the root zpool beyond repair. This makes sense since kernel 
memory is getting corrupted, but obviously makes this issue more impactful.

I'm hoping this is enough information to start narrowing down this 
issue. We are specifically using this sysctl because we are also serving 
files via samba and want to ensure consistent locking.
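
For anyone trying to reproduce this, the sysctl is just set in 
/etc/sysctl.conf (a minimal sketch of the relevant line only; the rest of 
the nfsd and samba configuration is omitted):

# /etc/sysctl.conf
# 1 = nfsd also holds its lock state as local locks so that local
#     consumers such as samba see consistent locking (our reason for it);
# 0 = default, and the setting under which we cannot reproduce the panics
vfs.nfsd.enable_locallocks=1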

I have provided some core dumps and backtraces previously, but am happy 
to provide more as needed. I also have a writeup of exactly how to 
reproduce this that I can send directly to anyone who is interested.

Thanks so much for any and all help with this tricky problem. I'm happy 
to do whatever I can to help get this squashed.

Best,
Matt


Re: FreeBSD panics possibly caused by nfs clients

2024-03-06 Thread Matthew L. Dailey
Posting a few updates on this issue.

I was able to induce a panic on a CURRENT kernel (20240215), built with 
GENERIC-KASAN and running kern.kstack_pages=6 (default) after ~189 
hours. The panic message and backtrace are below - please reach out 
directly if you'd like to have a look at the core. I'm specifically not 
increasing kstack_pages to see what effect this has on the panics.

I have another system running CURRENT (20240215) without any debugging. 
I'm able to regularly panic this (7 panics over two weeks with average 
uptime of ~42 hours) with kstack_pages=4. I set kstack_pages=6, and have 
also induced several panics. Oddly, this seems to happen more quickly (4 
panics over 2 days with average uptime of ~10.5 hours).

Another system running CURRENT (20240208), built with GENERIC-KASAN and 
running kern.kstack_pages=8 has now been running with our hdf5 workload 
non-stop since February 10th with no panics or issues.

From all this, it seems like increasing kstack_pages to 8 eliminates 
the panics, but it obviously doesn't fix their underlying cause. So, 
although this gives us a workaround for our production system, it would 
be better to find and fix the root cause of the panics.
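
For anyone who needs the same workaround in the meantime, kern.kstack_pages 
is a boot-time tunable (a minimal sketch; confirm the running value with 
"sysctl kern.kstack_pages" after the reboot):

# /boot/loader.conf
kern.kstack_pages=8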

I'm going to continue testing to try to generate more cores with 
kstack_pages<8 on the system with the debug kernel.

Any other ideas to try to narrow down the cause are welcome.

Thanks,
Matt

[680940] Kernel page fault with the following non-sleepable locks held:
[680940] exclusive sleep mutex nfs_state_mutex (nfs_state_mutex) r = 0 
(0x830498e0) locked @ /usr/src/sys/fs/nfsserver/nfs_nfsdstate.c:6652
[680940] stack backtrace:
[680940] #0 0x8127958f at witness_debugger+0x13f
[680940] #1 0x8127b114 at witness_warn+0x674
[680940] #2 0x81aba0a6 at trap_pfault+0x116
[680940] #3 0x81ab901c at trap+0x54c
[680940] #4 0x81a75988 at calltrap+0x8
[680940] #5 0x80fb4bfa at nfsrv_freestateid+0x23a
[680940] #6 0x80fd5e3f at nfsrvd_freestateid+0x1df
[680940] #7 0x80f98d35 at nfsrvd_dorpc+0x2585
[680940] #8 0x80fbf588 at nfssvc_program+0x1078
[680940] #9 0x8173fce6 at svc_run_internal+0x1706
[680940] #10 0x8174094b at svc_thread_start+0xb
[680940] #11 0x811137a3 at fork_exit+0xa3
[680940] #12 0x81a769ee at fork_trampoline+0xe
[680940]
[680940]
[680940] Fatal trap 12: page fault while in kernel mode
[680940] cpuid = 3; apic id = 06
[680940] fault virtual address  = 0x7
[680940] fault code = supervisor read data, page not present
[680940] instruction pointer= 0x20:0x80fafd67
[680940] stack pointer  = 0x28:0xfe0153ba2de0
[680940] frame pointer  = 0x28:0xfe0153ba2eb0
[680940] code segment   = base 0x0, limit 0xf, type 0x1b
[680940]= DPL 0, pres 1, long 1, def32 0, gran 1
[680940] processor eflags   = interrupt enabled, resume, IOPL = 0
[680940] current process= 55202 (nfsd: service)
[680940] rdi: 0007 rsi:  rdx: d7c0
[680940] rcx: fe001b9ec1e8  r8: 0012c4350002  r9: 0012c4350002
[680940] rax: fe001b9ec1e8 rbx:  rbp: fe0153ba2eb0
[680940] r10: 0004 r11: 0006 r12: 0007
[680940] r13: fe019cd75700 r14: 1a1a r15: fe019cd75708
[680940] trap number= 12
[680940] panic: page fault
[680940] cpuid = 3
[680940] time = 1709646178
[680940] KDB: stack backtrace:
[680940] db_trace_self_wrapper() at db_trace_self_wrapper+0xa5/frame 
0xfe0153ba2550
[680940] kdb_backtrace() at kdb_backtrace+0xc6/frame 0xfe0153ba26b0
[680940] vpanic() at vpanic+0x210/frame 0xfe0153ba2850
[680940] panic() at panic+0xb5/frame 0xfe0153ba2910
[680940] trap_fatal() at trap_fatal+0x65b/frame 0xfe0153ba2a10
[680940] trap_pfault() at trap_pfault+0x12b/frame 0xfe0153ba2b30
[680940] trap() at trap+0x54c/frame 0xfe0153ba2d10
[680940] calltrap() at calltrap+0x8/frame 0xfe0153ba2d10
[680940] --- trap 0xc, rip = 0x80fafd67, rsp = 
0xfe0153ba2de0, rbp = 0xfe0153ba2eb0 ---
[680940] nfsrv_freelockowner() at nfsrv_freelockowner+0x97/frame 
0xfe0153ba2eb0
[680940] nfsrv_freestateid() at nfsrv_freestateid+0x23a/frame 
0xfe0153ba2f70
[680940] nfsrvd_freestateid() at nfsrvd_freestateid+0x1df/frame 
0xfe0153ba3030
[680940] nfsrvd_dorpc() at nfsrvd_dorpc+0x2585/frame 0xfe0153ba3570
[680940] nfssvc_program() at nfssvc_program+0x1078/frame 0xfe0153ba3970
[680940] svc_run_internal() at svc_run_internal+0x1706/frame 
0xfe0153ba3ee0
[680940] svc_thread_start() at svc_thread_start+0xb/frame 0xfe0153ba3ef0
[680940] fork_exit() at fork_exit+0xa3/frame 0xfe0153ba3f30
[680940] fork_trampoline() at fork_trampoline+0xe/frame 0xfe0153ba3f30
[680940] --- trap 0xc, rip = 0x3b4ff896f0da, rsp = 0x3b4ff6a500e8, rbp = 
0x3b4ff6a50380 ---
[680940] KDB: enter:

Re: FreeBSD panics possibly caused by nfs clients

2024-02-20 Thread Matthew L. Dailey
Hi all,

I induced a panic on my CURRENT (20240215-d79b6b8ec267-268300) VM after 
about 24 hours. This is the one without any debugging, so it only 
confirms the fact that the panics we've been experiencing still exist in 
CURRENT. There was some disk issue that prevented the dump, so all I 
have is the panic, pasted below.

The two test systems with full debugging are still running after a week 
and a half.

> You might want to set
> kern.kstack_pages=6
> in /boot/loader.conf in these setups.
> 
> I would normally expect double faults when a kernel stack is blown,
> but maybe there is a reason that you do not see that for a blown kernel
> stack. (The impact of increasing stack pages from 4->6 should be minimal.)
> 
> rick
Rick - I'm a little confused by the kstack_pages tunable and just want 
to clarify. Are you proposing that this might solve the panic issues 
we've been having, or that it will make the panics/dumps more useful by 
avoiding false positives? We've only ever seen that "double fault" once 
in over 100 observed panics, and that was only when we enabled just 
KASAN on a 14.0p4 system.

-Matt


[85751] Fatal trap 12: page fault while in kernel mode
[85751] cpuid = 3; apic id = 06
[85751] fault virtual address  = 0x4f0f760
[85751] fault code = supervisor read data, page not present
[85751] instruction pointer= 0x20:0x820022f7
[85751] stack pointer  = 0x28:0xfe010bdf8d50
[85751] frame pointer  = 0x28:0xfe010bdf8d80
[85751] code segment   = base 0x0, limit 0xf, type 0x1b
[85751]= DPL 0, pres 1, long 1, def32 0, gran 1
[85751] processor eflags   = interrupt enabled, resume, IOPL = 0
[85751] current process= 0 (z_wr_int_h_3)
[85751] rdi: f802d1036900 rsi: f80416887300 rdx: f80416887380
[85751] rcx: f802d1036908  r8: 0100  r9: 8013070f000700ff
[85751] rax: 04f0f748 rbx: f802d1036900 rbp: fe010bdf8d80
[85751] r10: f80412c4f708 r11:  r12: f8000944ed58
[85751] r13:  r14: 04f0f748 r15: fe010caa9438
[85751] trap number= 12
[85751] panic: page fault
[85751] cpuid = 3
[85751] time = 1708451091
[85751] KDB: stack backtrace:
[85751] #0 0x80b9803d at kdb_backtrace+0x5d
[85751] #1 0x80b4a8d5 at vpanic+0x135
[85751] #2 0x80b4a793 at panic+0x43
[85751] #3 0x81026b8f at trap_fatal+0x40f
[85751] #4 0x81026bdf at trap_pfault+0x4f
[85751] #5 0x80ffd9f8 at calltrap+0x8
[85751] #6 0x81fea83b at dmu_sync_late_arrival_done+0x6b
[85751] #7 0x8214a78e at zio_done+0xc6e
[85751] #8 0x821442cc at zio_execute+0x3c
[85751] #9 0x80bae402 at taskqueue_run_locked+0x182
[85751] #10 0x80baf692 at taskqueue_thread_loop+0xc2
[85751] #11 0x80b0484f at fork_exit+0x7f
[85751] #12 0x80ffea5e at fork_trampoline+0xe
[85751] Uptime: 23h49m11s


Re: FreeBSD panics possibly caused by nfs clients

2024-02-19 Thread Matthew L. Dailey
Hi all,

I finally induced a panic on a "pure" ufs system - root and the exported 
filesystem were both ufs - which I think definitively rules out zfs as 
a source of the issue.

This panic was on 14.0p5 without debugging options, so the core may not 
be helpful. The panic and backtrace are below in case they're 
interesting to anyone.

Next, I'm going to try a CURRENT kernel without debugging options 
enabled just to see if I can finally induce a panic here. My other two 
VMs running CURRENT with full debugging are still clanking along.

-Matt

[218716] Fatal trap 12: page fault while in kernel mode
[218716] cpuid = 4; apic id = 08
[218716] fault virtual address  = 0x10017
[218716] fault code = supervisor read data, page not present
[218716] instruction pointer= 0x20:0x80e9165d
[218716] stack pointer  = 0x28:0xfe010b5aa3b0
[218716] frame pointer  = 0x28:0xfe010b5aa400
[218716] code segment   = base 0x0, limit 0xf, type 0x1b
[218716]= DPL 0, pres 1, long 1, def32 0, gran 1
[218716] processor eflags   = interrupt enabled, resume, IOPL = 0
[218716] current process= 49575 (nfsd: service)
[218716] rdi:  rsi: f800038ec900 rdx: fe00d9326000
[218716] rcx: 00030eb0  r8:   r9: fe010b5aa410
[218716] rax: 008f0eb0 rbx: f8038ac4cd00 rbp: fe010b5aa400
[218716] r10:  r11:  r12: 
[218716] r13: f80003647c00 r14: f802f9dced00 r15: f800038ec900
[218716] trap number= 12
[218716] panic: page fault
[218716] cpuid = 4
[218716] time = 1708319487
[218716] KDB: stack backtrace:
[218716] #0 0x80b9309d at kdb_backtrace+0x5d
[218716] #1 0x80b461a2 at vpanic+0x132
[218716] #2 0x80b46063 at panic+0x43
[218716] #3 0x8101d85c at trap_fatal+0x40c
[218716] #4 0x8101d8af at trap_pfault+0x4f
[218716] #5 0x80ff3fe8 at calltrap+0x8
[218716] #6 0x80e8716e at newdirrem+0x8be
[218716] #7 0x80e866fa at softdep_setup_remove+0x1a
[218716] #8 0x80ea71af at ufs_dirremove+0x21f
[218716] #9 0x80ead4f4 at ufs_remove+0xb4
[218716] #10 0x810f1428 at VOP_REMOVE_APV+0x28
[218716] #11 0x80a60db4 at nfsvno_removesub+0xc4
[218716] #12 0x80a52699 at nfsrvd_remove+0x1b9
[218716] #13 0x80a374d4 at nfsrvd_dorpc+0x1854
[218716] #14 0x80a4e76f at nfssvc_program+0x82f
[218716] #15 0x80e34080 at svc_run_internal+0xb50
[218716] #16 0x80e3475b at svc_thread_start+0xb
[218716] #17 0x80b00b7f at fork_exit+0x7f
[218716] Uptime: 2d12h45m16s
[218716] Dumping 985 out of 16350 
MB:..2%..12%..22%..31%..41%..51%..61%..72%..82%..91%


#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=) at 
/usr/src/sys/kern/kern_shutdown.c:405
#2  0x80b45d37 in kern_reboot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:526
#3  0x80b4620f in vpanic (fmt=0x81147c9c "%s",
 ap=ap@entry=0xfe010b5aa200) at 
/usr/src/sys/kern/kern_shutdown.c:970
#4  0x80b46063 in panic (fmt=)
 at /usr/src/sys/kern/kern_shutdown.c:894
#5  0x8101d85c in trap_fatal (frame=0xfe010b5aa2f0, 
eva=4294967319)
 at /usr/src/sys/amd64/amd64/trap.c:952
#6  0x8101d8af in trap_pfault (frame=0xfe010b5aa2f0,
 usermode=false, signo=, ucode=)
 at /usr/src/sys/amd64/amd64/trap.c:760
#7  
#8  cancel_diradd (dap=0xf8038ac4cd00,
 dirrem=dirrem@entry=0xf800038ec900,
 jremref=jremref@entry=0xf802f9dced00, 
dotremref=dotremref@entry=0x0,
 dotdotremref=dotdotremref@entry=0x0)
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:9028
#9  0x80e8716e in newdirrem (bp=,
 dp=dp@entry=0xf800037fea80, ip=ip@entry=0xf8006b3b9300,
 isrmdir=isrmdir@entry=0, 
prevdirremp=prevdirremp@entry=0xfe010b5aa4b0)
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:9480
#10 0x80e866fa in softdep_setup_remove (bp=0x,
 dp=0xf800038ec900, dp@entry=0xf800037fea80, 
ip=0xfe00d9326000,
 ip@entry=0xf8006b3b9300, isrmdir=200368, isrmdir@entry=0)
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:9176
#11 0x80ea71af in ufs_dirremove (dvp=dvp@entry=0xf801f764be00,
 ip=ip@entry=0xf8006b3b9300, flags=,
 isrmdir=isrmdir@entry=0) at /usr/src/sys/ufs/ufs/ufs_lookup.c:1198
#12 0x80ead4f4 in ufs_remove (ap=0xfe010b5aa5d8)
 at /usr/src/sys/ufs/ufs/ufs_vnops.c:1054
#13 0x810f1428 in VOP_REMOVE_APV (
 vop=0x8172f2d0 , a=a@entry=0xfe010b5aa5d8)
 at vnode_if.c:1534
#14 0x80a60db4 in VOP_REMOVE (dvp=0x8f0eb0, vp=0xf800539b7380,
 cnp=0x30eb0) at ./vnode_if.h:789
#15 nfsvno_removesub (ndp=0xfe010b5aa858, is_v4=,
 cred=, p=p@entry=0xfe010ae803a0,
 exp=exp@entry=0xfe010b5aaa88)
 at /usr/src/sys/fs/nfsserver/nfs_nfsdpor

Re: FreeBSD panics possibly caused by nfs clients

2024-02-16 Thread Matthew L. Dailey
Hi all,

Before the week was out, I wanted to provide an update on this issue.

Last weekend, I installed two VMs with CURRENT
(20240208-82bebc793658-268105) - one on zfs and one on ufs - and built a 
kernel with this config file:
include GENERIC
ident   THAYER-FULLDEBUG
makeoptions DEBUG=-g
options KASAN
options DDB
options INVARIANT_SUPPORT
options INVARIANTS
options QUEUE_MACRO_DEBUG_TRASH
options WITNESS
options WITNESS_SKIPSPIN
options KGSSAPI

I'm also setting these in loader.conf:
debug.witness.watch=1
debug.witness.kdb=1
kern.kstack_pages=8
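
Building and installing it is just the standard handbook procedure (a 
sketch, assuming the config above is saved as 
/usr/src/sys/amd64/conf/THAYER-FULLDEBUG):

cd /usr/src
make buildkernel KERNCONF=THAYER-FULLDEBUG
make installkernel KERNCONF=THAYER-FULLDEBUG
shutdown -r now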

These two VMs have been running non-stop with our hdf5 workload without 
a panic for 146 hours and 122 hours, respectively. This might be good 
news, but those uptimes are still well within the time-to-panic range 
we've seen in our testing over the past 6 months. Given that all the 
debug kernel options slow things down significantly, these could just 
be taking a long while to panic.

I also have another VM with our "standard" 14.0p5 kernel (GENERIC with 
KGSSAPI enabled) running on ufs to try to rule in or out zfs. This 
failed this morning, but not with a panic. In this case, nfs stopped 
responding. This is a failure mode we have seen in our testing, but is 
much rarer than a full panic. I intend to continue testing this to try 
to induce a panic, at which point I think we can rule out zfs as a 
potential cause.

Just so it's documented, since I started experimenting with kernel debug 
options last week, I have so far induced panics with the following:
- 13.2p9 kernel on hardware (only WITNESS enabled)
- 14.0p4 kernel on VM (only KASAN enabled)
- 13.2p9 kernel on hardware (all debug options above except KASAN)

My plan right now is to continue running my two test VMs with CURRENT to 
see if it's just taking a long time to panic. Once I have finished my 
ufs testing on the third VM, I will build a GENERIC kernel for CURRENT 
(no debug options, only KGSSAPI) and test against that to see if the 
actual debug instrumentation is interfering with reproducing this issue.
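
For reference, that no-debug config would be little more than the 
following (a sketch; the ident name here is made up):

include GENERIC
ident   CURRENT-KGSSAPI
options KGSSAPI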

Please reach out if you have ideas or suggestions. I'll provide updates 
here when I have them.

Thanks,
Matt


Re: FreeBSD panics possibly caused by nfs clients

2024-02-09 Thread Matthew L. Dailey


On 2/9/24 4:18 PM, Mark Johnston wrote:
> 
> On Fri, Feb 09, 2024 at 06:23:08PM +, Matthew L. Dailey wrote:
>> I had my first kernel panic with a KASAN kernel after only 01:27. This
>> first panic was a "double fault," which isn't anything we've seen
>> previously - usually we've seen trap 9 or trap 12, but sometimes others.
>> Based on the backtrace, it definitely looks like KASAN caught something,
>> but I don't have the expertise to know if this points to anything
>> specific. From the backtrace, it looks like this might have originated
>> in ipfw code.
> 
> A double fault is rather unexpected.  I presume you're running
> releng/14.0?  Is it at all possible to test with FreeBSD-CURRENT?

This is just 14.0-RELEASE, updated to p4. Haven't played with CURRENT 
before, but I'll give it a shot.

> 
> Did you add INVARIANTS etc. to the kernel configuration used here, or
> just KASAN?
> 

This just had KASAN. Are you suggesting that I only add INVARIANTS 
(along with KASAN), or all the various debug bits:
options DDB
options INVARIANT_SUPPORT
options INVARIANTS
options QUEUE_MACRO_DEBUG_TRASH
options WITNESS
options WITNESS_SKIPSPIN

>> Please let me know what other info I can provide or what I can do to dig
>> deeper.
> 
> If you could repeat the test several times, I'd be interested in seeing
> if you always get the same result.  If you're willing to share the
> vmcore (or several), I'd be willing to take a look at it.
> 

We'll see how things go with testing, and I'll pass along cores directly 
if it makes sense. There shouldn't be anything sensitive that we care about.

>> Thanks!!
>>
>> Panic message:
>> [5674] Fatal double fault
>> [5674] rip 0x812f6e32 rsp 0xfe014677afe0 rbp 0xfe014677b430
>> [5674] rax 0x1fc028cef620 rdx 0xf2f2f2f8f2f2f2f2 rbx 0x1
>> [5674] rcx 0xd7c0 rsi 0xfe004086a4a0 rdi 0xf8f8f8f8f2f2f2f8
>> [5674] r8 0xf8f8f8f8f8f8f8f8 r9 0x162a r10 0x835003002d3a64e1
>> [5674] r11 0 r12 0xf78028cef620 r13 0xfe004086a440
>> [5674] r14 0xfe01488c0560 r15 0x26f40 rflags 0x10006
>> [5674] cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
>> [5674] fsbase 0x95d1d81a130 gsbase 0x84a14000 kgsbase 0
>> [5674] cpuid = 4; apic id = 08
>> [5674] panic: double fault
>> [5674] cpuid = 4
>> [5674] time = 1707498420
>> [5674] KDB: stack backtrace:
>> [5674] Uptime: 1h34m34s
>>
>> Backtrace:
>> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
>> #1  doadump (textdump=) at
>> /usr/src/sys/kern/kern_shutdown.c:405
>> #2  0x8128b7dc in kern_reboot (howto=howto@entry=260)
>>   at /usr/src/sys/kern/kern_shutdown.c:526
>> #3  0x8128c000 in vpanic (
>>   fmt=fmt@entry=0x82589a00  "double fault",
>>   ap=ap@entry=0xfe0040866de0) at
>> /usr/src/sys/kern/kern_shutdown.c:970
>> #4  0x8128bd75 in panic (fmt=0x82589a00  "double
>> fault")
>>   at /usr/src/sys/kern/kern_shutdown.c:894
>> #5  0x81c4b335 in dblfault_handler (frame=)
>>   at /usr/src/sys/amd64/amd64/trap.c:1012
>> #6  
>> #7  0x812f6e32 in sched_clock (td=td@entry=0xfe01488c0560,
>>   cnt=cnt@entry=1) at /usr/src/sys/kern/sched_ule.c:2601
>> #8  0x8119e2a7 in statclock (cnt=cnt@entry=1,
>>   usermode=usermode@entry=0) at /usr/src/sys/kern/kern_clock.c:760
>> #9  0x8119fb67 in handleevents (now=now@entry=24371855699832,
>>   fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:195
>> #10 0x811a10cc in timercb (et=, arg=)
>>   at /usr/src/sys/kern/kern_clocksource.c:353
>> #11 0x81dcd280 in lapic_handle_timer (frame=0xfe014677b750)
>>   at /usr/src/sys/x86/x86/local_apic.c:1343
>> #12 
>> #13 __asan_load8_noabort (addr=18446741880219689232)
>>   at /usr/src/sys/kern/subr_asan.c:1113
>> #14 0x851488b8 in ?? () from /boot/thayer/ipfw.ko
>> #15 0xfe01 in ?? ()
>> #16 0x8134dcd5 in pcpu_find (cpuid=1238425856)
>>   at /usr/src/sys/kern/subr_pcpu.c:286
>> #17 0x85151f6f in ?? () from /boot/thayer/ipfw.ko
>> #18 0x in ?? ()


Re: FreeBSD panics possibly caused by nfs clients

2024-02-09 Thread Matthew L. Dailey
I had my first kernel panic with a KASAN kernel after only 01:27. This 
first panic was a "double fault," which isn't anything we've seen 
previously - usually we've seen trap 9 or trap 12, but sometimes others. 
Based on the backtrace, it definitely looks like KASAN caught something, 
but I don't have the expertise to know if this points to anything 
specific. From the backtrace, it looks like this might have originated 
in ipfw code.

Please let me know what other info I can provide or what I can do to dig 
deeper.

Thanks!!

Panic message:
[5674] Fatal double fault
[5674] rip 0x812f6e32 rsp 0xfe014677afe0 rbp 0xfe014677b430
[5674] rax 0x1fc028cef620 rdx 0xf2f2f2f8f2f2f2f2 rbx 0x1
[5674] rcx 0xd7c0 rsi 0xfe004086a4a0 rdi 0xf8f8f8f8f2f2f2f8
[5674] r8 0xf8f8f8f8f8f8f8f8 r9 0x162a r10 0x835003002d3a64e1
[5674] r11 0 r12 0xf78028cef620 r13 0xfe004086a440
[5674] r14 0xfe01488c0560 r15 0x26f40 rflags 0x10006
[5674] cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
[5674] fsbase 0x95d1d81a130 gsbase 0x84a14000 kgsbase 0
[5674] cpuid = 4; apic id = 08
[5674] panic: double fault
[5674] cpuid = 4
[5674] time = 1707498420
[5674] KDB: stack backtrace:
[5674] Uptime: 1h34m34s

Backtrace:
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=) at 
/usr/src/sys/kern/kern_shutdown.c:405
#2  0x8128b7dc in kern_reboot (howto=howto@entry=260)
 at /usr/src/sys/kern/kern_shutdown.c:526
#3  0x8128c000 in vpanic (
 fmt=fmt@entry=0x82589a00  "double fault",
 ap=ap@entry=0xfe0040866de0) at 
/usr/src/sys/kern/kern_shutdown.c:970
#4  0x8128bd75 in panic (fmt=0x82589a00  "double 
fault")
 at /usr/src/sys/kern/kern_shutdown.c:894
#5  0x81c4b335 in dblfault_handler (frame=)
 at /usr/src/sys/amd64/amd64/trap.c:1012
#6  
#7  0x812f6e32 in sched_clock (td=td@entry=0xfe01488c0560,
 cnt=cnt@entry=1) at /usr/src/sys/kern/sched_ule.c:2601
#8  0x8119e2a7 in statclock (cnt=cnt@entry=1,
 usermode=usermode@entry=0) at /usr/src/sys/kern/kern_clock.c:760
#9  0x8119fb67 in handleevents (now=now@entry=24371855699832,
 fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:195
#10 0x811a10cc in timercb (et=, arg=)
 at /usr/src/sys/kern/kern_clocksource.c:353
#11 0x81dcd280 in lapic_handle_timer (frame=0xfe014677b750)
 at /usr/src/sys/x86/x86/local_apic.c:1343
#12 
#13 __asan_load8_noabort (addr=18446741880219689232)
 at /usr/src/sys/kern/subr_asan.c:1113
#14 0x851488b8 in ?? () from /boot/thayer/ipfw.ko
#15 0xfe01 in ?? ()
#16 0x8134dcd5 in pcpu_find (cpuid=1238425856)
 at /usr/src/sys/kern/subr_pcpu.c:286
#17 0x85151f6f in ?? () from /boot/thayer/ipfw.ko
#18 0x in ?? ()


Re: FreeBSD panics possibly caused by nfs clients

2024-02-09 Thread Matthew L. Dailey

On 2/9/24 11:04 AM, Mark Johnston wrote:
> 
> On Thu, Feb 08, 2024 at 03:34:52PM +, Matthew L. Dailey wrote:
>> Good morning all,
>>
>> Per Rick Macklem's suggestion, I'm posting this query here in the hopes
>> that others may have ideas.
>>
>> We did do some minimal testing with ufs around this problem back in
>> August, but hadn't narrowed the issue down to hdf5 workloads yet, so
>> testing was inconclusive. We'll do further testing on this to try to
>> rule in or out zfs as a contributing factor.
>>
>> I'm happy to provide more technical details about this issue, but it is
>> quite complex to explain and reproduce.
> 
> It sounds like you've so far only tested with release kernels, is that
> right?  To make some progress and narrow this down a bit, we'd want to
> see testing results from a debug kernel.  Unfortunately we don't ship
> pre-compiled debug kernels with releases, so you'll have to compile your
> own, or test a snapshot of the development branch.  To do the former,
> add the following lines to /usr/src/sys/amd64/conf/GENERIC and follow
> the steps here: 
> https://docs.freebsd.org/en/books/handbook/kernelconfig/#kernelconfig-building
> 
> options DDB
> options INVARIANT_SUPPORT
> options INVARIANTS
> options QUEUE_MACRO_DEBUG_TRASH
> options WITNESS
> options WITNESS_SKIPSPIN
> 
> Since the problem appears to be some random memory corruption, I'd also
> suggest using a KASAN(9) kernel in your test VM.  If the root cause of
> the crashes is some kind of use-after-free or buffer overflow, KASAN has
> a good chance of catching it.  Note that both debug kernels and KASAN
> kernels are significantly slower than release kernels, so you'll
> want to deploy these in your test VM.  Once you do, and a panic occurs,
> share the panic message and backtrace here, and we'll have a better idea
> of where to start.
> 

Hi Mark,

Thanks for your response. This is mostly right - the bulk of our testing 
has been on GENERIC, with the KGSSAPI option added for kerberized nfs.

I've had some out-of-band discussions where I learned about KASAN - I 
built a 14.0p4 kernel this morning with this and started a test a little 
while ago.
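
For reference, the KASAN-only config amounts to little more than this (a 
sketch; the ident name is made up):

include GENERIC
ident   THAYER-KASAN
options KASAN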

Based on your suggestions, I'll also build a debug kernel so I can do 
some parallel testing with this. I also still need to do more testing to 
rule in or out zfs as Rick suggested.

In our experience, these panics can take hours or days to manifest (our 
high bar so far is 176 hours!), so it may take a while to get results. :-)

I'll post more here as there's anything to report.

Best,
Matt

>> Thanks in advance for any help!
>>
>> Best,
>> Matt
>>
>> On 2/7/24 6:10 PM, Rick Macklem wrote:
>>   >
>>   > Well, there is certainly no obvious answer.
>>   > One thing I would suggest is setting up a test
>>   > server using UFS and see if it panics.
>>   >
>>   > To be honest, NFS is pretty simple when it comes
>>   > to server side reading/writing. It basically translates
>>   > NFS RPCs to VOP calls on the underlying file system.
>>   > As such, issues are usually either network fabric on one side
>>   > (everything from cables to NIC drivers and the TCP stack).
>>   > Since you are seeing this with assorted NICs and FreeBSD
>>   > versions, I doubt it is network fabric related, but??
>>   > This is why I'd suspect the ZFS side and trying UFS would
>>   > isolate the problem to ZFS.
>>   >
>>   > Although I know nothing about hdf5 files, the NFS code does
>>   > nothing with the data (it is just a byte stream).
>>   > Since ZFS does normally do things like compression, it could be
>>   > affected by the data contents.
>>   >
>>   > Another common (although less common now) problem is TSO
>>   > when data of certain sizes (often just less than 64K) is handled.
>>   > However, this usually results in hangs and I have never heard
>>   > of memory data corruption.
>>   >
>>   > Bottom line..determining if it is ZFS specific would be the best
>>   > first step, I think?
>>   >
>>   > It would be good to post this to a mailing list like freebsd-current@,
>>   > since others might have some insight into this. (Although you are
>>   > not using freebsd-current@, that is where most developers read
>>   > email.)
>>   >
>>   > rick
>>   >
>>   >
>>

FreeBSD panics possibly caused by nfs clients

2024-02-08 Thread Matthew L. Dailey
Good morning all,

Per Rick Macklem's suggestion, I'm posting this query here in the hopes 
that others may have ideas.

We did do some minimal testing with ufs around this problem back in 
August, but hadn't narrowed the issue down to hdf5 workloads yet, so 
testing was inconclusive. We'll do further testing on this to try to 
rule in or out zfs as a contributing factor.

I'm happy to provide more technical details about this issue, but it is 
quite complex to explain and reproduce.

Thanks in advance for any help!

Best,
Matt

On 2/7/24 6:10 PM, Rick Macklem wrote:
 >
 > Well, there is certainly no obvious answer.
 > One thing I would suggest is setting up a test
 > server using UFS and see if it panics.
 >
 > To be honest, NFS is pretty simple when it comes
 > to server side reading/writing. It basically translates
 > NFS RPCs to VOP calls on the underlying file system.
 > As such, issues are usually either network fabric on one side
 > (everything from cables to NIC drivers and the TCP stack).
 > Since you are seeing this with assorted NICs and FreeBSD
 > versions, I doubt it is network fabric related, but??
 > This is why I'd suspect the ZFS side and trying UFS would
 > isolate the problem to ZFS.
 >
 > Although I know nothing about hdf5 files, the NFS code does
 > nothing with the data (it is just a byte stream).
 > Since ZFS does normally do things like compression, it could be
 > affected by the data contents.
 >
 > Another common (although less common now) problem is TSO
 > when data of certain sizes (often just less than 64K) is handled.
 > However, this usually results in hangs and I have never heard
 > of memory data corruption.
 >
 > Bottom line..determining if it is ZFS specific would be the best
 > first step, I think?
 >
 > It would be good to post this to a mailing list like freebsd-current@,
 > since others might have some insight into this. (Although you are
 > not using freebsd-current@, that is where most developers read
 > email.)
 >
 > rick
 >
 >
 > On Wed, Feb 7, 2024 at 10:50 AM Matthew L. Dailey wrote:
 >>
 >>
 >> Hi Rick,
 >>
 >> My name is Matt Dailey, and (among many other things), I run a FreeBSD
 >> file server for the Thayer School of Engineering and the Department of
 >> Computer Science here at Dartmouth College. We have run into a very odd
 >> issue in which nfs4 Linux clients using hdf5 files are corrupting memory
 >> and causing kernel panics on our FreeBSD server. The issue is very
 >> complex to describe, and despite our diligent efforts, we have not been
 >> able to replicate it in a simple scenario to report to FreeBSD
 >> developers. In advance of filing an official bug report, I’m reaching
 >> out in the hopes of having a discussion to get your guidance about how
 >> best to proceed.
 >>
 >> The quick background is that we’ve been running a FreeBSD file server,
 >> serving files from a zfs filesystem over kerberized nfs4 and samba for
 >> almost 11 years, through 3 different generations of hardware and from
 >> FreeBSD 9.1 up through 13.2. This system has historically been
 >> wonderfully stable and robust.
 >>
 >> Beginning late in 2022, and then more regularly beginning in July of
 >> 2023, we started experiencing kernel panics on our current system, then
 >> running FreeBSD 13.0. They were seemingly random (mostly trap 12 and
 >> trap 9) in random kernel functions, so we initially blamed hardware. We
 >> replaced all RAM, moved to backup hardware, and even moved to an older,
 >> retired system and the panics persisted. We have also upgraded from 13.0
 >> to 13.2 and are currently at 13.2p5.
 >>
 >> After months of investigation, we finally narrowed down that these
 >> panics were being caused by software on our Linux clients writing hdf5
 >> files over nfs to the FreeBSD server. As near as we can tell from poring
 >> through core dumps, something about how this nfs traffic is being
 >> processed is corrupting kernel memory and then eventually a panic is
 >> happening when some unsuspecting function reads the corrupted memory.
 >> Since we have eliminated most known hdf5 workloads on our production
 >> system, the panics have mostly ceased, and we suspect that the remaining
 >> crashes could be from users still using hdf5.
 >>
 >> We have reproduced this issue with both krb5 and sys mounts, and up
 >> through 13.2p9 and 14.0p4. All our testing has been using nfs 4.1.
 >> Depending on conditions, panics sometimes occur within an hour or two,
>> or sometimes can take several days. On 14.0, the panics seem much less
 >> prevalent, but still exist. With 13.x, i