smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore
This weekend I upgraded my FreeBSD laptop and kicked off a poudriere build of the packages I use. My laptop kept "freezing" during the package builds however. Initially due to messages in /var/log/messages I thought it was running out of swap and killing the display server. After poking it at off and on over the weekend I finally narrowed it down to building the devel/apr1 port, and built it on the console (rather than X) and was greeted with the following panic: panic: malloc(M_WAITOK) with sleeping prohibited cpuid = 7 time = 1639374072 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe001e5b55b0 vpanic() at vpanic+0x17f/frame 0xfe001e5b5600 panic() at panic+0x43/frame 0xfe001e5b5660 malloc_dbg() at malloc_dbg+0xd4/frame 0xfe001e5b5680 malloc() at malloc+0x2d/frame 0xfe001e5b56c0 intel_atomic_state_alloc() at intel_atomic_state_alloc+0x20/frame 0xfe001e5b56e0 drm_client_modeset_commit_atomic() at drm_client_modeset_commit_atomic+0x30/frame 0xfe001e5b5750 drm_client_modeset_commit_force() at drm_client_modeset_commit_force+0x6f/frame 0xfe001e5b5790 drm_fb_helper_restore_fbdev_mode_unlocked() at drm_fb_helper_restore_fbdev_mode_unlocked+0x82/frame 0xfe001e5b57c0 vt_kms_postswitch() at vt_kms_postswitch+0x18b/frame 0xfe001e5b57f0 vt_window_switch() at vt_window_switch+0x261/frame 0xfe001e5b5830 vtterm_cngrab() at vtterm_cngrab+0x4f/frame 0xfe001e5b5850 cngrab() at cngrab+0x26/frame 0xfe001e5b5870 vpanic() at vpanic+0xee/frame 0xfe001e5b58c0 panic() at panic+0x43/frame 0xfe001e5b5920 witness_checkorder() at witness_checkorder+0xd1c/frame 0xfe001e5b5ae0 __mtx_lock_flags() at __mtx_lock_flags+0x94/frame 0xfe001e5b5b30 prison_check_ip4() at prison_check_ip4+0x51/frame 0xfe001e5b5b60 in_pcblookup_hash_locked() at in_pcblookup_hash_locked+0x2b6/frame 0xfe001e5b5bc0 in_pcblookup_mbuf() at in_pcblookup_mbuf+0x84/frame 0xfe001e5b5c00 tcp_input_with_port() at tcp_input_with_port+0x635/frame 0xfe001e5b5d50 tcp_input() at tcp_input+0xb/frame 0xfe001e5b5d60 ip_input() at ip_input+0x25e/frame 0xfe001e5b5de0 swi_net() at swi_net+0x1a1/frame 0xfe001e5b5e60 ithread_loop() at ithread_loop+0x279/frame 0xfe001e5b5ef0 fork_exit() at fork_exit+0x80/frame 0xfe001e5b5f30 fork_trampoline() at fork_trampoline+0xe/frame 0xfe001e5b5f30 --- trap 0x61e8cb8b, rip = 0x8b4800f890ff, rsp = 0x52ff38244c8d4800, rbp = 0x245c8948ccc35f20 --- So there are two things here. The root issue is that the devel/apr1 port runs a configure test for TCP_NDELAY being inherited by accepted sockets. This test panics because prison_check_ip4() tries to lock a prison mutex to walk the IPs assigned to a jail, but the caller (in_pcblookup_hash()) has done an smr_enter() which is a critical_enter(): (kgdb) p panicstr $1 = 0x81ea90b0 "acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) jail mutex @ /usr/src/sys/netinet/in_jail.c:418" (kgdb) frame 39 #39 0x80dbcf71 in prison_check_ip4 (cred=, ia=ia@entry=0xfe001e5b5b90) at /usr/src/sys/netinet/in_jail.c:418 418 mtx_lock(&pr->pr_mtx); (kgdb) l 413 KASSERT(ia != NULL, ("%s: ia is NULL", __func__)); 414 415 pr = cred->cr_prison; 416 if (!(pr->pr_flags & PR_IP4)) 417 return (0); 418 mtx_lock(&pr->pr_mtx); 419 if (!(pr->pr_flags & PR_IP4)) { 420 mtx_unlock(&pr->pr_mtx); 421 return (0); 422 } (kgdb) up #41 0x80dc5cb4 in in_pcblookup_hash (pcbinfo=0xfe0022db7748, faddr=..., fport=2166892021, laddr=..., lport=0, lookupflags=, numa_domain=56 '8', ifp=) at /usr/src/sys/netinet/in_pcb.c:2387 2387inp = in_pcblookup_hash_locked(pcbinfo, faddr, fport, laddr, lport, (kgdb) l 2382struct ifnet *ifp, uint8_t numa_domain) 2383{ 2384struct inpcb *inp; 2385 2386smr_enter(pcbinfo->ipi_smr); 2387inp = in_pcblookup_hash_locked(pcbinfo, faddr, fport, laddr, lport, 2388lookupflags & INPLOOKUP_WILDCARD, ifp, numa_domain); 2389if (inp != NULL) { 2390if (__predict_false(inp_smr_lock(inp, 2391(lookupflags & INPLOOKUP_LOCKMASK)) == false)) However, it was a bit harder to see this originally as the 915kms driver tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a bad idea). When it panicked in X the result was that the screen just froze on whatever it had most recently drawn and the machine looked hung. (The fact that that sysbeep is off so I couldn't tell if typing in commands was doing anything vs emitting errors probably didn't improve trying to diagnose the hang as "sitting in ddb" initially, though I don't know if DDB itself emi
RFC: What to do about Allocate in the NFS server for FreeBSD13?
Hi, There are two problems with Allocate in the NFSv4.2 server in FreeBSD13: 1 - It uses the thread credentials instead of the ones for the RPC. 2 - It does not ensure that file changes are committed to stable storage. These problems are fixed by commit f0c9847a6c47 in main, which added ioflag and cred arguments to VOP_ALLOCATE(). I can think of 3 ways to fix Allocate in FreeBSD13: 1 - Apply a *hackish* patch like this: + savcred = p->td_ucred; + td->td_ucred = cred; do { olen = len; error = VOP_ALLOCATE(vp, &off, &len); if (error == 0 && len > 0 && olen > len) maybe_yield(); } while (error == 0 && len > 0 && olen > len); + p->td_ucred = savcred; if (error == 0 && len > 0) error = NFSERR_IO; + if (error == 0) + error = VOP_FSYNC(vp, MNT_WAIT, p); The worst part of it is temporarily setting td_ucred to cred. 2 - MFC'ng commit f0c9847a6c47. Normally changes to the VOP/VFS are not MFC'd. However, in this case, it might be ok to do so, since it is unlikely there is an out of source tree file system with a custom VOP_ALLOCATE() method? 3 - Just disable it. Currently it is disabled by default and it could just be wired down disabled. Allocate is not that useful, since ZFS does not support it. As an aside to this, the IETF NFSv4 working group seems to have agreed that NFS4ERR_NOTSUPP can be returned by a NFSv4.2 server on a 'per file system basis" instead of globally, since the Linux knfsd already does this. --> As such, Allocate can be enabled by default in main and could be enabled by default in FreeBSD13, if #1 or #2 was done. --> It still would not work for ZFS exports. So, what do you think is the preferred alternative? rick
Re: RFC: What to do about Allocate in the NFS server for FreeBSD13?
On Mon, Dec 13, 2021 at 04:26:42PM +, Rick Macklem wrote: > Hi, > > There are two problems with Allocate in the NFSv4.2 server in FreeBSD13: > 1 - It uses the thread credentials instead of the ones for the RPC. > 2 - It does not ensure that file changes are committed to stable storage. > These problems are fixed by commit f0c9847a6c47 in main, which added > ioflag and cred arguments to VOP_ALLOCATE(). > > I can think of 3 ways to fix Allocate in FreeBSD13: > 1 - Apply a *hackish* patch like this: > + savcred = p->td_ucred; > + td->td_ucred = cred; > do { > olen = len; > error = VOP_ALLOCATE(vp, &off, &len); > if (error == 0 && len > 0 && olen > len) > maybe_yield(); > } while (error == 0 && len > 0 && olen > len); > + p->td_ucred = savcred; > if (error == 0 && len > 0) > error = NFSERR_IO; > + if (error == 0) > + error = VOP_FSYNC(vp, MNT_WAIT, p); > The worst part of it is temporarily setting td_ucred to cred. > > 2 - MFC'ng commit f0c9847a6c47. Normally changes to the > VOP/VFS are not MFC'd. However, in this case, it might be > ok to do so, since it is unlikely there is an out of source tree > file system with a custom VOP_ALLOCATE() method? I do not see much wrong with #2, this is what I would do myself. > > 3 - Just disable it. Currently it is disabled by default and it > could just be wired down disabled. > Allocate is not that useful, since ZFS does not support it. > > As an aside to this, the IETF NFSv4 working group seems to > have agreed that NFS4ERR_NOTSUPP can be returned by a > NFSv4.2 server on a 'per file system basis" instead of globally, > since the Linux knfsd already does this. > --> As such, Allocate can be enabled by default in main and >could be enabled by default in FreeBSD13, if #1 or #2 was >done. >--> It still would not work for ZFS exports. > > So, what do you think is the preferred alternative? > > rick >
Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore
On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote: > ... > However, it was a bit harder to see this originally as the 915kms driver > tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which > recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a > bad idea). Funny how these new Linuxish DRM bits could affect so many things. :( > The fact that that sysbeep is off so I couldn't tell if typing in commands > was doing anything vs emitting errors probably didn't improve trying to > diagnose the hang as "sitting in ddb" initially, though I don't know if > DDB itself emits a beep for invalid commands, etc. Now that Warner had fixed the beeper frequency, why we still didn't enable it back on by default? ./danfe
Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore
On Mon, 13 Dec 2021 17:01:43 + Alexey Dokuchaev wrote: > On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote: > > ... > > However, it was a bit harder to see this originally as the 915kms driver > > tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which > > recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a > > bad idea). > > Funny how these new Linuxish DRM bits could affect so many things. :( What is this comment about exactly ? > > The fact that that sysbeep is off so I couldn't tell if typing in commands > > was doing anything vs emitting errors probably didn't improve trying to > > diagnose the hang as "sitting in ddb" initially, though I don't know if > > DDB itself emits a beep for invalid commands, etc. > > Now that Warner had fixed the beeper frequency, why we still didn't enable > it back on by default? > > ./danfe > Because all people who wanted it off wasn't because it wasn't the right vt100 frequency, they just wanted it off. -- Emmanuel Vadot
Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore
Hi John, On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote: J> So there are two things here. The root issue is that the devel/apr1 port J> runs a configure test for TCP_NDELAY being inherited by accepted sockets. J> This test panics because prison_check_ip4() tries to lock a prison mutex J> to walk the IPs assigned to a jail, but the caller (in_pcblookup_hash()) has J> done an smr_enter() which is a critical_enter(): The first one is known, and I got a patch to fix it: https://reviews.freebsd.org/D33340 However, a pre-requisite to this simple patch is more complex: https://reviews.freebsd.org/D9 There is some discussion on how to improve that, and I decided to do that rather than stick to original version. So I takes a few extra days. We could push D33340 into main, if the negative effects (raciness of the prison check) is considered lesser evil then potentially contested mtx_lock in smr section. J> However, it was a bit harder to see this originally as the 915kms driver J> tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which J> recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a J> bad idea). When it panicked in X the result was that the screen just froze J> on whatever it had most recently drawn and the machine looked hung. (The J> fact that that sysbeep is off so I couldn't tell if typing in commands was J> doing anything vs emitting errors probably didn't improve trying to diagnose J> the hang as "sitting in ddb" initially, though I don't know if DDB itself J> emits a beep for invalid commands, etc.) Didn't know about this one. Is this isolated to actually entering DDB or there is some path that in a normal inpcb lookup we would M_WAITOK? -- Gleb Smirnoff
Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore
On 12/13/21 9:35 AM, Gleb Smirnoff wrote: Hi John, On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote: J> So there are two things here. The root issue is that the devel/apr1 port J> runs a configure test for TCP_NDELAY being inherited by accepted sockets. J> This test panics because prison_check_ip4() tries to lock a prison mutex J> to walk the IPs assigned to a jail, but the caller (in_pcblookup_hash()) has J> done an smr_enter() which is a critical_enter(): The first one is known, and I got a patch to fix it: https://reviews.freebsd.org/D33340 However, a pre-requisite to this simple patch is more complex: https://reviews.freebsd.org/D9 There is some discussion on how to improve that, and I decided to do that rather than stick to original version. So I takes a few extra days. We could push D33340 into main, if the negative effects (raciness of the prison check) is considered lesser evil then potentially contested mtx_lock in smr section. I think raciness is probably better than always panicking as it does today. J> However, it was a bit harder to see this originally as the 915kms driver J> tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which J> recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a J> bad idea). When it panicked in X the result was that the screen just froze J> on whatever it had most recently drawn and the machine looked hung. (The J> fact that that sysbeep is off so I couldn't tell if typing in commands was J> doing anything vs emitting errors probably didn't improve trying to diagnose J> the hang as "sitting in ddb" initially, though I don't know if DDB itself J> emits a beep for invalid commands, etc.) Didn't know about this one. Is this isolated to actually entering DDB or there is some path that in a normal inpcb lookup we would M_WAITOK? This is in the drm(4) driver, nothing to do with in_pcb, just made it harder to see the in_pcb issue. -- John Baldwin
Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore
On Mon, Dec 13, 2021 at 11:56:35AM -0800, John Baldwin wrote: J> > J> So there are two things here. The root issue is that the devel/apr1 port J> > J> runs a configure test for TCP_NDELAY being inherited by accepted sockets. J> > J> This test panics because prison_check_ip4() tries to lock a prison mutex J> > J> to walk the IPs assigned to a jail, but the caller (in_pcblookup_hash()) has J> > J> done an smr_enter() which is a critical_enter(): J> > J> > The first one is known, and I got a patch to fix it: J> > J> > https://reviews.freebsd.org/D33340 J> > J> > However, a pre-requisite to this simple patch is more complex: J> > J> > https://reviews.freebsd.org/D9 J> > J> > There is some discussion on how to improve that, and I decided to do that J> > rather than stick to original version. So I takes a few extra days. J> > J> > We could push D33340 into main, if the negative effects (raciness of J> > the prison check) is considered lesser evil then potentially contested J> > mtx_lock in smr section. J> J> I think raciness is probably better than always panicking as it does today. AFAIK, today it will always panic only with WITNESS. Without WITNESS it would pass through mtx_lock as long as the mutex is not locked. So, do you suggest to push D33340 before finalizing D9? -- Gleb Smirnoff
Fixing VOP_READDIR for 64-bit directory cookies
tldr; this change allows the NFS server to export file systems that use 64-bit directory cookies https://reviews.freebsd.org/D33404 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260375 Long story: NFSv2 included a 32-bit directory cookie with each readdir entry. NFSv3 widened it to 64-bits, and FreeBSD's SVN r22521 raised VOP_READDIR's cookies argument to a u_long. That's 64-bits on some architectures, but 32-bits on others. But since the NFS server is the only caller that uses cookies, VOP_READDIR really ought to use a 64-bit type on all architectures. For NVSv2-exported file systems, the NFS server will ignore the top 32-bits of the cookie. There are no in-tree file systems that use more than 32 bits for their directory cookies, but it matters for some FUSE file systems. Does anybody have any opinions about this change, or about whether/how to MFC it? -Alan
Re: Fixing VOP_READDIR for 64-bit directory cookies
On 13.12.2021 21:47, Alan Somers wrote: > tldr; this change allows the NFS server to export file systems that > use 64-bit directory cookies > https://reviews.freebsd.org/D33404 > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260375 > > Long story: > NFSv2 included a 32-bit directory cookie with each readdir entry. > NFSv3 widened it to 64-bits, and FreeBSD's SVN r22521 raised > VOP_READDIR's cookies argument to a u_long. That's 64-bits on some > architectures, but 32-bits on others. But since the NFS server is the > only caller that uses cookies, VOP_READDIR really ought to use a > 64-bit type on all architectures. For NVSv2-exported file systems, > the NFS server will ignore the top 32-bits of the cookie. > > There are no in-tree file systems that use more than 32 bits for their > directory cookies, but it matters for some FUSE file systems. > > Does anybody have any opinions about this change, or about whether/how > to MFC it? Since changing the VOP_READDIR argument type will probably break external file system modules I'd say it is not acceptable for MFC. But merge of the NFS part only should fix the issue for the 64-bit machines, which should be good enough for the most people. -- Alexander Motin
Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore
On Mon, Dec 13, 2021 at 06:32:22PM +0100, Emmanuel Vadot wrote: > On Mon, 13 Dec 2021 17:01:43 + Alexey Dokuchaev wrote: > > On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote: > > > ... > > > However, it was a bit harder to see this originally as the 915kms driver > > > tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which > > > recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a > > > bad idea). > > > > Funny how these new Linuxish DRM bits could affect so many things. :( > > What is this comment about exactly? Oh, you know, it's about steadily deteriorating quality of our gfx stack once we had started pulling things from Linux. > > > The fact that that sysbeep is off so I couldn't tell if typing in commands > > > was doing anything vs emitting errors probably didn't improve trying to > > > diagnose the hang as "sitting in ddb" initially, though I don't know if > > > DDB itself emits a beep for invalid commands, etc. > > > > Now that Warner had fixed the beeper frequency, why we still didn't enable > > it back on by default? > > Because all people who wanted it off wasn't because it wasn't the > right vt100 frequency, they just wanted it off. In FreeBSD chat and with less biased poll than the one Warner had posted on Twitter, most people had actually voted against making it off by default: https://t.me/FreeBSD1/25129. If John's assumption that muting it pessimizes debugging is correct, it really leaves no strong reasons NOT to turn it back on by default. ./danfe