smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore

2021-12-13 Thread John Baldwin

This weekend I upgraded my FreeBSD laptop and kicked off a poudriere build of
the packages I use.  My laptop kept "freezing" during the package builds 
however.
Initially due to messages in /var/log/messages I thought it was running out of
swap and killing the display server.  After poking it at off and on over the
weekend I finally narrowed it down to building the devel/apr1 port, and built
it on the console (rather than X) and was greeted with the following panic:

panic: malloc(M_WAITOK) with sleeping prohibited
cpuid = 7
time = 1639374072
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe001e5b55b0
vpanic() at vpanic+0x17f/frame 0xfe001e5b5600
panic() at panic+0x43/frame 0xfe001e5b5660
malloc_dbg() at malloc_dbg+0xd4/frame 0xfe001e5b5680
malloc() at malloc+0x2d/frame 0xfe001e5b56c0
intel_atomic_state_alloc() at intel_atomic_state_alloc+0x20/frame 
0xfe001e5b56e0
drm_client_modeset_commit_atomic() at 
drm_client_modeset_commit_atomic+0x30/frame 0xfe001e5b5750
drm_client_modeset_commit_force() at drm_client_modeset_commit_force+0x6f/frame 
0xfe001e5b5790
drm_fb_helper_restore_fbdev_mode_unlocked() at 
drm_fb_helper_restore_fbdev_mode_unlocked+0x82/frame 0xfe001e5b57c0
vt_kms_postswitch() at vt_kms_postswitch+0x18b/frame 0xfe001e5b57f0
vt_window_switch() at vt_window_switch+0x261/frame 0xfe001e5b5830
vtterm_cngrab() at vtterm_cngrab+0x4f/frame 0xfe001e5b5850
cngrab() at cngrab+0x26/frame 0xfe001e5b5870
vpanic() at vpanic+0xee/frame 0xfe001e5b58c0
panic() at panic+0x43/frame 0xfe001e5b5920
witness_checkorder() at witness_checkorder+0xd1c/frame 0xfe001e5b5ae0
__mtx_lock_flags() at __mtx_lock_flags+0x94/frame 0xfe001e5b5b30
prison_check_ip4() at prison_check_ip4+0x51/frame 0xfe001e5b5b60
in_pcblookup_hash_locked() at in_pcblookup_hash_locked+0x2b6/frame 
0xfe001e5b5bc0
in_pcblookup_mbuf() at in_pcblookup_mbuf+0x84/frame 0xfe001e5b5c00
tcp_input_with_port() at tcp_input_with_port+0x635/frame 0xfe001e5b5d50
tcp_input() at tcp_input+0xb/frame 0xfe001e5b5d60
ip_input() at ip_input+0x25e/frame 0xfe001e5b5de0
swi_net() at swi_net+0x1a1/frame 0xfe001e5b5e60
ithread_loop() at ithread_loop+0x279/frame 0xfe001e5b5ef0
fork_exit() at fork_exit+0x80/frame 0xfe001e5b5f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe001e5b5f30
--- trap 0x61e8cb8b, rip = 0x8b4800f890ff, rsp = 0x52ff38244c8d4800, rbp = 
0x245c8948ccc35f20 ---

So there are two things here.  The root issue is that the devel/apr1 port
runs a configure test for TCP_NDELAY being inherited by accepted sockets.
This test panics because prison_check_ip4() tries to lock a prison mutex
to walk the IPs assigned to a jail, but the caller (in_pcblookup_hash()) has
done an smr_enter() which is a critical_enter():

(kgdb) p panicstr
$1 = 0x81ea90b0  "acquiring blockable sleep lock with spinlock 
or critical section held (sleep mutex) jail mutex @ /usr/src/sys/netinet/in_jail.c:418"
(kgdb) frame 39
#39 0x80dbcf71 in prison_check_ip4 (cred=,
ia=ia@entry=0xfe001e5b5b90) at /usr/src/sys/netinet/in_jail.c:418
418 mtx_lock(&pr->pr_mtx);
(kgdb) l
413 KASSERT(ia != NULL, ("%s: ia is NULL", __func__));
414 
415 pr = cred->cr_prison;
416 if (!(pr->pr_flags & PR_IP4))
417 return (0);
418 mtx_lock(&pr->pr_mtx);
419 if (!(pr->pr_flags & PR_IP4)) {
420 mtx_unlock(&pr->pr_mtx);
421 return (0);
422 }
(kgdb) up
#41 0x80dc5cb4 in in_pcblookup_hash (pcbinfo=0xfe0022db7748,
faddr=..., fport=2166892021, laddr=..., lport=0,
lookupflags=, numa_domain=56 '8', ifp=)
at /usr/src/sys/netinet/in_pcb.c:2387
2387inp = in_pcblookup_hash_locked(pcbinfo, faddr, fport, laddr, 
lport,
(kgdb) l
2382struct ifnet *ifp, uint8_t numa_domain)
2383{
2384struct inpcb *inp;
2385
2386smr_enter(pcbinfo->ipi_smr);
2387inp = in_pcblookup_hash_locked(pcbinfo, faddr, fport, laddr, 
lport,
2388lookupflags & INPLOOKUP_WILDCARD, ifp, numa_domain);
2389if (inp != NULL) {
2390if (__predict_false(inp_smr_lock(inp,
2391(lookupflags & INPLOOKUP_LOCKMASK)) == false))

However, it was a bit harder to see this originally as the 915kms driver
tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which
recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a
bad idea).  When it panicked in X the result was that the screen just froze
on whatever it had most recently drawn and the machine looked hung.  (The
fact that that sysbeep is off so I couldn't tell if typing in commands was
doing anything vs emitting errors probably didn't improve trying to diagnose
the hang as "sitting in ddb" initially, though I don't know if DDB itself
emi

RFC: What to do about Allocate in the NFS server for FreeBSD13?

2021-12-13 Thread Rick Macklem
Hi,

There are two problems with Allocate in the NFSv4.2 server in FreeBSD13:
1 - It uses the thread credentials instead of the ones for the RPC.
2 - It does not ensure that file changes are committed to stable storage.
These problems are fixed by commit f0c9847a6c47 in main, which added
ioflag and cred arguments to VOP_ALLOCATE().

I can think of 3 ways to fix Allocate in FreeBSD13:
1 - Apply a *hackish* patch like this:
+   savcred = p->td_ucred;
+   td->td_ucred = cred;
do {
olen = len;
error = VOP_ALLOCATE(vp, &off, &len);
if (error == 0 && len > 0 && olen > len)
maybe_yield();
} while (error == 0 && len > 0 && olen > len);
+   p->td_ucred = savcred;
if (error == 0 && len > 0)
error = NFSERR_IO;
+   if (error == 0)
+   error = VOP_FSYNC(vp, MNT_WAIT, p);
The worst part of it is temporarily setting td_ucred to cred.

2 - MFC'ng commit f0c9847a6c47. Normally changes to the
 VOP/VFS are not MFC'd. However, in this case, it might be
 ok to do so, since it is unlikely there is an out of source tree
 file system with a custom VOP_ALLOCATE() method?

3 - Just disable it. Currently it is disabled by default and it
 could just be wired down disabled.
 Allocate is not that useful, since ZFS does not support it.

As an aside to this, the IETF NFSv4 working group seems to
have agreed that NFS4ERR_NOTSUPP can be returned by a
NFSv4.2 server on a 'per file system basis" instead of globally,
since the Linux knfsd already does this.
--> As such, Allocate can be enabled by default in main and
   could be enabled by default in FreeBSD13, if #1 or #2 was
   done.
   --> It still would not work for ZFS exports.

So, what do you think is the preferred alternative?

rick



Re: RFC: What to do about Allocate in the NFS server for FreeBSD13?

2021-12-13 Thread Konstantin Belousov
On Mon, Dec 13, 2021 at 04:26:42PM +, Rick Macklem wrote:
> Hi,
> 
> There are two problems with Allocate in the NFSv4.2 server in FreeBSD13:
> 1 - It uses the thread credentials instead of the ones for the RPC.
> 2 - It does not ensure that file changes are committed to stable storage.
> These problems are fixed by commit f0c9847a6c47 in main, which added
> ioflag and cred arguments to VOP_ALLOCATE().
> 
> I can think of 3 ways to fix Allocate in FreeBSD13:
> 1 - Apply a *hackish* patch like this:
> + savcred = p->td_ucred;
> + td->td_ucred = cred;
>   do {
>   olen = len;
>   error = VOP_ALLOCATE(vp, &off, &len);
>   if (error == 0 && len > 0 && olen > len)
>   maybe_yield();
>   } while (error == 0 && len > 0 && olen > len);
> + p->td_ucred = savcred;
>   if (error == 0 && len > 0)
>   error = NFSERR_IO;
> + if (error == 0)
> + error = VOP_FSYNC(vp, MNT_WAIT, p);
> The worst part of it is temporarily setting td_ucred to cred.
> 
> 2 - MFC'ng commit f0c9847a6c47. Normally changes to the
>  VOP/VFS are not MFC'd. However, in this case, it might be
>  ok to do so, since it is unlikely there is an out of source tree
>  file system with a custom VOP_ALLOCATE() method?
I do not see much wrong with #2, this is what I would do myself.

> 
> 3 - Just disable it. Currently it is disabled by default and it
>  could just be wired down disabled.
>  Allocate is not that useful, since ZFS does not support it.
> 
> As an aside to this, the IETF NFSv4 working group seems to
> have agreed that NFS4ERR_NOTSUPP can be returned by a
> NFSv4.2 server on a 'per file system basis" instead of globally,
> since the Linux knfsd already does this.
> --> As such, Allocate can be enabled by default in main and
>could be enabled by default in FreeBSD13, if #1 or #2 was
>done.
>--> It still would not work for ZFS exports.
> 
> So, what do you think is the preferred alternative?
> 
> rick
> 



Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore

2021-12-13 Thread Alexey Dokuchaev
On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote:
> ...
> However, it was a bit harder to see this originally as the 915kms driver
> tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which
> recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a
> bad idea).

Funny how these new Linuxish DRM bits could affect so many things. :(

> The fact that that sysbeep is off so I couldn't tell if typing in commands
> was doing anything vs emitting errors probably didn't improve trying to
> diagnose the hang as "sitting in ddb" initially, though I don't know if
> DDB itself emits a beep for invalid commands, etc.

Now that Warner had fixed the beeper frequency, why we still didn't enable
it back on by default?

./danfe



Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore

2021-12-13 Thread Emmanuel Vadot
On Mon, 13 Dec 2021 17:01:43 +
Alexey Dokuchaev  wrote:

> On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote:
> > ...
> > However, it was a bit harder to see this originally as the 915kms driver
> > tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which
> > recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a
> > bad idea).
> 
> Funny how these new Linuxish DRM bits could affect so many things. :(

 What is this comment about exactly ?

> > The fact that that sysbeep is off so I couldn't tell if typing in commands
> > was doing anything vs emitting errors probably didn't improve trying to
> > diagnose the hang as "sitting in ddb" initially, though I don't know if
> > DDB itself emits a beep for invalid commands, etc.
> 
> Now that Warner had fixed the beeper frequency, why we still didn't enable
> it back on by default?
> 
> ./danfe
> 

 Because all people who wanted it off wasn't because it wasn't the
right vt100 frequency, they just wanted it off.

-- 
Emmanuel Vadot  



Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore

2021-12-13 Thread Gleb Smirnoff
  Hi John,

On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote:
J> So there are two things here.  The root issue is that the devel/apr1 port
J> runs a configure test for TCP_NDELAY being inherited by accepted sockets.
J> This test panics because prison_check_ip4() tries to lock a prison mutex
J> to walk the IPs assigned to a jail, but the caller (in_pcblookup_hash()) has
J> done an smr_enter() which is a critical_enter():

The first one is known, and I got a patch to fix it:

https://reviews.freebsd.org/D33340

However, a pre-requisite to this simple patch is more complex:

https://reviews.freebsd.org/D9

There is some discussion on how to improve that, and I decided to do that
rather than stick to original version. So I takes a few extra days.

We could push D33340 into main, if the negative effects (raciness of
the prison check) is considered lesser evil then potentially contested
mtx_lock in smr section.

J> However, it was a bit harder to see this originally as the 915kms driver
J> tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which
J> recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a
J> bad idea).  When it panicked in X the result was that the screen just froze
J> on whatever it had most recently drawn and the machine looked hung.  (The
J> fact that that sysbeep is off so I couldn't tell if typing in commands was
J> doing anything vs emitting errors probably didn't improve trying to diagnose
J> the hang as "sitting in ddb" initially, though I don't know if DDB itself
J> emits a beep for invalid commands, etc.)

Didn't know about this one. Is this isolated to actually entering DDB or
there is some path that in a normal inpcb lookup we would M_WAITOK?

-- 
Gleb Smirnoff



Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore

2021-12-13 Thread John Baldwin

On 12/13/21 9:35 AM, Gleb Smirnoff wrote:

   Hi John,

On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote:
J> So there are two things here.  The root issue is that the devel/apr1 port
J> runs a configure test for TCP_NDELAY being inherited by accepted sockets.
J> This test panics because prison_check_ip4() tries to lock a prison mutex
J> to walk the IPs assigned to a jail, but the caller (in_pcblookup_hash()) has
J> done an smr_enter() which is a critical_enter():

The first one is known, and I got a patch to fix it:

https://reviews.freebsd.org/D33340

However, a pre-requisite to this simple patch is more complex:

https://reviews.freebsd.org/D9

There is some discussion on how to improve that, and I decided to do that
rather than stick to original version. So I takes a few extra days.

We could push D33340 into main, if the negative effects (raciness of
the prison check) is considered lesser evil then potentially contested
mtx_lock in smr section.


I think raciness is probably better than always panicking as it does today.
 

J> However, it was a bit harder to see this originally as the 915kms driver
J> tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which
J> recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a
J> bad idea).  When it panicked in X the result was that the screen just froze
J> on whatever it had most recently drawn and the machine looked hung.  (The
J> fact that that sysbeep is off so I couldn't tell if typing in commands was
J> doing anything vs emitting errors probably didn't improve trying to diagnose
J> the hang as "sitting in ddb" initially, though I don't know if DDB itself
J> emits a beep for invalid commands, etc.)

Didn't know about this one. Is this isolated to actually entering DDB or
there is some path that in a normal inpcb lookup we would M_WAITOK?


This is in the drm(4) driver, nothing to do with in_pcb, just made it harder
to see the in_pcb issue.

--
John Baldwin



Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore

2021-12-13 Thread Gleb Smirnoff
On Mon, Dec 13, 2021 at 11:56:35AM -0800, John Baldwin wrote:
J> > J> So there are two things here.  The root issue is that the devel/apr1 
port
J> > J> runs a configure test for TCP_NDELAY being inherited by accepted 
sockets.
J> > J> This test panics because prison_check_ip4() tries to lock a prison mutex
J> > J> to walk the IPs assigned to a jail, but the caller 
(in_pcblookup_hash()) has
J> > J> done an smr_enter() which is a critical_enter():
J> > 
J> > The first one is known, and I got a patch to fix it:
J> > 
J> > https://reviews.freebsd.org/D33340
J> > 
J> > However, a pre-requisite to this simple patch is more complex:
J> > 
J> > https://reviews.freebsd.org/D9
J> > 
J> > There is some discussion on how to improve that, and I decided to do that
J> > rather than stick to original version. So I takes a few extra days.
J> > 
J> > We could push D33340 into main, if the negative effects (raciness of
J> > the prison check) is considered lesser evil then potentially contested
J> > mtx_lock in smr section.
J> 
J> I think raciness is probably better than always panicking as it does today.

AFAIK, today it will always panic only with WITNESS. Without WITNESS it would
pass through mtx_lock as long as the mutex is not locked.

So, do you suggest to push D33340 before finalizing D9?

-- 
Gleb Smirnoff



Fixing VOP_READDIR for 64-bit directory cookies

2021-12-13 Thread Alan Somers
tldr; this change allows the NFS server to export file systems that
use 64-bit directory cookies
https://reviews.freebsd.org/D33404
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260375

Long story:
NFSv2 included a 32-bit directory cookie with each readdir entry.
NFSv3 widened it to 64-bits, and FreeBSD's SVN r22521 raised
VOP_READDIR's cookies argument to a u_long.  That's 64-bits on some
architectures, but 32-bits on others.  But since the NFS server is the
only caller that uses cookies, VOP_READDIR really ought to use a
64-bit type on all architectures.  For NVSv2-exported file systems,
the NFS server will ignore the top 32-bits of the cookie.

There are no in-tree file systems that use more than 32 bits for their
directory cookies, but it matters for some FUSE file systems.

Does anybody have any opinions about this change, or about whether/how
to MFC it?

-Alan



Re: Fixing VOP_READDIR for 64-bit directory cookies

2021-12-13 Thread Alexander Motin
On 13.12.2021 21:47, Alan Somers wrote:
> tldr; this change allows the NFS server to export file systems that
> use 64-bit directory cookies
> https://reviews.freebsd.org/D33404
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260375
> 
> Long story:
> NFSv2 included a 32-bit directory cookie with each readdir entry.
> NFSv3 widened it to 64-bits, and FreeBSD's SVN r22521 raised
> VOP_READDIR's cookies argument to a u_long.  That's 64-bits on some
> architectures, but 32-bits on others.  But since the NFS server is the
> only caller that uses cookies, VOP_READDIR really ought to use a
> 64-bit type on all architectures.  For NVSv2-exported file systems,
> the NFS server will ignore the top 32-bits of the cookie.
> 
> There are no in-tree file systems that use more than 32 bits for their
> directory cookies, but it matters for some FUSE file systems.
> 
> Does anybody have any opinions about this change, or about whether/how
> to MFC it?

Since changing the VOP_READDIR argument type will probably break
external file system modules I'd say it is not acceptable for MFC.  But
merge of the NFS part only should fix the issue for the 64-bit machines,
which should be good enough for the most people.

-- 
Alexander Motin



Re: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore

2021-12-13 Thread Alexey Dokuchaev
On Mon, Dec 13, 2021 at 06:32:22PM +0100, Emmanuel Vadot wrote:
> On Mon, 13 Dec 2021 17:01:43 + Alexey Dokuchaev wrote:
> > On Mon, Dec 13, 2021 at 07:45:07AM -0800, John Baldwin wrote:
> > > ...
> > > However, it was a bit harder to see this originally as the 915kms driver
> > > tries to do a malloc(M_WAITOK) from cn_grab() when entering DDB which
> > > recursively panics (even a malloc(M_NOWAIT) from cn_grab() is probably a
> > > bad idea).
> > 
> > Funny how these new Linuxish DRM bits could affect so many things. :(
> 
> What is this comment about exactly?

Oh, you know, it's about steadily deteriorating quality of our gfx stack
once we had started pulling things from Linux.

> > > The fact that that sysbeep is off so I couldn't tell if typing in commands
> > > was doing anything vs emitting errors probably didn't improve trying to
> > > diagnose the hang as "sitting in ddb" initially, though I don't know if
> > > DDB itself emits a beep for invalid commands, etc.
> > 
> > Now that Warner had fixed the beeper frequency, why we still didn't enable
> > it back on by default?
> 
> Because all people who wanted it off wasn't because it wasn't the
> right vt100 frequency, they just wanted it off.

In FreeBSD chat and with less biased poll than the one Warner had posted
on Twitter, most people had actually voted against making it off by
default: https://t.me/FreeBSD1/25129.  If John's assumption that muting
it pessimizes debugging is correct, it really leaves no strong reasons
NOT to turn it back on by default.

./danfe