On Tue, Jul 01, 2025 at 11:45:23PM +0200, Alexander Bluhm wrote:
> On Tue, Jul 01, 2025 at 09:08:48PM +0200, Mark Kettenis wrote:
> > > Date: Tue, 1 Jul 2025 20:41:47 +0200
> > > From: Alexander Bluhm <[email protected]>
> > >
> > > Hi
> > >
> > > I see this crash on a vmd guest while running regress/sys/kern/sosplice.
> > > Note that it is a single CPU GENERIC kernel. sysctl kern.splassert=2
> > >
> > > panic: assertwaitok: non-zero mutex count: 2
> > > Stopped at db_enter+0x14: popq %rbp
> > > TID PID UID PRFLAGS PFLAGS CPU COMMAND
> > > *519542 91140 0 0x1 0 0 perl
> > > db_enter() at db_enter+0x14
> > > panic(ffffffff82595a39) at panic+0xc9
> > > assertwaitok() at assertwaitok+0x9e
> > > mi_switch() at mi_switch+0x19c
> > > pool_get(ffffffff82a28d28,1) at pool_get+0xe7
> > > uvm_mapent_alloc(ffffffff82b0eb60,8) at uvm_mapent_alloc+0x2b2
> > > uvm_map_mkentry(ffffffff82b0eb60,fffffd8006e6cbd0,fffffd8006e6cbd0,ffff80002a320000,1000,8,79bcd127adccfb5a,7) at uvm_map_mkentry+0x63
> > > uvm_mapent_clone(ffffffff82b0eb60,ffff80002a320000,1000,0,1,7,a33acdf397a7ed83,fffffd806c1f89e8,fffffd806e3beb40,c) at uvm_mapent_clone+0x92
> > > uvm_map_extract(fffffd806e3beb40,83d6d1f7000,1000,ffff80002a39f048,8) at uvm_map_extract+0x309
> > > sys_kbind(ffff80002a294020,ffff80002a39f160,ffff80002a39f0d0) at sys_kbind+0x3a1
> > > syscall(ffff80002a39f160) at syscall+0x444
> > > Xsyscall() at Xsyscall+0x128
> > > end of kernel
> > > end trace frame: 0x783818799758, count: 3
> > > https://www.openbsd.org/ddb.html describes the minimum info required in bug
> > > reports. Insufficient info makes it difficult to find and fix bugs.
> >
> > I don't see anything in that codepath that would end up there with a
> > mutex held. So my guess is you somehow returned to userland with a
> > mutex held because of a missing mtx_leave() call in an error path. Or
> > maybe an interrupt handler that forgot to unlock a mutex?
>
> That makes sense. I also get the same panic with the same test
> but different stacktrace.
>
> panic: assertwaitok: non-zero mutex count: 2
> Stopped at db_enter+0x14: popq %rbp
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> *184775 13589 0 0x1 0 0 perl
> db_enter() at db_enter+0x14
> panic(ffffffff82595a39) at panic+0xc9
> assertwaitok() at assertwaitok+0x9e
> mi_switch() at mi_switch+0x19c
> pool_get(ffffffff82b1fb10,1) at pool_get+0xe7
> m_split(fffffd806964f900,9,1) at m_split+0xa9
> somove(ffff800000b0d6f8,1) at somove+0xb2a
> sosplice(ffff800000b0d6f8,1,3d,fffffd8006e71430) at sosplice+0x513
> sys_setsockopt(ffff80002a2a1498,ffff80002a3995e0,ffff80002a399550) at sys_setsockopt+0x169
> syscall(ffff80002a3995e0) at syscall+0x444
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x76a44e747000, count: 4
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
>
> Strangely, it only happens with a GENERIC kernel, but not with WITNESS.
>
> bluhm
>
I think these panics are different.

The "while (((rcvstate & SS_RCVATMARK).." loop of somove() also has two
m_get(wait, ...) calls, which can sleep and so must be moved out of the
mutex(9) section too. The loop only operates on local data, so both
mutexes can be released for the whole loop body.
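
For anyone following along, the shape of the change can be modelled in
userland. This is only a toy sketch, not kernel code: every toy_* name, the
counter and the violation counter are made up for illustration. They stand in
for mtx_enter(9)/mtx_leave(9), the mutex count that assertwaitok() complains
about, and an allocation that may sleep such as m_get() or m_split() called
with a waiting flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins; none of this is the kernel API. */
struct toy_mutex { bool held; };

static int toy_mutex_count;	/* models the count of held mutexes */
static int toy_violations;	/* sleeping calls made with a mutex held */

static void
toy_mtx_enter(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = true;
	toy_mutex_count++;
}

static void
toy_mtx_leave(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
	toy_mutex_count--;
}

/* Models an allocation that may sleep; records instead of panicking. */
static void *
toy_alloc_wait(size_t len)
{
	if (toy_mutex_count != 0)
		toy_violations++;
	return malloc(len);
}

/* Old shape: the sleeping allocation runs with both mutexes held. */
static int
toy_move_locked(struct toy_mutex *rcv, struct toy_mutex *snd)
{
	void *o;

	toy_violations = 0;
	toy_mtx_enter(rcv);
	toy_mtx_enter(snd);
	o = toy_alloc_wait(1);
	toy_mtx_leave(snd);
	toy_mtx_leave(rcv);
	free(o);
	return toy_violations;
}

/* Patched shape: drop both mutexes, allocate, re-take in the same order. */
static int
toy_move_unlocked(struct toy_mutex *rcv, struct toy_mutex *snd)
{
	void *o;

	toy_violations = 0;
	toy_mtx_enter(rcv);
	toy_mtx_enter(snd);
	toy_mtx_leave(snd);
	toy_mtx_leave(rcv);
	o = toy_alloc_wait(1);		/* only local data is touched here */
	toy_mtx_enter(rcv);
	toy_mtx_enter(snd);
	toy_mtx_leave(snd);
	toy_mtx_leave(rcv);
	free(o);
	return toy_violations;
}
```

In this model toy_move_locked() reports one violation and
toy_move_unlocked() reports none. The model says nothing about whether the
real loop is safe to run unlocked; that rests on the loop using only local
data.
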
Index: sys/kern/uipc_socket.c
===================================================================
RCS file: /cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.378
diff -u -p -r1.378 uipc_socket.c
--- sys/kern/uipc_socket.c 23 May 2025 23:41:46 -0000 1.378
+++ sys/kern/uipc_socket.c 1 Jul 2025 22:21:36 -0000
@@ -1763,21 +1763,22 @@ somove(struct socket *so, int wait)
(so->so_options & SO_OOBINLINE)) {
struct mbuf *o = NULL;
+ mtx_leave(&sosp->so_snd.sb_mtx);
+ mtx_leave(&so->so_rcv.sb_mtx);
+
if (rcvstate & SS_RCVATMARK) {
o = m_get(wait, MT_DATA);
rcvstate &= ~SS_RCVATMARK;
} else if (oobmark) {
o = m_split(m, oobmark, wait);
if (o) {
- mtx_leave(&sosp->so_snd.sb_mtx);
- mtx_leave(&so->so_rcv.sb_mtx);
solock_shared(sosp);
error = pru_send(sosp, m, NULL, NULL);
sounlock_shared(sosp);
- mtx_enter(&so->so_rcv.sb_mtx);
- mtx_enter(&sosp->so_snd.sb_mtx);
if (error) {
+ mtx_enter(&so->so_rcv.sb_mtx);
+ mtx_enter(&sosp->so_snd.sb_mtx);
if (sosp->so_snd.sb_state &
SS_CANTSENDMORE)
error = EPIPE;
@@ -1795,15 +1796,13 @@ somove(struct socket *so, int wait)
o->m_len = 1;
*mtod(o, caddr_t) = *mtod(m, caddr_t);
- mtx_leave(&sosp->so_snd.sb_mtx);
- mtx_leave(&so->so_rcv.sb_mtx);
solock_shared(sosp);
error = pru_sendoob(sosp, o, NULL, NULL);
sounlock_shared(sosp);
- mtx_enter(&so->so_rcv.sb_mtx);
- mtx_enter(&sosp->so_snd.sb_mtx);
if (error) {
+ mtx_enter(&so->so_rcv.sb_mtx);
+ mtx_enter(&sosp->so_snd.sb_mtx);
if (sosp->so_snd.sb_state & SS_CANTSENDMORE)
error = EPIPE;
m_freem(m);
@@ -1818,6 +1817,9 @@ somove(struct socket *so, int wait)
}
m_adj(m, 1);
}
+
+ mtx_enter(&so->so_rcv.sb_mtx);
+ mtx_enter(&sosp->so_snd.sb_mtx);
}
/* Append all remaining data to drain socket. */