On Sun, Jun 05, 2022 at 12:27:32PM +0200, Martin Pieuchot wrote: > On 05/06/22(Sun) 05:20, Visa Hankala wrote: > > Encountered the following panic: > > > > panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: > > file "/usr/src/sys/kern/kern_synch.c", line 373 > > Stopped at db_enter+0x10: popq %rbp > > TID PID UID PRFLAGS PFLAGS CPU COMMAND > > 423109 57118 55 0x3 0 2 link > > 330695 30276 55 0x3 0 3 link > > * 46366 85501 55 0x1003 0x4080400 1 link > > 188803 85501 55 0x1003 0x4082000 0K link > > db_enter() at db_enter+0x10 > > panic(ffffffff81f25d2b) at panic+0xbf > > __assert(ffffffff81f9a186,ffffffff81f372c8,175,ffffffff81f87c6c) at > > __assert+0x25 > > sleep_setup(ffff800022d64bf8,ffff800022d64c98,20,ffffffff81f66ac6,0) at > > sleep_setup+0x1d8 > > cond_wait(ffff800022d64c98,ffffffff81f66ac6) at cond_wait+0x46 > > timeout_barrier(ffff8000228a28b0) at timeout_barrier+0x109 > > timeout_del_barrier(ffff8000228a28b0) at timeout_del_barrier+0xa2 > > sleep_finish(ffff800022d64d90,1) at sleep_finish+0x16d > > tsleep(ffffffff823a5130,120,ffffffff81f0b730,2) at tsleep+0xb2 > > sys_nanosleep(ffff8000228a27f0,ffff800022d64ea0,ffff800022d64ef0) at > > sys_nanosleep+0x12d > > syscall(ffff800022d64f60) at syscall+0x374 > > > > The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously, > > soft-interrupt-driven timeouts could be deleted synchronously without > > blocking. Now, timeout_del_barrier() can sleep regardless of the type > > of the timeout. > > > > It looks that with small adjustments timeout_del_barrier() can sleep > > in sleep_finish(). The management of run queues is not affected because > > the timeout clearing happens after it. As timeout_del_barrier() does not > > rely on a timeout or signal catching, there should be no risk of > > unbounded recursion or unwanted signal side effects within the sleep > > machinery. In a way, a sleep with a timeout is higher-level than > > one without. > > I trust you on the analysis. However this looks very fragile to me. > > The use of timeout_del_barrier() which can sleep using the global sleep > queue is worrying me.
I think the queue handling ends in sleep_finish() when SCHED_LOCK() is released. The timeout clearing is done outside of it. The extra sleeping point inside sleep_finish() is subtle. It should not matter in typical use. But is it permissible with the API? Also, if timeout_del_barrier() sleeps, the thread's priority can change. Note that sleep_finish() already can take an additional nap when signal catching is enabled. > > Note that endtsleep() can run and set P_TIMEOUT during > > timeout_del_barrier() when the thread is blocked in cond_wait(). > > To avoid unnecessary atomic read-modify-write operations, the clearing > > of P_TIMEOUT could be conditional, but maybe that is an unnecessary > > optimization at this point. > > I agree this optimization seems unnecessary at the moment. > > > While it should be possible to make the code use timeout_del() instead > > of timeout_del_barrier(), the outcome might not be outright better. For > > example, sleep_setup() and endtsleep() would have to coordinate so that > > a late-running timeout from previous sleep cycle would not disturb the > > new cycle. > > So that's the price for not having to sleep in sleep_finish(), right? That is correct. Some synchronization is needed in any case. > > To test the barrier path reliably, I made the code call > > timeout_del_barrier() twice in a row. The second call is guaranteed > > to sleep. Of course, this is not part of the patch. > > ok mpi@ > > > Index: kern/kern_synch.c > > =================================================================== > > RCS file: src/sys/kern/kern_synch.c,v > > retrieving revision 1.187 > > diff -u -p -r1.187 kern_synch.c > > --- kern/kern_synch.c 13 May 2022 15:32:00 -0000 1.187 > > +++ kern/kern_synch.c 5 Jun 2022 05:04:45 -0000 > > @@ -370,8 +370,8 @@ sleep_setup(struct sleep_state *sls, con > > p->p_slppri = prio & PRIMASK; > > TAILQ_INSERT_TAIL(&slpque[LOOKUP(ident)], p, p_runq); > > > > - KASSERT((p->p_flag & P_TIMEOUT) == 0); > > if (timo) { > > + KASSERT((p->p_flag & P_TIMEOUT) == 0); > > sls->sls_timeout = 1; > > timeout_add(&p->p_sleep_to, timo); > > } > > @@ -432,13 +432,12 @@ sleep_finish(struct sleep_state *sls, in > > > > if (sls->sls_timeout) { > > if (p->p_flag & P_TIMEOUT) { > > - atomic_clearbits_int(&p->p_flag, P_TIMEOUT); > > error1 = EWOULDBLOCK; > > } else { > > - /* This must not sleep. */ > > + /* This can sleep. It must not use timeouts. */ > > timeout_del_barrier(&p->p_sleep_to); > > - KASSERT((p->p_flag & P_TIMEOUT) == 0); > > } > > + atomic_clearbits_int(&p->p_flag, P_TIMEOUT); > > } > > > > /* Check if thread was woken up because of a unwind or signal */ > > >