Re: Fix clearing of sleep timeouts
On Mon, Jun 06, 2022 at 06:47:32AM +1000, David Gwynne wrote: > On Sun, Jun 05, 2022 at 03:57:39PM +, Visa Hankala wrote: > > On Sun, Jun 05, 2022 at 12:27:32PM +0200, Martin Pieuchot wrote: > > > On 05/06/22(Sun) 05:20, Visa Hankala wrote: > > > > Encountered the following panic: > > > > > > > > panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" > > > > failed: file "/usr/src/sys/kern/kern_synch.c", line 373 > > > > Stopped at db_enter+0x10: popq%rbp > > > > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > > > > 423109 57118 55 0x3 02 link > > > > 330695 30276 55 0x3 03 link > > > > * 46366 85501 55 0x1003 0x40804001 link > > > > 188803 85501 55 0x1003 0x40820000K link > > > > db_enter() at db_enter+0x10 > > > > panic(81f25d2b) at panic+0xbf > > > > __assert(81f9a186,81f372c8,175,81f87c6c) at > > > > __assert+0x25 > > > > sleep_setup(800022d64bf8,800022d64c98,20,81f66ac6,0) at > > > > sleep_setup+0x1d8 > > > > cond_wait(800022d64c98,81f66ac6) at cond_wait+0x46 > > > > timeout_barrier(8000228a28b0) at timeout_barrier+0x109 > > > > timeout_del_barrier(8000228a28b0) at timeout_del_barrier+0xa2 > > > > sleep_finish(800022d64d90,1) at sleep_finish+0x16d > > > > tsleep(823a5130,120,81f0b730,2) at tsleep+0xb2 > > > > sys_nanosleep(8000228a27f0,800022d64ea0,800022d64ef0) at > > > > sys_nanosleep+0x12d > > > > syscall(800022d64f60) at syscall+0x374 > > > > > > > > The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously, > > > > soft-interrupt-driven timeouts could be deleted synchronously without > > > > blocking. Now, timeout_del_barrier() can sleep regardless of the type > > > > of the timeout. > > > > > > > > It looks that with small adjustments timeout_del_barrier() can sleep > > > > in sleep_finish(). The management of run queues is not affected because > > > > the timeout clearing happens after it. As timeout_del_barrier() does not > > > > rely on a timeout or signal catching, there should be no risk of > > > > unbounded recursion or unwanted signal side effects within the sleep > > > > machinery. In a way, a sleep with a timeout is higher-level than > > > > one without. > > > > > > I trust you on the analysis. However this looks very fragile to me. > > > > > > The use of timeout_del_barrier() which can sleep using the global sleep > > > queue is worrying me. > > > > I think the queue handling ends in sleep_finish() when SCHED_LOCK() > > is released. The timeout clearing is done outside of it. > > That's ok. > > > The extra sleeping point inside sleep_finish() is subtle. It should not > > matter in typical use. But is it permissible with the API? Also, if > > timeout_del_barrier() sleeps, the thread's priority can change. > > What other options do we have at this point? Spin? Allocate the timeout > dynamically so sleep_finish doesn't have to wait for it and let the > handler clean up? How would you stop the timeout handler waking up the > thread if it's gone back to sleep again for some other reason? In principle, each thread could have a sleep serial number. If the serial number was somehow associated with the timeout, the handler could bail out if the thread has moved on. However, implementing that association looks tricky. Dynamic allocation could provide a suitable context struct, but memory allocation in this code would be awful. Maybe there should be an extended version of timeout_add() that allows controlled updating of the timeout argument.. Does not sound appealing. Spinning might be an option in the future. The kernel lock complicates things, however. The spinning wait would have to release the kernel lock to avoid possible deadlocking. > Sleeping here is the least worst option. So it seems. > As for timeout_del_barrier, if prio is a worry we can provide an > advanced version of it that lets you pass the prio in. I'd also > like to change timeout_barrier so it queues the barrier task at the > head of the pending lists rather than at the tail. I think it is not important to preserve the priority here. Later scheduling events will override it anyway. > > > Note that sleep_finish() already can take an additional nap when > > signal catching is enabled. > > > > > > Note that endtsleep() can run and set P_TIMEOUT during > > > > timeout_del_barrier() when the thread is blocked in cond_wait(). > > > > To avoid unnecessary atomic read-modify-write operations, the clearing > > > > of P_TIMEOUT could be conditional, but maybe that is an unnecessary > > > > optimization at this point. > > > > > > I agree this optimization seems unnecessary at the moment. > > > > > > > While it should be possible to make the code use timeout_del() instead > > > > of timeout_del_barrier(), the outcome might not be outright better. For > > > > example, sleep_
Re: Fix clearing of sleep timeouts
On 06/06/22(Mon) 06:47, David Gwynne wrote: > On Sun, Jun 05, 2022 at 03:57:39PM +, Visa Hankala wrote: > > On Sun, Jun 05, 2022 at 12:27:32PM +0200, Martin Pieuchot wrote: > > > On 05/06/22(Sun) 05:20, Visa Hankala wrote: > > > > Encountered the following panic: > > > > > > > > panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" > > > > failed: file "/usr/src/sys/kern/kern_synch.c", line 373 > > > > Stopped at db_enter+0x10: popq%rbp > > > > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > > > > 423109 57118 55 0x3 02 link > > > > 330695 30276 55 0x3 03 link > > > > * 46366 85501 55 0x1003 0x40804001 link > > > > 188803 85501 55 0x1003 0x40820000K link > > > > db_enter() at db_enter+0x10 > > > > panic(81f25d2b) at panic+0xbf > > > > __assert(81f9a186,81f372c8,175,81f87c6c) at > > > > __assert+0x25 > > > > sleep_setup(800022d64bf8,800022d64c98,20,81f66ac6,0) at > > > > sleep_setup+0x1d8 > > > > cond_wait(800022d64c98,81f66ac6) at cond_wait+0x46 > > > > timeout_barrier(8000228a28b0) at timeout_barrier+0x109 > > > > timeout_del_barrier(8000228a28b0) at timeout_del_barrier+0xa2 > > > > sleep_finish(800022d64d90,1) at sleep_finish+0x16d > > > > tsleep(823a5130,120,81f0b730,2) at tsleep+0xb2 > > > > sys_nanosleep(8000228a27f0,800022d64ea0,800022d64ef0) at > > > > sys_nanosleep+0x12d > > > > syscall(800022d64f60) at syscall+0x374 > > > > > > > > The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously, > > > > soft-interrupt-driven timeouts could be deleted synchronously without > > > > blocking. Now, timeout_del_barrier() can sleep regardless of the type > > > > of the timeout. > > > > > > > > It looks that with small adjustments timeout_del_barrier() can sleep > > > > in sleep_finish(). The management of run queues is not affected because > > > > the timeout clearing happens after it. As timeout_del_barrier() does not > > > > rely on a timeout or signal catching, there should be no risk of > > > > unbounded recursion or unwanted signal side effects within the sleep > > > > machinery. In a way, a sleep with a timeout is higher-level than > > > > one without. > > > > > > I trust you on the analysis. However this looks very fragile to me. > > > > > > The use of timeout_del_barrier() which can sleep using the global sleep > > > queue is worrying me. > > > > I think the queue handling ends in sleep_finish() when SCHED_LOCK() > > is released. The timeout clearing is done outside of it. > > That's ok. > > > The extra sleeping point inside sleep_finish() is subtle. It should not > > matter in typical use. But is it permissible with the API? Also, if > > timeout_del_barrier() sleeps, the thread's priority can change. > > What other options do we have at this point? Spin? Allocate the timeout > dynamically so sleep_finish doesn't have to wait for it and let the > handler clean up? How would you stop the timeout handler waking up the > thread if it's gone back to sleep again for some other reason? > > Sleeping here is the least worst option. I agree. I don't think sleeping is bad here. My concern is about how sleeping is implemented. There's a single API built on top of a single global data structure which now calls itself recursively. I'm not sure how much work it would be to make cond_wait(9) use its own sleep queue... This is something independent from this fix though. > As for timeout_del_barrier, if prio is a worry we can provide an > advanced version of it that lets you pass the prio in. I'd also > like to change timeout_barrier so it queues the barrier task at the > head of the pending lists rather than at the tail. I doubt prio matter.
Re: Fix clearing of sleep timeouts
On Sun, Jun 05, 2022 at 03:57:39PM +, Visa Hankala wrote: > On Sun, Jun 05, 2022 at 12:27:32PM +0200, Martin Pieuchot wrote: > > On 05/06/22(Sun) 05:20, Visa Hankala wrote: > > > Encountered the following panic: > > > > > > panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: > > > file "/usr/src/sys/kern/kern_synch.c", line 373 > > > Stopped at db_enter+0x10: popq%rbp > > > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > > > 423109 57118 55 0x3 02 link > > > 330695 30276 55 0x3 03 link > > > * 46366 85501 55 0x1003 0x40804001 link > > > 188803 85501 55 0x1003 0x40820000K link > > > db_enter() at db_enter+0x10 > > > panic(81f25d2b) at panic+0xbf > > > __assert(81f9a186,81f372c8,175,81f87c6c) at > > > __assert+0x25 > > > sleep_setup(800022d64bf8,800022d64c98,20,81f66ac6,0) at > > > sleep_setup+0x1d8 > > > cond_wait(800022d64c98,81f66ac6) at cond_wait+0x46 > > > timeout_barrier(8000228a28b0) at timeout_barrier+0x109 > > > timeout_del_barrier(8000228a28b0) at timeout_del_barrier+0xa2 > > > sleep_finish(800022d64d90,1) at sleep_finish+0x16d > > > tsleep(823a5130,120,81f0b730,2) at tsleep+0xb2 > > > sys_nanosleep(8000228a27f0,800022d64ea0,800022d64ef0) at > > > sys_nanosleep+0x12d > > > syscall(800022d64f60) at syscall+0x374 > > > > > > The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously, > > > soft-interrupt-driven timeouts could be deleted synchronously without > > > blocking. Now, timeout_del_barrier() can sleep regardless of the type > > > of the timeout. > > > > > > It looks that with small adjustments timeout_del_barrier() can sleep > > > in sleep_finish(). The management of run queues is not affected because > > > the timeout clearing happens after it. As timeout_del_barrier() does not > > > rely on a timeout or signal catching, there should be no risk of > > > unbounded recursion or unwanted signal side effects within the sleep > > > machinery. In a way, a sleep with a timeout is higher-level than > > > one without. > > > > I trust you on the analysis. However this looks very fragile to me. > > > > The use of timeout_del_barrier() which can sleep using the global sleep > > queue is worrying me. > > I think the queue handling ends in sleep_finish() when SCHED_LOCK() > is released. The timeout clearing is done outside of it. That's ok. > The extra sleeping point inside sleep_finish() is subtle. It should not > matter in typical use. But is it permissible with the API? Also, if > timeout_del_barrier() sleeps, the thread's priority can change. What other options do we have at this point? Spin? Allocate the timeout dynamically so sleep_finish doesn't have to wait for it and let the handler clean up? How would you stop the timeout handler waking up the thread if it's gone back to sleep again for some other reason? Sleeping here is the least worst option. As for timeout_del_barrier, if prio is a worry we can provide an advanced version of it that lets you pass the prio in. I'd also like to change timeout_barrier so it queues the barrier task at the head of the pending lists rather than at the tail. > Note that sleep_finish() already can take an additional nap when > signal catching is enabled. > > > > Note that endtsleep() can run and set P_TIMEOUT during > > > timeout_del_barrier() when the thread is blocked in cond_wait(). > > > To avoid unnecessary atomic read-modify-write operations, the clearing > > > of P_TIMEOUT could be conditional, but maybe that is an unnecessary > > > optimization at this point. > > > > I agree this optimization seems unnecessary at the moment. > > > > > While it should be possible to make the code use timeout_del() instead > > > of timeout_del_barrier(), the outcome might not be outright better. For > > > example, sleep_setup() and endtsleep() would have to coordinate so that > > > a late-running timeout from previous sleep cycle would not disturb the > > > new cycle. > > > > So that's the price for not having to sleep in sleep_finish(), right? > > That is correct. Some synchronization is needed in any case. > > > > To test the barrier path reliably, I made the code call > > > timeout_del_barrier() twice in a row. The second call is guaranteed > > > to sleep. Of course, this is not part of the patch. > > > > ok mpi@ > > > > > Index: kern/kern_synch.c > > > === > > > RCS file: src/sys/kern/kern_synch.c,v > > > retrieving revision 1.187 > > > diff -u -p -r1.187 kern_synch.c > > > --- kern/kern_synch.c 13 May 2022 15:32:00 - 1.187 > > > +++ kern/kern_synch.c 5 Jun 2022 05:04:45 - > > > @@ -370,8 +370,8 @@ sleep_setup(struct sleep_state *sls, con > > > p->p_slppri = prio & PRIMASK; > > > TAILQ_INSERT_TAIL(&slpque[LOOK
Re: Fix clearing of sleep timeouts
On Sun, Jun 05, 2022 at 12:27:32PM +0200, Martin Pieuchot wrote: > On 05/06/22(Sun) 05:20, Visa Hankala wrote: > > Encountered the following panic: > > > > panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: > > file "/usr/src/sys/kern/kern_synch.c", line 373 > > Stopped at db_enter+0x10: popq%rbp > > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > > 423109 57118 55 0x3 02 link > > 330695 30276 55 0x3 03 link > > * 46366 85501 55 0x1003 0x40804001 link > > 188803 85501 55 0x1003 0x40820000K link > > db_enter() at db_enter+0x10 > > panic(81f25d2b) at panic+0xbf > > __assert(81f9a186,81f372c8,175,81f87c6c) at > > __assert+0x25 > > sleep_setup(800022d64bf8,800022d64c98,20,81f66ac6,0) at > > sleep_setup+0x1d8 > > cond_wait(800022d64c98,81f66ac6) at cond_wait+0x46 > > timeout_barrier(8000228a28b0) at timeout_barrier+0x109 > > timeout_del_barrier(8000228a28b0) at timeout_del_barrier+0xa2 > > sleep_finish(800022d64d90,1) at sleep_finish+0x16d > > tsleep(823a5130,120,81f0b730,2) at tsleep+0xb2 > > sys_nanosleep(8000228a27f0,800022d64ea0,800022d64ef0) at > > sys_nanosleep+0x12d > > syscall(800022d64f60) at syscall+0x374 > > > > The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously, > > soft-interrupt-driven timeouts could be deleted synchronously without > > blocking. Now, timeout_del_barrier() can sleep regardless of the type > > of the timeout. > > > > It looks that with small adjustments timeout_del_barrier() can sleep > > in sleep_finish(). The management of run queues is not affected because > > the timeout clearing happens after it. As timeout_del_barrier() does not > > rely on a timeout or signal catching, there should be no risk of > > unbounded recursion or unwanted signal side effects within the sleep > > machinery. In a way, a sleep with a timeout is higher-level than > > one without. > > I trust you on the analysis. However this looks very fragile to me. > > The use of timeout_del_barrier() which can sleep using the global sleep > queue is worrying me. I think the queue handling ends in sleep_finish() when SCHED_LOCK() is released. The timeout clearing is done outside of it. The extra sleeping point inside sleep_finish() is subtle. It should not matter in typical use. But is it permissible with the API? Also, if timeout_del_barrier() sleeps, the thread's priority can change. Note that sleep_finish() already can take an additional nap when signal catching is enabled. > > Note that endtsleep() can run and set P_TIMEOUT during > > timeout_del_barrier() when the thread is blocked in cond_wait(). > > To avoid unnecessary atomic read-modify-write operations, the clearing > > of P_TIMEOUT could be conditional, but maybe that is an unnecessary > > optimization at this point. > > I agree this optimization seems unnecessary at the moment. > > > While it should be possible to make the code use timeout_del() instead > > of timeout_del_barrier(), the outcome might not be outright better. For > > example, sleep_setup() and endtsleep() would have to coordinate so that > > a late-running timeout from previous sleep cycle would not disturb the > > new cycle. > > So that's the price for not having to sleep in sleep_finish(), right? That is correct. Some synchronization is needed in any case. > > To test the barrier path reliably, I made the code call > > timeout_del_barrier() twice in a row. The second call is guaranteed > > to sleep. Of course, this is not part of the patch. > > ok mpi@ > > > Index: kern/kern_synch.c > > === > > RCS file: src/sys/kern/kern_synch.c,v > > retrieving revision 1.187 > > diff -u -p -r1.187 kern_synch.c > > --- kern/kern_synch.c 13 May 2022 15:32:00 - 1.187 > > +++ kern/kern_synch.c 5 Jun 2022 05:04:45 - > > @@ -370,8 +370,8 @@ sleep_setup(struct sleep_state *sls, con > > p->p_slppri = prio & PRIMASK; > > TAILQ_INSERT_TAIL(&slpque[LOOKUP(ident)], p, p_runq); > > > > - KASSERT((p->p_flag & P_TIMEOUT) == 0); > > if (timo) { > > + KASSERT((p->p_flag & P_TIMEOUT) == 0); > > sls->sls_timeout = 1; > > timeout_add(&p->p_sleep_to, timo); > > } > > @@ -432,13 +432,12 @@ sleep_finish(struct sleep_state *sls, in > > > > if (sls->sls_timeout) { > > if (p->p_flag & P_TIMEOUT) { > > - atomic_clearbits_int(&p->p_flag, P_TIMEOUT); > > error1 = EWOULDBLOCK; > > } else { > > - /* This must not sleep. */ > > + /* This can sleep. It must not use timeouts. */ > > timeout_del_barrier(&p->p_sleep_to); > > - KASSERT((p->p_flag & P_TIMEOUT) == 0); > > }
Re: Fix clearing of sleep timeouts
On 05/06/22(Sun) 05:20, Visa Hankala wrote: > Encountered the following panic: > > panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: > file "/usr/src/sys/kern/kern_synch.c", line 373 > Stopped at db_enter+0x10: popq%rbp > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > 423109 57118 55 0x3 02 link > 330695 30276 55 0x3 03 link > * 46366 85501 55 0x1003 0x40804001 link > 188803 85501 55 0x1003 0x40820000K link > db_enter() at db_enter+0x10 > panic(81f25d2b) at panic+0xbf > __assert(81f9a186,81f372c8,175,81f87c6c) at > __assert+0x25 > sleep_setup(800022d64bf8,800022d64c98,20,81f66ac6,0) at > sleep_setup+0x1d8 > cond_wait(800022d64c98,81f66ac6) at cond_wait+0x46 > timeout_barrier(8000228a28b0) at timeout_barrier+0x109 > timeout_del_barrier(8000228a28b0) at timeout_del_barrier+0xa2 > sleep_finish(800022d64d90,1) at sleep_finish+0x16d > tsleep(823a5130,120,81f0b730,2) at tsleep+0xb2 > sys_nanosleep(8000228a27f0,800022d64ea0,800022d64ef0) at > sys_nanosleep+0x12d > syscall(800022d64f60) at syscall+0x374 > > The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously, > soft-interrupt-driven timeouts could be deleted synchronously without > blocking. Now, timeout_del_barrier() can sleep regardless of the type > of the timeout. > > It looks that with small adjustments timeout_del_barrier() can sleep > in sleep_finish(). The management of run queues is not affected because > the timeout clearing happens after it. As timeout_del_barrier() does not > rely on a timeout or signal catching, there should be no risk of > unbounded recursion or unwanted signal side effects within the sleep > machinery. In a way, a sleep with a timeout is higher-level than > one without. I trust you on the analysis. However this looks very fragile to me. The use of timeout_del_barrier() which can sleep using the global sleep queue is worrying me. > Note that endtsleep() can run and set P_TIMEOUT during > timeout_del_barrier() when the thread is blocked in cond_wait(). > To avoid unnecessary atomic read-modify-write operations, the clearing > of P_TIMEOUT could be conditional, but maybe that is an unnecessary > optimization at this point. I agree this optimization seems unnecessary at the moment. > While it should be possible to make the code use timeout_del() instead > of timeout_del_barrier(), the outcome might not be outright better. For > example, sleep_setup() and endtsleep() would have to coordinate so that > a late-running timeout from previous sleep cycle would not disturb the > new cycle. So that's the price for not having to sleep in sleep_finish(), right? > To test the barrier path reliably, I made the code call > timeout_del_barrier() twice in a row. The second call is guaranteed > to sleep. Of course, this is not part of the patch. ok mpi@ > Index: kern/kern_synch.c > === > RCS file: src/sys/kern/kern_synch.c,v > retrieving revision 1.187 > diff -u -p -r1.187 kern_synch.c > --- kern/kern_synch.c 13 May 2022 15:32:00 - 1.187 > +++ kern/kern_synch.c 5 Jun 2022 05:04:45 - > @@ -370,8 +370,8 @@ sleep_setup(struct sleep_state *sls, con > p->p_slppri = prio & PRIMASK; > TAILQ_INSERT_TAIL(&slpque[LOOKUP(ident)], p, p_runq); > > - KASSERT((p->p_flag & P_TIMEOUT) == 0); > if (timo) { > + KASSERT((p->p_flag & P_TIMEOUT) == 0); > sls->sls_timeout = 1; > timeout_add(&p->p_sleep_to, timo); > } > @@ -432,13 +432,12 @@ sleep_finish(struct sleep_state *sls, in > > if (sls->sls_timeout) { > if (p->p_flag & P_TIMEOUT) { > - atomic_clearbits_int(&p->p_flag, P_TIMEOUT); > error1 = EWOULDBLOCK; > } else { > - /* This must not sleep. */ > + /* This can sleep. It must not use timeouts. */ > timeout_del_barrier(&p->p_sleep_to); > - KASSERT((p->p_flag & P_TIMEOUT) == 0); > } > + atomic_clearbits_int(&p->p_flag, P_TIMEOUT); > } > > /* Check if thread was woken up because of a unwind or signal */ >
Fix clearing of sleep timeouts
Encountered the following panic: panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: file "/usr/src/sys/kern/kern_synch.c", line 373 Stopped at db_enter+0x10: popq%rbp TIDPIDUID PRFLAGS PFLAGS CPU COMMAND 423109 57118 55 0x3 02 link 330695 30276 55 0x3 03 link * 46366 85501 55 0x1003 0x40804001 link 188803 85501 55 0x1003 0x40820000K link db_enter() at db_enter+0x10 panic(81f25d2b) at panic+0xbf __assert(81f9a186,81f372c8,175,81f87c6c) at __assert+0x25 sleep_setup(800022d64bf8,800022d64c98,20,81f66ac6,0) at sleep_setup+0x1d8 cond_wait(800022d64c98,81f66ac6) at cond_wait+0x46 timeout_barrier(8000228a28b0) at timeout_barrier+0x109 timeout_del_barrier(8000228a28b0) at timeout_del_barrier+0xa2 sleep_finish(800022d64d90,1) at sleep_finish+0x16d tsleep(823a5130,120,81f0b730,2) at tsleep+0xb2 sys_nanosleep(8000228a27f0,800022d64ea0,800022d64ef0) at sys_nanosleep+0x12d syscall(800022d64f60) at syscall+0x374 The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously, soft-interrupt-driven timeouts could be deleted synchronously without blocking. Now, timeout_del_barrier() can sleep regardless of the type of the timeout. It looks that with small adjustments timeout_del_barrier() can sleep in sleep_finish(). The management of run queues is not affected because the timeout clearing happens after it. As timeout_del_barrier() does not rely on a timeout or signal catching, there should be no risk of unbounded recursion or unwanted signal side effects within the sleep machinery. In a way, a sleep with a timeout is higher-level than one without. Note that endtsleep() can run and set P_TIMEOUT during timeout_del_barrier() when the thread is blocked in cond_wait(). To avoid unnecessary atomic read-modify-write operations, the clearing of P_TIMEOUT could be conditional, but maybe that is an unnecessary optimization at this point. While it should be possible to make the code use timeout_del() instead of timeout_del_barrier(), the outcome might not be outright better. For example, sleep_setup() and endtsleep() would have to coordinate so that a late-running timeout from previous sleep cycle would not disturb the new cycle. To test the barrier path reliably, I made the code call timeout_del_barrier() twice in a row. The second call is guaranteed to sleep. Of course, this is not part of the patch. OK? Index: kern/kern_synch.c === RCS file: src/sys/kern/kern_synch.c,v retrieving revision 1.187 diff -u -p -r1.187 kern_synch.c --- kern/kern_synch.c 13 May 2022 15:32:00 - 1.187 +++ kern/kern_synch.c 5 Jun 2022 05:04:45 - @@ -370,8 +370,8 @@ sleep_setup(struct sleep_state *sls, con p->p_slppri = prio & PRIMASK; TAILQ_INSERT_TAIL(&slpque[LOOKUP(ident)], p, p_runq); - KASSERT((p->p_flag & P_TIMEOUT) == 0); if (timo) { + KASSERT((p->p_flag & P_TIMEOUT) == 0); sls->sls_timeout = 1; timeout_add(&p->p_sleep_to, timo); } @@ -432,13 +432,12 @@ sleep_finish(struct sleep_state *sls, in if (sls->sls_timeout) { if (p->p_flag & P_TIMEOUT) { - atomic_clearbits_int(&p->p_flag, P_TIMEOUT); error1 = EWOULDBLOCK; } else { - /* This must not sleep. */ + /* This can sleep. It must not use timeouts. */ timeout_del_barrier(&p->p_sleep_to); - KASSERT((p->p_flag & P_TIMEOUT) == 0); } + atomic_clearbits_int(&p->p_flag, P_TIMEOUT); } /* Check if thread was woken up because of a unwind or signal */