On 15/01/26(Thu) 14:36, Mark Kettenis wrote:
> > Date: Thu, 15 Jan 2026 11:17:11 +0100
> > From: Martin Pieuchot <[email protected]>
> > 
> > Hello Christian,
> > 
> > Thanks for this interesting report.  You seem to have found a case where
> > pool_get(9) with PR_NOWAIT might sleep...  See below.
> 
> Yes.  And this is a clear violation of contract.  Anyway, see below...
> 
> > On 13/01/26(Tue) 15:00, Christian Ludwig wrote:
> > > Hi,
> > > 
> > > I ran into the following panic on my Raspberry Pi Zero2W when compiling
> > > the kernel on a 2026-01-11 snapshot. Unfortunately, I do not know how to
> > > fix this.
> > > 
> > > 
> > >  - Christian
> > > 
> > > panic: assertwaitok: non-zero mutex count: 1
> > > Stopped at      db_enter+0x18:  brk     #0xf000
> > >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > >  433851  76208   1000         0x3          0    0  ccache
> > > *156464  96163   1000         0x3          0    2  cc
> > >  147282  40014   1000         0x3          0    1  cc
> > >  121706  68538      0     0x14000      0x200    3  sdmmc0
> > > db_enter() at panic+0x138
> > > panic() at assertwaitok+0xb8
> > > assertwaitok() at pool_get+0x34
> > > pool_get() at uvm_mapent_alloc+0x20c
> > > uvm_mapent_alloc() at uvm_map_clip_start+0x80
> > > uvm_map_clip_start() at uvm_unmap_remove+0x248
> > > uvm_unmap_remove() at uvm_unmap+0x64
> > > https://www.openbsd.org/ddb.html describes the minimum info required in 
> > > bug
> > > reports.  Insufficient info makes it difficult to find and fix bugs.
> > > ddb{2}> trace
> > > db_enter() at panic+0x138
> > > panic() at assertwaitok+0xb8
> > > assertwaitok() at pool_get+0x34
> > > pool_get() at uvm_mapent_alloc+0x20c
> > 
> > This seems to be the pool_get(9) for the kernel_map case.
> > 
> > > uvm_mapent_alloc() at uvm_map_clip_start+0x80
> > > uvm_map_clip_start() at uvm_unmap_remove+0x248
> > 
> > The issue here is that uvm_map_clip_start() expect uvm_mapent_alloc() to
> > return a new entry and possibly sleep.
> > 
> > > uvm_unmap_remove() at uvm_unmap+0x64
> > > uvm_unmap() at km_free+0x50
> > > km_free() at pool_p_alloc+0x1f4
> > 
> > I believe this km_free() correspond to pool_allocator_free() line 935 of
> > kern/subr_pool.c.
> 
> Yes, and I believe it is the pool_allocator_free() call in
> pool_p_alloc() that is the problem.  That means this is the
> !POOL_INPGHDR(pp) branch where we use a separate pool for the pool
> headers.  Apparently pool_get() on that separate pool fails, so we cool
> pool_allocator_free() to free the pool page that we just allocated.
> 
> Perhaps what we could is change the order and do the pool_get() before
> we call pool_allocator_alloc().  And if that fails, just return NULL.
> This does mean of course that if pool_allocator_alloc() fails we need
> to do a pool_put(), and that might trigger the same issue.  I thought
> there was a mechanism to defer freeing pool pages when we can't sleep,
> but I don't see that the current code.

This works for me.  It allows my arm64 with qwx(4) to boot without
triggering the bug.  I left a FIXME in the code such that we can improve
the new error code path to make sure km_free(9) is not called in this
case.

This also raise the question if all pool_put(9) are safe to call
km_free(9)...

ok?

Index: kern/subr_pool.c
===================================================================
RCS file: /cvs/src/sys/kern/subr_pool.c,v
diff -u -p -r1.242 subr_pool.c
--- kern/subr_pool.c    1 Aug 2025 19:00:38 -0000       1.242
+++ kern/subr_pool.c    25 Jan 2026 13:04:25 -0000
@@ -923,19 +923,28 @@ pool_p_alloc(struct pool *pp, int flags,
        pl_assert_unlocked(pp, &pp->pr_lock);
        KASSERT(pp->pr_size >= sizeof(*pi));
 
+       if (!POOL_INPGHDR(pp)) {
+               ph = pool_get(&phpool, flags);
+               if (ph == NULL)
+                       return (NULL);
+       }
+
        addr = pool_allocator_alloc(pp, flags, slowdown);
-       if (addr == NULL)
+       if (addr == NULL) {
+               if (!POOL_INPGHDR(pp)) {
+                       /*
+                        * FIXME: this can result in pool_p_free() calling
+                        * pool_allocator_free() then calling km_free(9).
+                        * km_free(9) might need to sleep and this is not
+                        * allowed with PR_NOWAIT.
+                        */
+                       pool_put(&phpool, ph);
+               }
                return (NULL);
+       }
 
        if (POOL_INPGHDR(pp))
                ph = (struct pool_page_header *)(addr + pp->pr_phoffset);
-       else {
-               ph = pool_get(&phpool, flags);
-               if (ph == NULL) {
-                       pool_allocator_free(pp, addr);
-                       return (NULL);
-               }
-       }
 
        XSIMPLEQ_INIT(&ph->ph_items);
        ph->ph_page = addr;


Reply via email to