Hello Christian,

Thanks for this interesting report.  You seem to have found a case where
pool_get(9) with PR_NOWAIT might sleep...  See below.
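
To make the recursion easier to see, here is a rough sketch of the call
chain from your trace (simplified pseudo-C; the mutex and pool names are
only illustrative, not the real arm64 pmap code):

	/*
	 * The outer pool_get(9) is called with a mutex held and asks for
	 * PR_NOWAIT, but refilling the pool ends up in a nested
	 * pool_get(9) that is allowed to sleep, which is what
	 * assertwaitok() catches.
	 */
	mtx_enter(&pm->pm_mtx);			/* illustrative pmap mutex */
	vp = pool_get(&vp_pool, PR_NOWAIT);	/* pmap_vp_enter(): no sleep */
	/*
	 *  pool_do_get() -> pool_p_alloc() -> km_free() -> uvm_unmap() ->
	 *  uvm_unmap_remove() -> uvm_map_clip_start() ->
	 *  uvm_mapent_alloc() -> pool_get(..., PR_WAITOK)
	 *
	 * and that last pool_get() may sleep while the mutex is held.
	 */
	mtx_leave(&pm->pm_mtx);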

On 13/01/26(Tue) 15:00, Christian Ludwig wrote:
> Hi,
> 
> I ran into the following panic on my Raspberry Pi Zero2W when compiling
> the kernel on a 2026-01-11 snapshot. Unfortunately, I do not know how to
> fix this.
> 
> 
>  - Christian
> 
> panic: assertwaitok: non-zero mutex count: 1
> Stopped at      db_enter+0x18:  brk     #0xf000
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
>  433851  76208   1000         0x3          0    0  ccache
> *156464  96163   1000         0x3          0    2  cc
>  147282  40014   1000         0x3          0    1  cc
>  121706  68538      0     0x14000      0x200    3  sdmmc0
> db_enter() at panic+0x138
> panic() at assertwaitok+0xb8
> assertwaitok() at pool_get+0x34
> pool_get() at uvm_mapent_alloc+0x20c
> uvm_mapent_alloc() at uvm_map_clip_start+0x80
> uvm_map_clip_start() at uvm_unmap_remove+0x248
> uvm_unmap_remove() at uvm_unmap+0x64
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{2}> trace
> db_enter() at panic+0x138
> panic() at assertwaitok+0xb8
> assertwaitok() at pool_get+0x34
> pool_get() at uvm_mapent_alloc+0x20c

This seems to be the pool_get(9) for the kernel_map case.

> uvm_mapent_alloc() at uvm_map_clip_start+0x80
> uvm_map_clip_start() at uvm_unmap_remove+0x248

The issue here is that uvm_map_clip_start() expects uvm_mapent_alloc() to
return a new entry, and uvm_mapent_alloc() may sleep to do so.
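
Schematically, clipping splits one entry in two, so it always needs a
fresh entry (a simplified sketch, not the actual uvm_map.c code; the
bookkeeping details are omitted):

	/* uvm_map_clip_start(map, entry, addr): split "entry" at "addr". */
	new = uvm_mapent_alloc(map, 0);	/* for kernel_map this is the
					 * pool_get(9) that may sleep */
	*new = *entry;			/* copy the old entry */
	new->end = addr;		/* "new" covers [start, addr) */
	entry->start = addr;		/* "entry" now covers [addr, end) */
	/* ... link "new" into the map in front of "entry" ... */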

> uvm_unmap_remove() at uvm_unmap+0x64
> uvm_unmap() at km_free+0x50
> km_free() at pool_p_alloc+0x1f4

I believe this km_free() corresponds to pool_allocator_free() at line 935
of kern/subr_pool.c.
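
If so, that would be the error path where pool_p_alloc() already got a
page from the allocator but then fails to allocate the page header and
has to give the page back.  Roughly (paraphrased from memory, not a
verbatim quote of subr_pool.c):

	addr = pool_allocator_alloc(pp, flags, slowdown);
	if (addr == NULL)
		return (NULL);

	ph = pool_get(&phpool, flags);		/* off-page header, can fail
						 * with PR_NOWAIT */
	if (ph == NULL) {
		pool_allocator_free(pp, addr);	/* for the default allocator
						 * this ends up in km_free(9)
						 * and may sleep via
						 * uvm_unmap() */
		return (NULL);
	}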


It's not clear to me how we should address this issue.  Having a km_free(9)
that is less likely to sleep seems like a good thing to me.  However, it's
not obvious how to achieve that as long as uvm_unmap_remove() might need to
clip entries, and clipping needs to allocate.

Maybe this could be worked around at the pool level?
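
For example, and this is only a rough, completely untested idea (the
"defer" helper below is made up): in the PR_NOWAIT failure path the pool
could remember the page instead of freeing it synchronously, and let a
context that is allowed to sleep give it back later:

	ph = pool_get(&phpool, flags);
	if (ph == NULL) {
		if (ISSET(flags, PR_WAITOK)) {
			/* sleeping in km_free(9) is fine here */
			pool_allocator_free(pp, addr);
		} else {
			/*
			 * Hypothetical helper: queue the page on a per-pool
			 * list and free it later from a sleepable context.
			 */
			pool_defer_free(pp, addr);
		}
		return (NULL);
	}

I haven't thought about where such a deferred free would actually run,
though.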

Mark, David, what do you think?

> pool_p_alloc() at pool_do_get+0x20c
> pool_do_get() at pool_get+0x8c
> pool_get() at pmap_vp_enter+0x17c
> pmap_vp_enter() at pmap_enter+0x1ac
> pmap_enter() at uvm_fault_lower+0x220
> uvm_fault_lower() at uvm_fault+0x158
> uvm_fault() at udata_abort+0x128
> udata_abort() at do_el0_sync+0x100
> do_el0_sync() at handle_el0_sync+0x70
> handle_el0_sync() at __ALIGN_SIZE+0x4b689f8
> --- trap ---
> end of kernel
> ddb{2}> show uvm
> Current UVM status:
>   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
>   101363 VM pages: 42267 active, 18301 inactive, 1 wired, 7 free (1 zero)
>   freemin=3378, free-target=4504, inactive-target=20000, wired-max=33787
>   faults=10832044, traps=74264382, intrs=2185293, ctxswitch=1898507 fpuswitch=0
>   softint=881358, syscalls=54181390, kmapent=15
>   fault counts:
>     noram=46834246, noanon=0, noamap=0, pgwait=0, pgrele=0
>     relocks=135632(521), upgrades=454828(1009) anget(retries)=5338133(0), amapcopy=1590537
>     neighbor anon/obj pg=2294476/7526313, gets(lock/unlock)=2493095/137450
>     cases: anon=4664089, anoncow=674044, obj=2156623, prcopy=333643, przero=3003868
>   daemon and swap counts:
>     woke=24230, revs=241, scans=14585, obscans=5500, anscans=9078
>     busy=0, freed=6930, reactivate=0, deactivate=22337
>     pageouts=376, pending=375, nswget=0
>     nswapdev=1
>     swpages=2097152, swpginuse=5995, swpgonly=3083 paging=16
>   kernel pointers:
>     objs(kern)=0xffffff80012aae08
> 

