On Mon, Dec 01, 2025 at 03:23:18PM +0100, Martin Pieuchot wrote:
> Thanks a lot for this report.  It helps me a lot to understand the
> existing limitation of OpenBSD's pdaemon.
It passes regress on i386 with the machine that panicked before.

I tried make release with this diff.  After some time I lost the ssh
connection.  Note that the SSH timeout is configured rather short in my
setup.  This happened quite often before.  It is a short hang; the
machine reacts normally after a while.

===> gnu/usr.bin/clang/include/llvm/X86
/usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen -gen-subtarget -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86 -o X86GenSubtargetInfo.inc /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
/usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen -gen-register-info -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86 -o X86GenRegisterInfo.inc /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
/usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen -gen-register-bank -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86 -o X86GenRegisterBank.inc /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
/usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen -gen-x86-mnemonic-tables -asmwriternum=1 -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86 -o X86GenMnemonicTables.inc /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
/usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen -gen-instr-info -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86 -o X86GenInstrInfo.inc /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
/usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen -gen-global-isel -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86 -o X86GenGlobalISel.inc /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
/usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen -gen-fast-isel -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86 -o X86GenFastISel.inc /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
/usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen -gen-exegesis -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86 -o X86GenExegesis.inc /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
Timeout, server ot2 not responding.

So I tried to rebuild the directory src/gnu/usr.bin/clang/include/llvm/X86,
and then a top(1) output got stuck for a while and later continued.
load averages:  1.74,  0.75,  1.13            ot2.obsd-lab.genua.de 17:59:55
53 processes: 52 idle, 1 on processor                  up 0 days 01:06:29
CPU0:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr, 100% idle
CPU1:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr, 100% idle
CPU2:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr, 100% idle
CPU3:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr, 100% idle
CPU4:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr, 100% idle
CPU5:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr, 100% idle
CPU6:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr, 100% idle
CPU7:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr, 100% idle
Memory: Real: 2235M/3223M act/tot Free: 32K Cache: 161M Swap: 399M/3556M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
 2595 root     -18    0  432M  123M sleep/6   flt_nor   0:16 28.32% llvm-tblgen
56241 root     -18    0  431M  112M sleep/6   flt_nor   0:15 26.90% llvm-tblgen
40502 root     -18    0  431M  105M sleep/6   flt_nor   0:15 26.61% llvm-tblgen
61936 root     -18    0  459M   54M sleep/6   flt_nor   0:15 25.78% llvm-tblgen
54450 root     -18    0  431M  111M sleep/7   flt_nor   0:15 25.63% llvm-tblgen
33131 root     -18    0  431M   70M sleep/6   flt_nor   0:15 25.59% llvm-tblgen
32823 root     -18    0  602M  208M sleep/6   flt_nor   0:14 22.75% llvm-tblgen
14235 root       2    0 1628K 1212K sleep/1   kqread    0:03  5.62% sshd-sessio
91133 root       2    0 2280K 2072K sleep/3   kqread    0:02  3.27% tmux

At least no crash, but it is hard to tell whether the situation has
improved.  I have seen such hangs before; this is not a regression.

On the 12 CPU machine where I saw the crash before, I am nearly hitting
the end of swap.  Maybe that is another can of worms.
load averages:  0.17,  0.26,  1.04            ot4.obsd-lab.genua.de 20:02:31
86 processes: 85 idle, 1 on processor                  up 0 days 03:45:33
12 CPUs:  0.0% user,  0.0% nice,  0.8% sys,  0.0% spin,  0.0% intr, 99.2% idle
Memory: Real: 1653M/2858M act/tot Free: 133M Cache: 149M Swap: 3272M/3319M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
93949 build     -5    0  458M  235M sleep/4   biowait   0:19  0.68% llvm-tblgen
65515 build     -5    0  463M  105M sleep/3   biowait   0:21  0.24% llvm-tblgen
36109 build     -5    0  767M  324M sleep/9   biowait   0:20  0.05% llvm-tblgen
89976 build     -5    0  465M  111M sleep/2   biowait   0:20  0.05% llvm-tblgen
77887 build     -5    0  458M  206M sleep/6   biowait   0:19  0.05% llvm-tblgen
43170 build     -5    0  132M 8652K sleep/7   biowait   0:19  0.05% llvm-tblgen
98370 build     -5    0  463M   91M sleep/2   biowait   0:19  0.05% llvm-tblgen
13101 root       2    0 1684K  808K sleep/2   kqread    0:06  0.05% sshd-sessio
54645 root      29    0 1360K 2184K onproc/1  -         0:01  0.05% top
47918 build     -5    0  427M  244M sleep/11  biowait   0:21  0.00% llvm-tblgen
 6953 build     -5    0  465M  100M sleep/3   biowait   0:19  0.00% llvm-tblgen
54170 build     -5    0   95M 6096K sleep/5   biowait   0:19  0.00% llvm-tblgen
78266 build     -5    0  154M 8956K sleep/2   biowait   0:18  0.00% llvm-tblgen
24448 build     -5    0  127M 7016K sleep/2   biowait   0:18  0.00% llvm-tblgen
77594 root       2    0 3532K 2348K sleep/1   kqread    0:11  0.00% tmux
31856 _snmpd     2    0 4652K  908K sleep/1   kqread    0:01  0.00% snmpd
75039 _syslogd   2    0 1352K  568K sleep/2   kqread    0:01  0.00% syslogd
11862 root       2    0  980K 1464K idle      kqread    0:00  0.00% sshd

When looking at the syzkaller mailing list or at my own testing, I have
the impression that we currently have more stability problems in the
kernel than usual.  But they occur randomly; it is hard to find the
moment when they started or what caused them.  I cannot test a single
diff in this area and say whether it fixes anything.  I might find
regressions.  The only way to move forward is to fix bugs, commit the
fixes, and watch the stability of all test machines.
This includes syzkaller, my setup, anton's machines, and of course all
the snapshot users who run current.

bluhm

> On 29/11/25(Sat) 00:45, Alexander Bluhm wrote:
> > Hi,
> > 
> > My i386 test machine crashed during make build.
> > 
> > ===> gnu/usr.bin/clang/include/llvm/X86
> > ...
> > /usr/src/gnu/usr.bin/clang/include/llvm/X86/obj/../../../llvm-tblgen/llvm-tblgen
> > -gen-asm-writer -asmwriternum=1
> > -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/include
> > -I/usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86
> > -o X86GenAsmWriter1.inc
> > /usr/src/gnu/usr.bin/clang/include/llvm/X86/../../../../../llvm/llvm/lib/Target/X86/X86.td
> > Timeout, server ot4 not responding.
> > 2025-11-28T16:04:30Z Command 'ssh root@ot4 cd /usr/src && time nice make -j
> > 13 build' failed: 65280 at /data/test/regress/bin/Machine.pm line 232.
> > 
> > panic: uao_find_swhash_elt: can't allocate entry
> > Stopped at      db_enter+0x4:   popl    %ebp
> >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > *294827   5534      0     0x14000      0x200  11K  pagedaemon
> > db_enter() at db_enter+0x4
> > panic(d0c972a3) at panic+0x7a
> > uao_set_swslot(d1044a74,26e9a,1a63) at uao_set_swslot+0x184
> > uvmpd_scan_inactive(0,15a3) at uvmpd_scan_inactive+0x7b7
> > uvmpd_scan(0,15a3,73a) at uvmpd_scan+0x3a
> > uvm_pageout(d6c4f1b0) at uvm_pageout+0x29b
> > 
> > https://www.openbsd.org/ddb.html describes the minimum info required in bug
> > reports.  Insufficient info makes it difficult to find and fix bugs.
> 
> Diff below brings some more glue from NetBSD to avoid this problem.  It
> removes the incorrect panic from oga@.  The page daemon can hit such limit
> because releasing pages is asynchronous.
> 
> Could you please give it a go and report the next panic you find?
> 
> Thanks a lot!
> 
> Index: uvm/uvm_pdaemon.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
> diff -u -p -r1.138 uvm_pdaemon.c
> --- uvm/uvm_pdaemon.c	5 Oct 2025 14:13:22 -0000	1.138
> +++ uvm/uvm_pdaemon.c	1 Dec 2025 14:04:07 -0000
> @@ -414,6 +414,94 @@ uvmpd_trylockowner(struct vm_page *pg)
> 	return slock;
> }
> 
> +struct swapcluster {
> +	int swc_slot;
> +	int swc_nallocated;
> +	int swc_nused;
> +	struct vm_page *swc_pages[SWCLUSTPAGES];
> +};
> +
> +void
> +swapcluster_init(struct swapcluster *swc)
> +{
> +	swc->swc_slot = 0;
> +	swc->swc_nused = 0;
> +}
> +
> +int
> +swapcluster_allocslots(struct swapcluster *swc)
> +{
> +	int slot, npages;
> +
> +	if (swc->swc_slot != 0)
> +		return 0;
> +
> +	npages = SWCLUSTPAGES;
> +	slot = uvm_swap_alloc(&npages, TRUE);
> +	if (slot == 0)
> +		return ENOMEM;
> +
> +	swc->swc_slot = slot;
> +	swc->swc_nallocated = npages;
> +	swc->swc_nused = 0;
> +
> +	return 0;
> +}
> +
> +int
> +swapcluster_add(struct swapcluster *swc, struct vm_page *pg)
> +{
> +	int slot;
> +	struct uvm_object *uobj;
> +
> +	KASSERT(swc->swc_slot != 0);
> +	KASSERT(swc->swc_nused < swc->swc_nallocated);
> +	KASSERT((pg->pg_flags & PQ_SWAPBACKED) != 0);
> +
> +	slot = swc->swc_slot + swc->swc_nused;
> +	uobj = pg->uobject;
> +	if (uobj == NULL) {
> +		KASSERT(rw_write_held(pg->uanon->an_lock));
> +		pg->uanon->an_swslot = slot;
> +	} else {
> +		int result;
> +
> +		KASSERT(rw_write_held(uobj->vmobjlock));
> +		result = uao_set_swslot(uobj, pg->offset >> PAGE_SHIFT, slot);
> +		if (result == -1)
> +			return ENOMEM;
> +	}
> +	swc->swc_pages[swc->swc_nused] = pg;
> +	swc->swc_nused++;
> +
> +	return 0;
> +}
> +
> +void
> +swapcluster_flush(struct swapcluster *swc)
> +{
> +	int slot;
> +	int nused;
> +	int nallocated;
> +
> +	if (swc->swc_slot == 0)
> +		return;
> +	KASSERT(swc->swc_nused <= swc->swc_nallocated);
> +
> +	slot = swc->swc_slot;
> +	nused = swc->swc_nused;
> +	nallocated = swc->swc_nallocated;
> +
> +	if (nused < nallocated)
> +		uvm_swap_free(slot + nused, nallocated - nused);
> +}
> +
> +static inline int
> +swapcluster_nused(struct swapcluster *swc)
> +{
> +	return swc->swc_nused;
> +}
> +
> /*
>  * uvmpd_dropswap: free any swap allocated to this page.
>  *
> @@ -497,10 +585,8 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
> 	struct uvm_object *uobj;
> 	struct vm_page *pps[SWCLUSTPAGES], **ppsp;
> 	int npages;
> -	struct vm_page *swpps[SWCLUSTPAGES];	/* XXX: see below */
> +	struct swapcluster swc;
> 	struct rwlock *slock;
> -	int swnpages, swcpages;			/* XXX: see below */
> -	int swslot;
> 	struct vm_anon *anon;
> 	boolean_t swap_backed;
> 	vaddr_t start;
> @@ -511,8 +597,7 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
> 	 * to stay in the loop while we have a page to scan or we have
> 	 * a swap-cluster to build.
> 	 */
> -	swslot = 0;
> -	swnpages = swcpages = 0;
> +	swapcluster_init(&swc);
> 	dirtyreacts = 0;
> 	p = NULL;
> 
> @@ -532,7 +617,7 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
> 
> 	/* Insert iterator. */
> 	TAILQ_INSERT_AFTER(pglst, p, &iter, pageq);
> -	for (; p != NULL || swslot != 0; p = uvmpd_iterator(pglst, p, &iter)) {
> +	for (; p != NULL || swc.swc_slot != 0; p = uvmpd_iterator(pglst, p,
> &iter)) {
> 		/*
> 		 * note that p can be NULL iff we have traversed the whole
> 		 * list and need to do one final swap-backed clustered pageout.
> @@ -544,9 +629,10 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
> 		 * see if we've met our target
> 		 */
> 		if ((uvmpd_pma_done(pma) &&
> -		    (uvmexp.paging >= (shortage - freed))) ||
> +		    (uvmexp.paging + swapcluster_nused(&swc)
> +		    >= (shortage - freed))) ||
> 		    dirtyreacts == UVMPD_NUMDIRTYREACTS) {
> -			if (swslot == 0) {
> +			if (swc.swc_slot == 0) {
> 				/* exit now if no swap-i/o pending */
> 				break;
> 			}
> @@ -701,35 +787,30 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
> 				uvmpd_dropswap(p);
> 
> 				/* start new cluster (if necessary) */
> -				if (swslot == 0) {
> -					swnpages = SWCLUSTPAGES;
> -					swslot = uvm_swap_alloc(&swnpages,
> -					    TRUE);
> -					if (swslot == 0) {
> -						/* no swap?  give up! */
> -						atomic_clearbits_int(
> -						    &p->pg_flags,
> -						    PG_BUSY);
> -						UVM_PAGE_OWN(p, NULL);
> -						rw_exit(slock);
> -						continue;
> -					}
> -					swcpages = 0;	/* cluster is empty */
> +				if (swapcluster_allocslots(&swc)) {
> +					atomic_clearbits_int(&p->pg_flags,
> +					    PG_BUSY);
> +					UVM_PAGE_OWN(p, NULL);
> +					dirtyreacts++;
> +					uvm_pageactivate(p);
> +					rw_exit(slock);
> +					continue;
> 				}
> 
> 				/* add block to cluster */
> -				swpps[swcpages] = p;
> -				if (anon)
> -					anon->an_swslot = swslot + swcpages;
> -				else
> -					uao_set_swslot(uobj,
> -					    p->offset >> PAGE_SHIFT,
> -					    swslot + swcpages);
> -				swcpages++;
> +				if (swapcluster_add(&swc, p)) {
> +					atomic_clearbits_int(&p->pg_flags,
> +					    PG_BUSY);
> +					UVM_PAGE_OWN(p, NULL);
> +					dirtyreacts++;
> +					uvm_pageactivate(p);
> +					rw_exit(slock);
> +					continue;
> +				}
> 				rw_exit(slock);
> 
> 				/* cluster not full yet? */
> -				if (swcpages < swnpages)
> +				if (swc.swc_nused < swc.swc_nallocated)
> 					continue;
> 			}
> 		} else {
> @@ -748,17 +829,14 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
> 		 */
> 		if (swap_backed) {
> 			/* starting I/O now... set up for it */
> -			npages = swcpages;
> -			ppsp = swpps;
> +			npages = swc.swc_nused;
> +			ppsp = swc.swc_pages;
> 			/* for swap-backed pages only */
> -			start = (vaddr_t) swslot;
> +			start = (vaddr_t) swc.swc_slot;
> 
> 			/* if this is final pageout we could have a few
> 			 * extra swap blocks */
> -			if (swcpages < swnpages) {
> -				uvm_swap_free(swslot + swcpages,
> -				    (swnpages - swcpages));
> -			}
> +			swapcluster_flush(&swc);
> 		} else {
> 			/* normal object pageout */
> 			ppsp = pps;
> @@ -794,9 +872,8 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
> 		 * if we did i/o to swap, zero swslot to indicate that we are
> 		 * no longer building a swap-backed cluster.
> 		 */
> -
> 		if (swap_backed)
> -			swslot = 0;		/* done with this cluster */
> +			swapcluster_init(&swc);	/* done with this cluster */
> 
> 		/*
> 		 * first, we check for VM_PAGER_PEND which means that the
> Index: uvm/uvm_aobj.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v
> diff -u -p -r1.119 uvm_aobj.c
> --- uvm/uvm_aobj.c	10 Nov 2025 10:53:53 -0000	1.119
> +++ uvm/uvm_aobj.c	1 Dec 2025 11:15:02 -0000
> @@ -142,7 +142,7 @@ struct uvm_aobj {
> struct pool uvm_aobj_pool;
> 
> static struct uao_swhash_elt *uao_find_swhash_elt(struct uvm_aobj *, int,
> -		    boolean_t);
> +		    boolean_t, boolean_t);
> static boolean_t	uao_flush(struct uvm_object *, voff_t,
> 		    voff_t, int);
> static void	uao_free(struct uvm_aobj *);
> @@ -197,10 +197,12 @@ static struct mutex uao_list_lock = MUTE
>  * offset.
>  */
> static struct uao_swhash_elt *
> -uao_find_swhash_elt(struct uvm_aobj *aobj, int pageidx, boolean_t create)
> +uao_find_swhash_elt(struct uvm_aobj *aobj, int pageidx, boolean_t create,
> +    boolean_t wait)
> {
> 	struct uao_swhash *swhash;
> 	struct uao_swhash_elt *elt;
> +	int waitf = wait ? PR_WAITOK : PR_NOWAIT;
> 	voff_t page_tag;
> 
> 	swhash = UAO_SWHASH_HASH(aobj, pageidx); /* first hash to get bucket */
> @@ -220,17 +222,9 @@ uao_find_swhash_elt(struct uvm_aobj *aob
> 		/*
> 		 * allocate a new entry for the bucket and init/insert it in
> 		 */
> -		elt = pool_get(&uao_swhash_elt_pool, PR_NOWAIT | PR_ZERO);
> -		/*
> -		 * XXX We cannot sleep here as the hash table might disappear
> -		 * from under our feet.  And we run the risk of deadlocking
> -		 * the pagedeamon.  In fact this code will only be called by
> -		 * the pagedaemon and allocation will only fail if we
> -		 * exhausted the pagedeamon reserve.  In that case we're
> -		 * doomed anyway, so panic.
> -		 */
> +		elt = pool_get(&uao_swhash_elt_pool, waitf | PR_ZERO);
> 		if (elt == NULL)
> -			panic("%s: can't allocate entry", __func__);
> +			return NULL;
> 		LIST_INSERT_HEAD(swhash, elt, list);
> 		elt->tag = page_tag;
> 
> @@ -258,7 +252,7 @@ uao_find_swslot(struct uvm_object *uobj,
> 	 */
> 	if (UAO_USES_SWHASH(aobj)) {
> 		struct uao_swhash_elt *elt =
> -		    uao_find_swhash_elt(aobj, pageidx, FALSE);
> +		    uao_find_swhash_elt(aobj, pageidx, FALSE, FALSE);
> 
> 		if (elt)
> 			return UAO_SWHASH_ELT_PAGESLOT(elt, pageidx);
> @@ -284,6 +278,7 @@ int
> uao_set_swslot(struct uvm_object *uobj, int pageidx, int slot)
> {
> 	struct uvm_aobj *aobj = (struct uvm_aobj *)uobj;
> +	struct uao_swhash_elt *elt;
> 	int oldslot;
> 
> 	KASSERT(rw_write_held(uobj->vmobjlock) || uobj->uo_refs == 0);
> @@ -310,11 +305,9 @@ uao_set_swslot(struct uvm_object *uobj,
> 		 * the page had not swap slot in the first place, and
> 		 * we are freeing.
> 		 */
> -		struct uao_swhash_elt *elt =
> -		    uao_find_swhash_elt(aobj, pageidx, slot ? TRUE : FALSE);
> +		elt = uao_find_swhash_elt(aobj, pageidx, slot != 0, FALSE);
> 		if (elt == NULL) {
> -			KASSERT(slot == 0);
> -			return 0;
> +			return slot ? -1 : 0;
> 		}
> 
> 		oldslot = UAO_SWHASH_ELT_PAGESLOT(elt, pageidx);
> @@ -465,7 +458,7 @@ uao_shrink_convert(struct uvm_object *uo
> 
> 	/* Convert swap slots from hash to array. */
> 	for (i = 0; i < pages; i++) {
> -		elt = uao_find_swhash_elt(aobj, i, FALSE);
> +		elt = uao_find_swhash_elt(aobj, i, FALSE, FALSE);
> 		if (elt != NULL) {
> 			new_swslots[i] = UAO_SWHASH_ELT_PAGESLOT(elt, i);
> 			if (new_swslots[i] != 0)
> @@ -622,12 +615,12 @@ uao_grow_convert(struct uvm_object *uobj
> 
> 	/* Set these now, so we can use uao_find_swhash_elt(). */
> 	old_swslots = aobj->u_swslots;
> -	aobj->u_swhash = new_swhash;
> +	aobj->u_swhash = new_swhash;
> 	aobj->u_swhashmask = new_hashmask;
> 
> 	for (i = 0; i < aobj->u_pages; i++) {
> 		if (old_swslots[i] != 0) {
> -			elt = uao_find_swhash_elt(aobj, i, TRUE);
> +			elt = uao_find_swhash_elt(aobj, i, TRUE, TRUE);
> 			elt->count++;
> 			UAO_SWHASH_ELT_PAGESLOT(elt, i) = old_swslots[i];
> 		}
>
