> Date: Wed, 31 Dec 2025 06:42:00 +0000
> From: Miod Vallat <[email protected]>
>
> I can still reproduce landisk stalling during a cvs update, after having
> baked a full build and xenocara. This time with the kernel-side
> traceback of the process stuck in pmrwait.
>
> ddb> ps
>    PID     TID   PPID    UID  S       FLAGS  WAIT       COMMAND
>  22735  236842  22266   1500  3    0x100003  pmrwait    ssh
>  22266   24911   8155   1500  3    0x100003  biowait    cvs
>  25498  116159  90374   1500  3    0x100083  ttyin      ksh
>   8155  467717  90374   1500  3    0x10008b  sigsusp    ksh
>  90374  138597  98462   1500  3        0x10  biowait    sshd-session
>  98462  191897  15001      0  3        0x92  kqread     sshd-session
>  57416  152946      1      0  3    0x100003  biowait    getty
>  42265  403899      1      0  3    0x100098  kqread     cron
>  99827  455058      1      0  3    0x100090  kqread     inetd
>  70128  215727   4854     95  3   0x1100092  kqread     smtpd
>   8281  217519   4854    103  3   0x1100092  kqread     smtpd
>  76435   92448   4854     95  3   0x1100092  kqread     smtpd
>   1915  373935   4854     95  3    0x100092  kqread     smtpd
>  91271  501146   4854     95  3   0x1100092  kqread     smtpd
>  36815  115371   4854     95  3   0x1100092  kqread     smtpd
>   4854  296662      1      0  3    0x100080  kqread     smtpd
>  15001  426794      1      0  3        0x88  kqread     sshd
>   5586  268874      0      0  3     0x14280  nfsidl     nfsio
>  86082   74082      0      0  3     0x14280  nfsidl     nfsio
>   4201  354622      0      0  3     0x14280  nfsidl     nfsio
>   9170    1439      0      0  3     0x14280  nfsidl     nfsio
>   7090  276569      1      0  3           0  biowait    ypbind
>  47046  449110      1     28  3   0x1100010  biowait    portmap
>  95989   36830      1      0  3    0x100080  kqread     ntpd
>  53651  165033  76253     83  3    0x100092  kqread     ntpd
>  76253  352661      1     83  3   0x1100092  kqread     ntpd
>  39789  280571   4078     74  3   0x1100092  bpf        pflogd
>   4078   55888      1      0  3        0x80  sbwait     pflogd
>  84079  106662  16372     73  3   0x1100090  kqread     syslogd
>  16372   35927      1      0  3    0x100082  sbwait     syslogd
>  78962  437988  31797    115  3    0x100092  kqread     slaacd
>  85919  478332  31797    115  3    0x100092  kqread     slaacd
>  31797  480049      1      0  3    0x100080  kqread     slaacd
>  27846  437548      0      0  3     0x14200  bored      smr
>  41958  344917      0      0  3     0x14200  pgzero     zerothread
>   6076  129835      0      0  3     0x14200  aiodoned   aiodoned
>  84866  363355      0      0  3     0x14200  syncer     update
>  87134  418449      0      0  3     0x14200  cleaner    cleaner
>  53321  503280      0      0  3     0x14200  reaper     reaper
>  88581   91939      0      0  3     0x14200  pgdaemon   pagedaemon
>  45312  521165      0      0  3     0x14200  usbtsk     usbtask
>  37631  261918      0      0  3     0x14200  usbatsk    usbatsk
>  18147  310057      0      0  3     0x14200  bored      softnet0
>  37766  422281      0      0  3     0x14200  bored      systqmp
>  83868   57728      0      0  3     0x14200  bored      systq
>  62182   91648      0      0  3  0x40014200  tmoslp     softclock
> *  973  455824      0      0  7  0x40014200             idle0
>      1  152093      0      0  3        0x82  wait       init
>      0       0     -1      0  3     0x10200  scheduler  swapper
> ddb> show uvmexp
> Current UVM status:
>   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
>   14864 VM pages: 3 active, 1 inactive, 1 wired, 10118 free (1280 zero)
>   freemin=495, free-target=660, inactive-target=661, wired-max=4954
>   faults=145296324, traps=74992724, intrs=26647938, ctxswitch=14048753 fpuswitch=0
>   softint=12776668, syscalls=74992722, kmapent=8
>   fault counts:
>     noram=996715, noanon=0, noamap=0, pgwait=23949, pgrele=0
>     relocks=1867838(14104), upgrades=0(0) anget(retries)=57776611(585752), amapcopy=23259327
>     neighbor anon/obj pg=39185700/89679236, gets(lock/unlock)=26907893/1286410
>     cases: anon=47576129, anoncow=10190414, obj=23737048, prcopy=3166521, przero=60563995
>   daemon and swap counts:
>     woke=1566238, revs=284861851, scans=2683067, obscans=506886, anscans=1811998
>     busy=0, freed=1421072, reactivate=363889, deactivate=8877746
>     pageouts=179395, pending=144550, nswget=539823
>     nswapdev=1
>     swpages=4194415, swpginuse=4625, swpgonly=4621 paging=3
>   kernel pointers:
>     objs(kern)=0x8c3cccfc
> ddb> show bcstats
> Current Buffer Cache status:
> numbufs 183 busymapped 1, delwri 0
> kvaslots 185 avail kva slots 184
> bufpages 678, dmapages 678, dirtypages 0
> pendingreads 1, pendingwrites 1
> highflips 0, highflops 0, dmaflips 0
> ddb> tr /t 0t236842
> mi_switch() at mi_switch+0x8a
> sleep_finish() at sleep_finish+0xb8
> msleep_nsec() at msleep_nsec+0xdc
> uvm_wait_pla() at uvm_wait_pla+0x90
> uvm_pmr_getpages() at uvm_pmr_getpages+0x87a
> km_alloc() at km_alloc+0x24c
> pool_multi_alloc() at pool_multi_alloc+0x7a
> m_pool_alloc() at m_pool_alloc+0x36
> pool_allocator_alloc() at pool_allocator_alloc+0x18
> pool_p_alloc() at pool_p_alloc+0x3e
> pool_do_get() at pool_do_get+0x174
> pool_get() at pool_get+0xba
> m_clget() at m_clget+0x38
> m_getuio() at m_getuio+0xa6
> sosend() at sosend+0x218
> soo_write() at soo_write+0x28
> dofilewritev() at dofilewritev+0x7e
> sys_write() at sys_write+0x40
> syscall() at syscall+0x2ca
> (EXPEVT 160; SSR=00000001) at 0x39dd9a30
So you're allocating a (potentially) largish mbuf cluster. The mbuf pools use &kp_dma_contig, which means they ask for physically contiguous memory. Even with the smallest (2k) clusters I believe the pool allocates "pages" of at least 16k, and larger cluster sizes will ask for much larger pool pages.

I think this means that physical memory got fragmented to the point where it is no longer possible to allocate one of the larger physically contiguous pool pages. This has always been a problem, and there is no easy solution. It is very well possible that the allocation patterns in the kernel changed over time such that fragmentation became more likely, but it is unfair to blame mpi@ for that.
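To make the fragmentation point concrete: the uvmexp output above shows roughly 10000 pages free, yet the thread sleeps in uvm_wait_pla() because uvm_pmr_getpages() cannot find a contiguous run of pages for the pool page. The toy userland program below only illustrates that arithmetic; it is not kernel code, and the page count and the "every other page in use" pattern are invented for the example.

/*
 * Toy illustration (not kernel code): a physically contiguous
 * allocation can fail even though plenty of individual pages are free.
 * Physical memory is modelled as a bitmap of 4k pages with every
 * second page free, and we then look for a run of 4 contiguous free
 * pages (a 16k "pool page").
 */
#include <stdio.h>
#include <stdbool.h>

#define NPAGES	1024		/* pretend physmem: 1024 x 4k = 4M */

static bool page_free[NPAGES];

/* Return the length of the longest run of contiguous free pages. */
static int
longest_free_run(void)
{
	int run = 0, best = 0;

	for (int i = 0; i < NPAGES; i++) {
		run = page_free[i] ? run + 1 : 0;
		if (run > best)
			best = run;
	}
	return best;
}

int
main(void)
{
	int nfree = 0;

	/* Fragment memory: every second page is free, the rest in use. */
	for (int i = 0; i < NPAGES; i++)
		page_free[i] = (i & 1);

	for (int i = 0; i < NPAGES; i++)
		nfree += page_free[i];

	printf("free pages: %d\n", nfree);
	printf("longest contiguous free run: %d pages\n", longest_free_run());
	printf("16k (4-page) contiguous allocation %s\n",
	    longest_free_run() >= 4 ? "would succeed" : "would fail");
	return 0;
}

Built and run, this reports 512 free pages but a longest free run of 1, so the 4-page contiguous request fails; that is essentially the situation uvm_pmr_getpages() is waiting on in the traceback above.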
