Hello Miod,
Thanks for the useful report.
On 31/12/25(Wed) 06:42, Miod Vallat wrote:
> I can still reproduce landisk stalling during a cvs update, after having
> baked a full build and xenocara. This time with the kernel-side
> traceback of the process stuck in pmrwait.
Note that many threads are also in biowait...
> ddb> ps
>    PID     TID   PPID   UID  S       FLAGS  WAIT      COMMAND
>  22735  236842  22266  1500  3    0x100003  pmrwait   ssh
>  22266   24911   8155  1500  3    0x100003  biowait   cvs
>  25498  116159  90374  1500  3    0x100083  ttyin     ksh
>   8155  467717  90374  1500  3    0x10008b  sigsusp   ksh
>  90374  138597  98462  1500  3        0x10  biowait   sshd-session
>  98462  191897  15001     0  3        0x92  kqread    sshd-session
>  57416  152946      1     0  3    0x100003  biowait   getty
>  42265  403899      1     0  3    0x100098  kqread    cron
>  99827  455058      1     0  3    0x100090  kqread    inetd
>  70128  215727   4854    95  3   0x1100092  kqread    smtpd
>   8281  217519   4854   103  3   0x1100092  kqread    smtpd
>  76435   92448   4854    95  3   0x1100092  kqread    smtpd
>   1915  373935   4854    95  3    0x100092  kqread    smtpd
>  91271  501146   4854    95  3   0x1100092  kqread    smtpd
>  36815  115371   4854    95  3   0x1100092  kqread    smtpd
>   4854  296662      1     0  3    0x100080  kqread    smtpd
>  15001  426794      1     0  3        0x88  kqread    sshd
>   5586  268874      0     0  3     0x14280  nfsidl    nfsio
>  86082   74082      0     0  3     0x14280  nfsidl    nfsio
>   4201  354622      0     0  3     0x14280  nfsidl    nfsio
>   9170    1439      0     0  3     0x14280  nfsidl    nfsio
>   7090  276569      1     0  3           0  biowait   ypbind
>  47046  449110      1    28  3   0x1100010  biowait   portmap
>  95989   36830      1     0  3    0x100080  kqread    ntpd
>  53651  165033  76253    83  3    0x100092  kqread    ntpd
>  76253  352661      1    83  3   0x1100092  kqread    ntpd
>  39789  280571   4078    74  3   0x1100092  bpf       pflogd
>   4078   55888      1     0  3        0x80  sbwait    pflogd
>  84079  106662  16372    73  3   0x1100090  kqread    syslogd
>  16372   35927      1     0  3    0x100082  sbwait    syslogd
>  78962  437988  31797   115  3    0x100092  kqread    slaacd
>  85919  478332  31797   115  3    0x100092  kqread    slaacd
>  31797  480049      1     0  3    0x100080  kqread    slaacd
>  27846  437548      0     0  3     0x14200  bored     smr
>  41958  344917      0     0  3     0x14200  pgzero    zerothread
>   6076  129835      0     0  3     0x14200  aiodoned  aiodoned
>  84866  363355      0     0  3     0x14200  syncer    update
>  87134  418449      0     0  3     0x14200  cleaner   cleaner
>  53321  503280      0     0  3     0x14200  reaper    reaper
>  88581   91939      0     0  3     0x14200  pgdaemon  pagedaemon
>  45312  521165      0     0  3     0x14200  usbtsk    usbtask
>  37631  261918      0     0  3     0x14200  usbatsk   usbatsk
>  18147  310057      0     0  3     0x14200  bored     softnet0
>  37766  422281      0     0  3     0x14200  bored     systqmp
>  83868   57728      0     0  3     0x14200  bored     systq
>  62182   91648      0     0  3  0x40014200  tmoslp    softclock
> *  973  455824      0     0  7  0x40014200            idle0
>      1  152093      0     0  3        0x82  wait      init
>      0       0     -1     0  3     0x10200  scheduler swapper
> ddb> show uvmexp
> Current UVM status:
> pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
> 14864 VM pages: 3 active, 1 inactive, 1 wired, 10118 free (1280 zero)
> freemin=495, free-target=660, inactive-target=661, wired-max=4954
> faults=145296324, traps=74992724, intrs=26647938, ctxswitch=14048753 fpuswitch=0
> softint=12776668, syscalls=74992722, kmapent=8
> fault counts:
> noram=996715, noanon=0, noamap=0, pgwait=23949, pgrele=0
> relocks=1867838(14104), upgrades=0(0) anget(retries)=57776611(585752), amapcopy=23259327
> neighbor anon/obj pg=39185700/89679236, gets(lock/unlock)=26907893/1286410
> cases: anon=47576129, anoncow=10190414, obj=23737048, prcopy=3166521, przero=60563995
> daemon and swap counts:
> woke=1566238, revs=284861851, scans=2683067, obscans=506886, anscans=1811998
> busy=0, freed=1421072, reactivate=363889, deactivate=8877746
> pageouts=179395, pending=144550, nswget=539823
> nswapdev=1
> swpages=4194415, swpginuse=4625, swpgonly=4621 paging=3
                                                 ^^^^^^^^
The page daemon is waiting for the previous pageouts to finish. Can we
assume they are stuck somewhere? Or can you reproduce the hang and see
if `uvmexp.paging` is 0?
Diff below should allow the page daemon to make progress even if previous
pageouts did not finish. Please let me know if it helps.
However, many threads are in "biowait", the ssh process in "pmrwait"
also seems to be blocked by the page daemon waiting for I/O, and the
bug is triggered by "cvs update". So I suspect the actual issue might
be somewhere else.
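
To make the intent of the diff below concrete, here is a small userland
model of the two conditions it changes. This is only a sketch: the pd_*()
function names are made up for illustration and do not exist in the
kernel, and it assumes the ssh allocation sleeping in "pmrwait" is queued
on uvm.pmr_control.allocs.

```c
#include <stdbool.h>

/*
 * Hypothetical userland model of the conditions touched by the diff.
 * These helpers are illustrative only, they are not kernel functions.
 */

/* Before: the page daemon sleeps if nobody is waiting for memory OR
 * if previously started pageouts are still in flight. */
static bool
pd_sleeps_old(bool allocs_empty, int paging)
{
	return allocs_empty || paging > 0;
}

/* After: in-flight pageouts no longer keep the page daemon asleep. */
static bool
pd_sleeps_new(bool allocs_empty, int paging)
{
	(void)paging;	/* intentionally ignored */
	return allocs_empty;
}

/* Before: waiters on uvmexp.free are woken if enough pages are free
 * OR if no pageout is pending. */
static bool
pd_wakes_old(int nfree, int reserve_kernel, int paging)
{
	return nfree > reserve_kernel || paging == 0;
}

/* After: waiters are only woken once enough pages are free. */
static bool
pd_wakes_new(int nfree, int reserve_kernel, int paging)
{
	(void)paging;	/* intentionally ignored */
	return nfree > reserve_kernel;
}
```

With the state from the report (a waiter queued, paging=3),
pd_sleeps_old(false, 3) is true while pd_sleeps_new(false, 3) is false,
which is the case where the page daemon would now keep running instead
of going back to sleep.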
> kernel pointers:
> objs(kern)=0x8c3cccfc
> ddb> show bcstats
> Current Buffer Cache status:
> numbufs 183 busymapped 1, delwri 0
> kvaslots 185 avail kva slots 184
> bufpages 678, dmapages 678, dirtypages 0
> pendingreads 1, pendingwrites 1
> highflips 0, highflops 0, dmaflips 0
> ddb> tr /t 0t236842
> mi_switch() at mi_switch+0x8a
> sleep_finish() at sleep_finish+0xb8
> msleep_nsec() at msleep_nsec+0xdc
> uvm_wait_pla() at uvm_wait_pla+0x90
> uvm_pmr_getpages() at uvm_pmr_getpages+0x87a
> km_alloc() at km_alloc+0x24c
> pool_multi_alloc() at pool_multi_alloc+0x7a
> m_pool_alloc() at m_pool_alloc+0x36
> pool_allocator_alloc() at pool_allocator_alloc+0x18
> pool_p_alloc() at pool_p_alloc+0x3e
> pool_do_get() at pool_do_get+0x174
> pool_get() at pool_get+0xba
> m_clget() at m_clget+0x38
> m_getuio() at m_getuio+0xa6
> sosend() at sosend+0x218
> soo_write() at soo_write+0x28
> dofilewritev() at dofilewritev+0x7e
> sys_write() at sys_write+0x40
> syscall() at syscall+0x2ca
> (EXPEVT 160; SSR=00000001) at 0x39dd9a30
Index: uvm/uvm_pdaemon.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
diff -u -p -r1.144 uvm_pdaemon.c
--- uvm/uvm_pdaemon.c 24 Dec 2025 10:29:22 -0000 1.144
+++ uvm/uvm_pdaemon.c 31 Dec 2025 09:52:08 -0000
@@ -232,7 +232,7 @@ uvm_pageout(void *arg)
long size;
uvm_lock_fpageq();
- if (TAILQ_EMPTY(&uvm.pmr_control.allocs) || uvmexp.paging > 0) {
+ if (TAILQ_EMPTY(&uvm.pmr_control.allocs)) {
msleep_nsec(&uvm.pagedaemon, &uvm.fpageqlock, PVM,
"pgdaemon", INFSLP);
uvmexp.pdwoke++;
@@ -303,7 +303,7 @@ uvm_pageout(void *arg)
* wake up any waiters.
*/
uvm_lock_fpageq();
- if (uvmexp.free > uvmexp.reserve_kernel || uvmexp.paging == 0) {
+ if (uvmexp.free > uvmexp.reserve_kernel) {
wakeup(&uvmexp.free);
}