> Date: Wed, 24 Sep 2025 12:19:24 +0200
> From: Martin Pieuchot <[email protected]>
Since this seems to help and we'd like to fix the landisk issue before
release, let's go back to the mail with the diff...
> On 22/09/25(Mon) 09:24, Miod Vallat wrote:
> > > Interesting. Could you enter ddb and report a trace of the ssh client
> > > stuck on "pmrwait"? At the same time, could you get "ps", "show uvm",
> > > "show all pools" and "show bcstats"?
> >
> > I have rebooted the machine since, but I'll do it next time.
> >
> > > Do I understand correctly that the #ifdef __sh__ doesn't make a
> > > difference for your machine?
> >
> > Yes.
> >
> > > Indeed, that said, is the pdaemon currently running? `pdwoke' is quite
> > > high. So the question is: does the ssh client keep hitting the memory
> > > limit? Or is it stuck there?
> >
> > As I said, its cpu time is slowly increasing. But it did not appear to
> > make any significant progress after 8 hours...
>
> Diff below includes a fix for a missing wakeup. It might correspond to
> the bug you're seeing.
>
> Problem is that there are two sleep channels and two mechanisms for OOM
> situations. The IOdone daemon was waking only one of the two.
>
> Does it help?
>
> Index: uvm/uvm_pdaemon.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
> diff -u -p -r1.137 uvm_pdaemon.c
> --- uvm/uvm_pdaemon.c 2 Jun 2025 18:49:04 -0000 1.137
> +++ uvm/uvm_pdaemon.c 24 Sep 2025 10:11:11 -0000
> @@ -303,10 +303,10 @@ uvm_pageout(void *arg)
> * wake up any waiters.
> */
> uvm_lock_fpageq();
> - if (uvmexp.free > uvmexp.reserve_kernel || uvmexp.paging == 0) {
> - wakeup(&uvmexp.free);
> - }
> -
> + /*
> + * Wake waiter on pma first because they are more likely
> + * to starve on low pages.
> + */
> if (pma != NULL) {
> /*
> * XXX If UVM_PMA_FREED isn't set, no pages
> @@ -320,8 +320,11 @@ uvm_pageout(void *arg)
> wakeup(pma);
> }
> }
> + if (uvmexp.free > uvmexp.reserve_kernel || uvmexp.paging == 0)
> + wakeup(&uvmexp.free);
> uvm_unlock_fpageq();
>
> +
I'm not sure this makes sense. On MP systems both wakeups happen
while we're holding a mutex and our mutexes aren't fair. So I don't
think there is a strong relationship between the order of those
wakeups and which waiter gets to grab the mutex first. And on landisk
we don't have low memory...
So can we shelve this bit until after release (and maybe until we've
made our mutexes a bit more fair)?
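To make the argument concrete, here is a sketch (simplified, not the
literal code path):

	uvm_lock_fpageq();
	wakeup(pma);		/* pmr waiter becomes runnable */
	wakeup(&uvmexp.free);	/* legacy waiter becomes runnable */
	uvm_unlock_fpageq();

	/*
	 * Both waiters went to sleep in msleep_nsec(9) on the fpageq
	 * mutex, so both must reacquire it before making progress.
	 * Because our mutexes aren't fair, whichever thread wins that
	 * race runs first, regardless of the wakeup order above.
	 */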
I also think that comment needs a bit more explanation. And I'd say
that extra newline should be moved between the two wakeup blocks.
> /*
> * scan done. unlock page queues (the only lock we are holding)
> */
> @@ -377,8 +380,11 @@ uvm_aiodone_daemon(void *arg)
>
> uvm_lock_fpageq();
> atomic_sub_int(&uvmexp.paging, npages);
> - wakeup(uvmexp.free <= uvmexp.reserve_kernel ? &uvm.pagedaemon :
> - &uvmexp.free);
> + if (uvmexp.free <= uvmexp.reserve_kernel ||
> + !TAILQ_EMPTY(&uvm.pmr_control.allocs))
> + wakeup(&uvm.pagedaemon);
> + else
> + wakeup(&uvmexp.free);
> uvm_unlock_fpageq();
> }
> }
This bit makes sense to me. Does this hunk alone fix the issue? Can
we get it tested widely?
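For testers, here is the hang this hunk should fix, as I read it.
This is only a sketch from memory, so the pmemrange structure and
field names may be slightly off:

	/*
	 * A process in the pmemrange allocator queues its request and
	 * sleeps on it until the page daemon services the queue:
	 */
	TAILQ_INSERT_TAIL(&uvm.pmr_control.allocs, &pma, pmq);
	wakeup(&uvm.pagedaemon);
	while (pma.pm_flags & UVM_PMA_BUSY)
		msleep_nsec(&pma, &uvm.fpageqlock, PVM, "pmrwait", INFSLP);

	/*
	 * Before the diff, when paging I/O completed while uvmexp.free
	 * was already above reserve_kernel, the aiodone daemon woke
	 * only &uvmexp.free.  The page daemon stayed asleep, the
	 * allocs queue was never serviced, and the "pmrwait" sleeper
	 * hung, which matches the stuck ssh client.  The new
	 * TAILQ_EMPTY(&uvm.pmr_control.allocs) check closes that
	 * window.
	 */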