On 12/09/25(Fri) 18:45, Alexander Bluhm wrote:
> On Thu, Sep 11, 2025 at 06:31:42PM +0200, Martin Pieuchot wrote:
> > On 08/09/25(Mon) 18:53, Martin Pieuchot wrote:
> > > On 29/08/25(Fri) 19:12, Alexander Bluhm wrote:
> > > > Hi,
> > > >
> > > > One of my i386 test machines crashed during make build. Kernel is
> > > > GENERIC.MP built from current sources.
> > > >
> > > > panic: uvm_fault(0xd59b2424, 0xcf800000, 0, 1) -> e
> > > > Stopped at db_enter+0x4: popl %ebp
> > > > TID PID UID PRFLAGS PFLAGS CPU COMMAND
> > > > *435945 40779 21 0x2 0 5 llvm-tblgen
> > > > 212835 13904 21 0x2 0 3 llvm-tblgen
> > > > 150086 44694 21 0x2 0 11 llvm-tblgen
> > > > 332575 10539 21 0x2 0 10 llvm-tblgen
> > > > 385473 77182 21 0x2 0 8 llvm-tblgen
> > > > 104320 19436 0 0x14000 0x200 1 aiodoned
> > > > db_enter() at db_enter+0x4
> > > > panic(d0cbc6b7) at panic+0x7a
> > > > kpageflttrap(f86b5c94,cf800000) at kpageflttrap+0x133
> > > > trap(f86b5c94) at trap+0x255
> > > > calltrap() at calltrap+0xc
> > > > pmap_remove_ptes_pae(d0f6fda0,0,cf800000,0,1000,0,f86b5d1c) at pmap_remove_ptes_pae+0x4f
> > > > pmap_do_remove_pae(d0f6fda0,0,1000,0) at pmap_do_remove_pae+0x120
> > > > pmap_remove(d0f6fda0,0,1000) at pmap_remove+0x18
> > > > uvm_pagermapout(0,1) at uvm_pagermapout+0x1a
> > >
> > > This is very wrong. That means `kva' is 0. The only way this can
> > > happen is if pmap_enter(9) failed in uvm_pagermapin().
> > >
> > > Using pmap_kenter_pa(9) would not only prevent this issue, it would also
> > > speed up memory recovery. Sadly we had to revert such a change because
> > > on Landisk it doesn't handle conflicting cache aliases the way
> > > pmap_enter(9) does.
> >
> > Here's a diff that fixes the bug and does not let landisk slow down other
> > architectures. This gives a noticeable boost for page faults and
> > aggressive swapping (like a stress test with torture).
> >
> > Note that NetBSD also calls pmap_kenter_pa(9) in this case. So maybe
> > there's a fix for landisk out there. Anyone care about landisk?
> >
> > Alexander would you please test this on your i386?
>
> I did make build, release and regress on the affected machines. No
> more crashes seen, but the crashes did not happen reliably before.

Alexander, could you try the diff below?

Index: uvm_pager.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_pager.c,v
diff -u -p -r1.94 uvm_pager.c
--- uvm_pager.c 10 Mar 2025 14:13:58 -0000 1.94
+++ uvm_pager.c 6 Oct 2025 08:13:07 -0000
@@ -263,13 +263,16 @@ uvm_pagermapin(struct vm_page **pps, int
 		pp = *pps++;
 		KASSERT(pp);
 		KASSERT(pp->pg_flags & PG_BUSY);
-		/* Allow pmap_enter to fail. */
-		if (pmap_enter(pmap_kernel(), cva, VM_PAGE_TO_PHYS(pp),
+		while (pmap_enter(pmap_kernel(), cva, VM_PAGE_TO_PHYS(pp),
 		    prot, PMAP_WIRED | PMAP_CANFAIL | prot) != 0) {
-			pmap_remove(pmap_kernel(), kva, cva);
-			pmap_update(pmap_kernel());
-			uvm_pseg_release(kva);
-			return 0;
+			if (flags & UVMPAGER_MAPIN_WAITOK)
+				uvm_wait("pgrmapin");
+			else {
+				pmap_remove(pmap_kernel(), kva, cva);
+				pmap_update(pmap_kernel());
+				uvm_pseg_release(kva);
+				return 0;
+			}
 		}
 	}
 	pmap_update(pmap_kernel());
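
For anyone who wants to reason about the control flow without booting a
kernel, here is a rough userland model of the loop in the diff above. It is
only a sketch: struct fake_pmap, fake_enter(), fake_wait() and mapin_model()
are hypothetical stand-ins invented for illustration, not kernel interfaces,
and the "failure" is just a counter rather than a real memory shortage.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel pmap; not a real kernel structure. */
struct fake_pmap {
	int failures_left;	/* upcoming fake_enter() calls that will fail */
	int entered;		/* mappings currently established */
	int waits;		/* number of times the caller slept */
};

/* Models a pmap_enter() called with PMAP_CANFAIL: may fail transiently. */
static int
fake_enter(struct fake_pmap *pm)
{
	if (pm->failures_left > 0) {
		pm->failures_left--;
		return -1;
	}
	pm->entered++;
	return 0;
}

/* Models uvm_wait(): here it only records that a sleep happened. */
static void
fake_wait(struct fake_pmap *pm)
{
	pm->waits++;
}

/*
 * Models the patched loop.  With waitok set, a failed enter is retried
 * after sleeping.  Without it, the partial mappings are torn down and
 * the function reports failure (the "return 0" path in the diff).
 */
static bool
mapin_model(struct fake_pmap *pm, int npages, bool waitok)
{
	assert(npages > 0);
	for (int i = 0; i < npages; i++) {
		while (fake_enter(pm) != 0) {
			if (waitok) {
				fake_wait(pm);
			} else {
				pm->entered = 0;	/* undo partial work */
				return false;
			}
		}
	}
	return true;
}
```

In this model the waitok case never reports failure, it only sleeps and
retries. The other case can still fail, so callers that do not pass
UVMPAGER_MAPIN_WAITOK still have to check for a 0 return before using the
address.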