On Thu, Sep 11, 2025 at 06:31:42PM +0200, Martin Pieuchot wrote:
> On 08/09/25(Mon) 18:53, Martin Pieuchot wrote:
> > On 29/08/25(Fri) 19:12, Alexander Bluhm wrote:
> > > Hi,
> > >
> > > One of my i386 test machines crashed during make build. Kernel is
> > > GENERIC.MP built from current sources.
> > >
> > > panic: uvm_fault(0xd59b2424, 0xcf800000, 0, 1) -> e
> > > Stopped at db_enter+0x4: popl %ebp
> > > TID PID UID PRFLAGS PFLAGS CPU COMMAND
> > > *435945 40779 21 0x2 0 5 llvm-tblgen
> > > 212835 13904 21 0x2 0 3 llvm-tblgen
> > > 150086 44694 21 0x2 0 11 llvm-tblgen
> > > 332575 10539 21 0x2 0 10 llvm-tblgen
> > > 385473 77182 21 0x2 0 8 llvm-tblgen
> > > 104320 19436 0 0x14000 0x200 1 aiodoned
> > > db_enter() at db_enter+0x4
> > > panic(d0cbc6b7) at panic+0x7a
> > > kpageflttrap(f86b5c94,cf800000) at kpageflttrap+0x133
> > > trap(f86b5c94) at trap+0x255
> > > calltrap() at calltrap+0xc
> > > pmap_remove_ptes_pae(d0f6fda0,0,cf800000,0,1000,0,f86b5d1c) at
> > > pmap_remove_ptes_pae+0x4f
> > > pmap_do_remove_pae(d0f6fda0,0,1000,0) at pmap_do_remove_pae+0x120
> > > pmap_remove(d0f6fda0,0,1000) at pmap_remove+0x18
> > > uvm_pagermapout(0,1) at uvm_pagermapout+0x1a
> >
> > This is very wrong. That means `kva' is 0. The only way this can
> > happen is if pmap_enter(9) failed in uvm_pagermapin().
> >
> > Using pmap_kenter_pa(9) would not only prevent this issue, it would also
> > speed up memory recovery. Sadly we had to revert such a change because,
> > on Landisk, pmap_kenter_pa(9) does not handle conflicting cache aliases
> > the way pmap_enter(9) does.
>
> Here's a diff that fixes the bug without letting the landisk limitation
> slow down other architectures. This gives a noticeable boost for page
> faults and aggressive swapping (like a stress test with torture).
>
> Note that NetBSD also calls pmap_kenter_pa(9) in this case. So maybe
> there's a fix for landisk out there. Anyone care about landisk?
>
> Alexander, would you please test this on your i386?

I did make build, release and regress on the affected machines. No
more crashes seen, but the crashes did not happen reliably before.

bluhm

> Index: uvm/uvm_pager.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_pager.c,v
> diff -u -p -r1.94 uvm_pager.c
> --- uvm/uvm_pager.c 10 Mar 2025 14:13:58 -0000 1.94
> +++ uvm/uvm_pager.c 11 Sep 2025 16:06:56 -0000
> @@ -263,14 +263,26 @@ uvm_pagermapin(struct vm_page **pps, int
> pp = *pps++;
> KASSERT(pp);
> KASSERT(pp->pg_flags & PG_BUSY);
> - /* Allow pmap_enter to fail. */
> - if (pmap_enter(pmap_kernel(), cva, VM_PAGE_TO_PHYS(pp),
> +#if defined(__sh__)
> + /*
> + * XXX Landisk cannot use pmap_kenter_pa(9). It has issues
> + * with virtual cache aliases when multiple mappings exist
> + * for a given managed page.
> + */
> + while (pmap_enter(pmap_kernel(), cva, VM_PAGE_TO_PHYS(pp),
> prot, PMAP_WIRED | PMAP_CANFAIL | prot) != 0) {
> - pmap_remove(pmap_kernel(), kva, cva);
> - pmap_update(pmap_kernel());
> - uvm_pseg_release(kva);
> - return 0;
> + if (flags & UVMPAGER_MAPIN_WAITOK)
> + uvm_wait("pgrmapin");
> + else {
> + pmap_remove(pmap_kernel(), kva, cva);
> + pmap_update(pmap_kernel());
> + uvm_pseg_release(kva);
> + return 0;
> + }
> }
> +#else
> + pmap_kenter_pa(cva, VM_PAGE_TO_PHYS(pp), prot);
> +#endif
> }
> pmap_update(pmap_kernel());
> return kva;
> @@ -295,10 +307,14 @@ uvm_pagermapout(vaddr_t kva, int npages)
> }
> #endif
>
> +#if defined(__sh__)
> + /* XXX see comment above. */
> pmap_remove(pmap_kernel(), kva, kva + ((vsize_t)npages << PAGE_SHIFT));
> +#else
> + pmap_kremove(kva, (vsize_t)npages << PAGE_SHIFT);
> +#endif
> pmap_update(pmap_kernel());
> uvm_pseg_release(kva);
> -
> }
>
> /*
>
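
For anyone skimming the diff, here is a minimal user-space sketch of the
control flow in the __sh__ branch of uvm_pagermapin(): retry the mapping
when the caller allows sleeping, otherwise unwind and return 0. It is not
kernel code. All names (model_enter, model_wait, model_mapin, MODEL_WAITOK,
fail_budget) are hypothetical stand-ins for pmap_enter(9) with
PMAP_CANFAIL, uvm_wait() and UVMPAGER_MAPIN_WAITOK, and the failure
counter only simulates a transient shortage.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE	4096u
#define MODEL_WAITOK	0x1	/* hypothetical analogue of UVMPAGER_MAPIN_WAITOK */

static int fail_budget;		/* how many model_enter() calls will still fail */
static int wait_calls;		/* how often the model "slept" waiting for memory */
static int unwind_calls;	/* how often the model gave up and unwound */

/* Hypothetical stand-in for a pmap_enter(9) call that is allowed to fail. */
static int
model_enter(uintptr_t va)
{
	(void)va;
	if (fail_budget > 0) {
		fail_budget--;
		return -1;
	}
	return 0;
}

/* Hypothetical stand-in for uvm_wait(): only records that we waited. */
static void
model_wait(void)
{
	wait_calls++;
}

/*
 * Map npages starting at kva.  Returns kva on success, or 0 if a mapping
 * failed and the caller did not allow waiting.
 */
static uintptr_t
model_mapin(uintptr_t kva, int npages, int flags)
{
	uintptr_t cva = kva;
	int i;

	assert(npages > 0);
	for (i = 0; i < npages; i++, cva += MODEL_PAGE_SIZE) {
		while (model_enter(cva) != 0) {
			if (flags & MODEL_WAITOK) {
				model_wait();	/* retry the same page */
			} else {
				/* the real code removes [kva, cva) here */
				unwind_calls++;
				return 0;
			}
		}
	}
	return kva;
}
```

In this model a caller passing MODEL_WAITOK never sees 0 come back, while
a caller without it still has to check the return value before handing it
to the unmap path.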