> Date: Wed, 26 Mar 2025 01:49:18 -0400
> From: George Koehler <[email protected]>
>
> On Fri, 21 Mar 2025 20:05:08 -0700
> Eric Grosse <[email protected]> wrote:
>
> > panic: kernel diagnostic assertion "UVM_PSEG_INUSE(pseg, id)" failed: file "/usr/src/sys/uvm/uvm_pager.c", line 207
> > ...
> > panic+0x134
> > __assert+0x30
> > uvm_pseg_release+0x380
> > uvn_io+0x2d4
>
> Thanks for the report. I'm not sure, but I suspect that the kernel's
> pmap is corrupt. I have a diff that might prevent this corruption,
> but it wastes about 500 megabytes of RAM. This diff might not help
> you; your powerpc64 "stays up for weeks at a time", and I hope to
> have a better diff sooner than that.
>
> In your assertion failure, uvn_io probably called uvm_pagermapout,
> which made a tail call to uvm_pseg_release.
>
> The kernel's pmap might be corrupt because the macro PMAP_VP_LOCK
> never locks the kernel's v->p table. Two CPUs could both see
> vp1 == NULL or vp2 == NULL in pmap_vp_enter, and race to allocate
> and install a struct for the same slot. (The pmap_vp_pool guards
> itself with a mutex, but nothing guards the check for vp1 == NULL
> or vp2 == NULL.) One struct and its mapping would get lost. Later,
> pmap_remove would not unmap the lost mapping, and pmap_enter would
> then enter another mapping at the same virtual address. The two
> mappings from the same virtual address to different physical
> addresses would corrupt memory.
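If I'm reading pmap_vp_enter right, the unlocked window looks roughly
like this (a boiled-down sketch, not the actual code; the real walk
goes through the slb_desc to vp1 to vp2, and the VP_IDX* macros and
pool flags here are from memory):

	vp2 = vp1->vp[VP_IDX1(va)];
	if (vp2 == NULL) {
		/*
		 * pmap_vp_pool's mutex only serializes the pool itself,
		 * not this NULL check, so without PMAP_VP_LOCK two CPUs
		 * can both take this branch for the same slot...
		 */
		vp2 = pool_get(&pmap_vp_pool, PR_NOWAIT | PR_ZERO);
		/* ...and the later store silently drops the earlier vp2. */
		vp1->vp[VP_IDX1(va)] = vp2;
	}
	/* any pted already installed in the dropped vp2 is lost */
	vp2->vp[VP_IDX2(va)] = pted;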
>
> My diff allocates the kernel's v->p table at boot time, by adapting
> some code from macppc's pmap. This might prevent the race, but the
> allocation takes more than 500 megabytes of RAM. (macppc, with its
> tiny 32-bit address space, has a much smaller allocation.) I might
> throw away this diff and try something else.
Yes, that strategy doesn't really work on 64-bit architectures.
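To put a number on it: the sz computation in the diff below scales
with the size of the kernel VA window rather than with the memory
actually mapped. Here is a standalone back-of-the-envelope using the
same formula; every constant in it is a guess on my part (the real
values live in arch/powerpc64/powerpc64/pmap.c and vmparam.h), so
treat it as an illustration only:

#include <stdio.h>

int
main(void)
{
	/* all assumed values, not the real defines */
	size_t vp_idx1_cnt = 256;	/* entries per struct pmapvp1 */
	size_t vp_idx2_cnt = 256;	/* entries per struct pmapvp2 */
	size_t pted_size = 56;		/* sizeof(struct pte_desc) */
	size_t ptr_size = sizeof(void *);
	size_t nsegments = 128;		/* kernel VA window / SEGMENT_SIZE */

	/* same shape as the sz2/sz1/sz computation in the diff */
	size_t sz2 = vp_idx2_cnt * ptr_size + vp_idx2_cnt * pted_size;
	size_t sz1 = vp_idx1_cnt * ptr_size + vp_idx1_cnt * sz2;
	size_t sz = nsegments * sz1;

	printf("per segment: %zu KB, total: %zu MB\n",
	    sz1 / 1024, sz / (1024 * 1024));
	return 0;
}

With those guesses it comes to roughly 4 MB per 256 MB segment and
about half a gigabyte total, which is in the same ballpark as the
"more than 500 megabytes" quoted above.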
> Index: arch/powerpc64/powerpc64/pmap.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/powerpc64/powerpc64/pmap.c,v
> diff -u -p -r1.64 pmap.c
> --- arch/powerpc64/powerpc64/pmap.c 28 Nov 2024 18:54:36 -0000 1.64
> +++ arch/powerpc64/powerpc64/pmap.c 25 Mar 2025 22:17:30 -0000
> @@ -885,8 +885,10 @@ pmap_remove_pted(pmap_t pm, struct pte_d
> if (PTED_MANAGED(pted))
> pmap_remove_pv(pted);
>
> - pmap_vp_remove(pm, pted->pted_va);
> - pool_put(&pmap_pted_pool, pted);
> + if (pm != pmap_kernel()) {
> + pmap_vp_remove(pm, pted->pted_va);
> + pool_put(&pmap_pted_pool, pted);
> + }
> }
>
> extern struct fdt_reg memreg[];
> @@ -1066,7 +1068,9 @@ pmap_enter(pmap_t pm, vaddr_t va, paddr_
> pted = pmap_vp_lookup(pm, va);
> if (pted && PTED_VALID(pted)) {
> pmap_remove_pted(pm, pted);
> - pted = NULL;
> + /* we lost our pted if it was user */
> + if (pm != pmap_kernel())
> + pted = NULL;
> }
>
> pm->pm_stats.resident_count++;
> @@ -1530,24 +1534,27 @@ pmap_proc_iflush(struct process *pr, vad
> }
> }
>
> -void
> +struct slb_desc *
> pmap_set_kernel_slb(vaddr_t va)
> {
> + struct slb_desc *slbd;
> uint64_t esid;
> int idx;
>
> esid = va >> ADDR_ESID_SHIFT;
>
> for (idx = 0; idx < nitems(kernel_slb_desc); idx++) {
> - if (kernel_slb_desc[idx].slbd_vsid == 0)
> + slbd = &kernel_slb_desc[idx];
> + if (slbd->slbd_vsid == 0)
> break;
> - if (kernel_slb_desc[idx].slbd_esid == esid)
> - return;
> + if (slbd->slbd_esid == esid)
> + return slbd;
> }
> KASSERT(idx < nitems(kernel_slb_desc));
>
> - kernel_slb_desc[idx].slbd_esid = esid;
> - kernel_slb_desc[idx].slbd_vsid = pmap_kernel_vsid(esid);
> + slbd->slbd_esid = esid;
> + slbd->slbd_vsid = pmap_kernel_vsid(esid);
> + return slbd;
> }
>
> /*
> @@ -1629,7 +1636,7 @@ pmap_bootstrap_cpu(void)
> void
> pmap_bootstrap(void)
> {
> - paddr_t start, end, pa;
> + paddr_t kvp, kvp_end, start, end, pa;
> vm_prot_t prot;
> vaddr_t va;
>
> @@ -1650,12 +1657,34 @@ pmap_bootstrap(void)
> memset(pmap_ptable, 0, HTABMEMSZ);
> pmap_ptab_mask = pmap_ptab_cnt - 1;
>
> + /* allocate v->p mappings for pmap_kernel() */
> + {
> + size_t sz, sz1, sz2;
> +
> + sz2 = sizeof(struct pmapvp2) +
> + VP_IDX2_CNT * sizeof(struct pte_desc);
> + sz1 = sizeof(struct pmapvp1) + VP_IDX1_CNT * sz2;
> + sz = 0;
> + for (va = VM_MIN_KERNEL_ADDRESS; va < VM_MAX_KERNEL_ADDRESS;
> + va += SEGMENT_SIZE)
> + sz += sz1;
> + kvp = (paddr_t)pmap_steal_avail(sz, PAGE_SIZE);
> + kvp_end = kvp + sz;
> + memset((void *)kvp, 0, sz);
> + }
> +
> /* Map page tables. */
> start = (paddr_t)pmap_ptable;
> end = start + HTABMEMSZ;
> for (pa = start; pa < end; pa += PAGE_SIZE)
> pmap_kenter_pa(pa, pa, PROT_READ | PROT_WRITE);
>
> + /* Map kernel v->p table. */
> + start = kvp;
> + end = kvp_end;
> + for (pa = start; pa < end; pa += PAGE_SIZE)
> + pmap_kenter_pa(pa, pa, PROT_READ | PROT_WRITE);
> +
> /* Map kernel. */
> start = (paddr_t)_start;
> end = (paddr_t)_end;
> @@ -1690,10 +1719,32 @@ pmap_bootstrap(void)
> va += SEGMENT_SIZE)
> pmap_set_kernel_slb(va);
>
> - /* SLB entries for kernel VA. */
> - for (va = VM_MIN_KERNEL_ADDRESS; va < VM_MAX_KERNEL_ADDRESS;
> + /* SLB entries for the kernel v->p table. */
> + for (va = (vaddr_t)kvp; va < (vaddr_t)kvp_end;
> va += SEGMENT_SIZE)
> pmap_set_kernel_slb(va);
> +
> + /* SLB entries for kernel VA. */
> + for (va = VM_MIN_KERNEL_ADDRESS; va < VM_MAX_KERNEL_ADDRESS;
> + va += SEGMENT_SIZE) {
> + struct slb_desc *slbd;
> + struct pmapvp1 *vp1;
> + struct pmapvp2 *vp2;
> + struct pte_desc *pted;
> + int i, k;
> +
> + slbd = pmap_set_kernel_slb(va);
> + vp1 = slbd->slbd_vp = (struct pmapvp1 *)kvp;
> + kvp += sizeof(*vp1);
> + for (i = 0; i < VP_IDX1_CNT; i++) {
> + vp2 = vp1->vp[i] = (struct pmapvp2 *)kvp;
> + kvp += sizeof(*vp2);
> + for (k = 0; k < VP_IDX2_CNT; k++) {
> + pted = vp2->vp[k] = (struct pte_desc *)kvp;
> + kvp += sizeof(*pted);
> + }
> + }
> + }
>
> pmap_bootstrap_cpu();
>
>
>