On Sat, May 25, 2024 at 12:35:16AM -0400, George Koehler wrote:
> On Tue, 21 May 2024 03:08:49 +0200
> Jeremie Courreges-Anglas <[email protected]> wrote:
>
> > On Tue, May 21, 2024 at 02:51:39AM +0200, Jeremie Courreges-Anglas wrote:
> > > This doesn't look powerpc64-specific. It feels like
> > > uvm_km_kmemalloc_pla() should call pmap_enter() with PMAP_CANFAIL and
> > > unwind in case of a resource shortage.
> >
> > The diff below behaves when I inject fake pmap_enter() failures on
> > amd64. It would be nice to test it on -stable and/or -current,
> > depending on whether it happens on -stable only or also on -current.
>
> I believe that we have a powerpc64-specific problem, by which
> pmap_enter of kernel memory fails on powerpc64 when it succeeds on
> other platforms.
>
> powerpc64-1.ports.openbsd.org is a 16-core POWER9 where I run dpb(1)
> to build packages. In December 2022, it got this panic,
>
> ddb{13}> show panic
> cpu0: pmemrange allocation error: allocated 0 pages in 0 segments, but
> request was 1 pages in 1 segments
> cpu12: kernel diagnostic assertion "*start_ptr == uvm_map_entrybyaddr(atree,
> addr)" failed: file "/usr/src/sys/uvm/uvm_map.c", line 594
> *cpu13: pmap_enter: failed to allocate pted
>
> A panic on one cpu can cause extra panics on other cpus, because some
> events happen out of order:
> - The first cpu sends an IPI to each other cpu to go into ddb,
> before it disables the locks.
> - Some other cpu sees the locks being disabled, before it receives
> the IPI to go into ddb. The cpu skips acquiring some lock and
> trips on corrupt memory, perhaps by failing an assertion, or by
> dereferencing a poisoned pointer (powerpc64 trap type 300).
ack, thanks for making this clearer.
> I type "show panic" and try to find the original panic and ignore the
> extra panics.
>
> The same 16-core POWER9, in May 2023, got this panic,
>
> ddb{11}> show panic
> *cpu11: pmap_enter: failed to allocate pted
> ddb{11}> trace
> panic+0x134
> pmap_enter+0x20c
> uvm_km_kmemalloc_pla+0x1f8
> uvm_uarea_alloc+0x70
> fork1+0x23c
> syscall+0x380
> trap+0x5dc
> trapagain+0x4
> --- syscall (number 2) ---
> End of kernel: 0xbffff434aa7bac60 lr 0xd165eb228594
> ddb{11}> show struct uvm_km_pages uvm_km_pages
> struct uvm_km_pages at 0x1c171b8 (65592 bytes) {mtx = {mtx_owner =
> (volatile void *)0x0, mtx_wantipl = 0x7, mtx_oldipl = 0x0},
> lowat = 0x200, hiwat = 0x2000, free = 0x0, page = 13835058060646207488,
> freelist = (struct uvm_km_free_page *)0x0, freelistlen = 0x0,
> km_proc = (struct proc *)0xc00000011426eb00}
>
> My habit was "show struct uvm_km_pages uvm_km_pages", because these
> panics always have uvm_km_pages.free == 0, which causes
> pool_get(&pmap_pted_pool, _) to fail and return NULL, which causes
> pmap_enter to panic "failed to allocate pted".
>
> It would not fail if uvm_km_thread can run and add more free pages to
> uvm_km_pages. I would want uvm_km_kmemalloc_pla to sleep (so
> uvm_km_thread can run), but maybe I can't sleep during uvm_uarea_alloc
> in the middle of a fork.
IIUC uvm_uarea_alloc() calls uvm_km_kmemalloc_pla() without
UVM_KMF_NOWAIT/UVM_KMF_TRYLOCK, so it should be fine with another
potential sleeping point. But pmap_enter() doesn't take a flag that
would allow it to sleep.
> (We have uvm_km_pages only if the platform
> has no direct map: powerpc64 has uvm_km_pages, amd64 doesn't.)
>
> In platforms other than powerpc64, pmap_enter(pmap_kernel(), _) does
> not allocate. For example, macppc's powerpc/pmap.c allocates every
> kernel pted at boot.
Maybe this is a better approach. No idea whether it was a deliberate
choice, though.
> My 4-core POWER9 at home never reproduced this panic, perhaps because
> 4 cores are too few to take free pages out of uvm_km_pages faster than
> uvm_km_thread can add them. The 16-core POWER9 has not reproduced
> "failed to allocate pted" in recent months.
>
> --gkoehler
>
--
jca