On Thu, Dec 12, 2024 at 11:30 AM Martin Pieuchot <[email protected]> wrote:
...
> That sounds like a memory corruption of some sort. It might be that
> recent changes hide it. I'd be glad if you could test George's change.
...
> Thanks. If you run into such crash again, please try to get a trace
> from the cpu that panic'd. In this case cpu0.
I'm finally back home and power-cycled the machine to bring it back up.
After compiling a new kernel with source from cvs (including George's
change), I got another crash fairly quickly:
OpenBSD/powerpc64 (t.n2vi.net) (console)
login: pdaanri c0d:axr 8k e0drxsni4es7rl d0dsixai4gs0nr0o0 0s0t0xi004c0
s0ts0re0ar0tp0i o
n t r a "pa n o n d -a t> yap ner _ 3l 0o0 c0xk s8 r
drs 1i 9 s 0 0 r0 0
0 =0 x=4 0 0 0 t0 y000p 0e 9 0 033 02 0 00 sr 0 0rat 1
9
0 0 0 0 t 0 r 0 a 0 0 0 p 0 t 0 9 0 3 2y a p t e
3 0 0a d d 3 9 8 l r b 8 d
fbN08Ue8cLc
0Ls r
r1|Stopped at _rb_remove+0x36c: ld r4,8(r3)
TID PID UID PRFLAGS PFLAGS CPU COMMAND
*453813 7858 8889 0x2000002 0 3 compile
36553 81731 8889 0x2000002 0 2 compile
99640 86457 8889 0x2000002 0 7 compile
242134 64462 8889 0x2000002 0 0 compile
68242 72226 8889 0x2000002 0x4000000 4 go
495275 4590 8889 0x2000002 0x4000000 1 compile
250285 66963 0 0x14000 0x200 6 reaper
241654 98481 0 0x14000 0x200 5 pagedaemon
_rb_remove+0x36c
uvm_pmr_get1page+0x130
uvm_pmr_getpages+0x474
uvm_pglistalloc+0x11c
km_alloc+0x364
pool_page_alloc+0x64
pool_p_alloc+0x94
pool_do_get+0x298
pool_get+0xcc
--db_more-- q
amap_alloc1+0x120
amap_alloc+0x4c
amap_copy+0x3fc
uvm_fault_check+0x2cc
uvm_fault+0x118
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb{3}> show panic
*cpu1: kernel diagnostic assertion "anon->an_lock == NULL ||
rw_write_held(anon->an_lock)" failed: file "/sys/uvm/uvm_anon.c", line
85
ddb{3}> mach ddbcpu 1
Stopped at cpu_intr+0x50: ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
opal_call+0x50
opal_cnputc+0x8c
cnputc+0x64
db_putchar+0x3b0
kputchar+0x1fc
kprintf+0xd18
db_printf+0x78
panic+0xb8
__assert+0x30
ddb{1}> show registers
r0 0x75c444 xive_hvi+0x1bc
r1 0xc0000001601da158
r2 0x1054000 .TOC.
r3 0x1
r4 0
r5 0x80000000
r6 0
r7 0x31c60060
r8 0
r9 0x31c60060
r10 0x31c60060
r11 0x75c444 xive_hvi+0x1bc
r12 0xae8524 cpu_intr
r13 0x4af56a908
r14 0
r15 0x3b
r16 0x30
r17 0
r18 0
r19 0xc0000001601da8d0
r20 0
r21 0xffffffffffffff81
r22 0x1
r23 0xc00000003e4c8700
r24 0xc00000013acd9000
r25 0xc00000003e3e1080
r26 0xc00000003e3e1060
r27 0
r28 0x10b2c70 cpu_info+0xf08
r29 0xc00000003e3e1000
r30 0x1
r31 0x9000000000009032
lr 0xae8574 cpu_intr+0x50
cr 0x40009032
xer 0x20040000
ctr 0xae8524 cpu_intr
iar 0xae8574 cpu_intr+0x50
msr 0x9000000000029032
dar 0xc001e65f80
dsisr 0x42000000
cpu_intr+0x50: ori r0,r0,0x0
ddb{1}> show proc
PROC (compile) tid=495275 pid=4590 tcnt=8 stat=onproc
flags process=2000002 proc=4000000
runpri=82, usrpri=82, slppri=32, nice=20
wchan=0x0, wmesg=, ps_single=0x0 scnt=0 ecnt=0
forw=0xffffffffffffffff, list=0xc00000003db20020,0xc00000003db55c68
process=0xc00000014c479048 user=0xc0000001601d6000,
vmspace=0xc000000144b82178
estcpu=32, cpticks=1, pctcpu=0.0, user=1, sys=0, intr=0
ddb{1}> show all locks
No such command
ddb{1}> show witness
No such command
ddb{1}> show locks
No such command
ddb{1}> show ?
Bad character
all
bcstats
breaks
buf
extents
malloc
map
mbuf
mount
nfsreq
nfsnode
object
page
panic
pool
proc
registers
route
socket
struct
swap
tdb
uvmexp
vnode
watches
ddb{1}> show all ?
Bad character
procs
callout
clockintr
pools
mounts
vnodes
bufs
routes
nfsreqs
nfsnodes
tdbs
ddb{1}> mach ddbcpu 3
Stopped at _rb_remove+0x36c: ld r4,8(r3)
_rb_remove+0x36c
uvm_pmr_get1page+0x130
uvm_pmr_getpages+0x474
uvm_pglistalloc+0x11c
km_alloc+0x364
pool_page_alloc+0x64
pool_p_alloc+0x94
pool_do_get+0x298
pool_get+0xcc
amap_alloc1+0x120
amap_alloc+0x4c
amap_copy+0x3fc
uvm_fault_check+0x2cc
uvm_fault+0x118
ddb{3}> show all locks
No such command
ddb{3}> show bcstats
Current Buffer Cache status:
numbufs 129161 busymapped 0, delwri 1845
kvaslots 52428 avail kva slots 52428
bufpages 931802, dmapages 931802, dirtypages 14752
pendingreads 41, pendingwrites 21
highflips 0, highflops 0, dmaflips 0
ddb{3}> show page
PAGE 0xf70882:
flags=72696340, vers=1949199922, wire_count=1986604654, pa=0x7379732f61726368
uobject=0x3620555443203230, uanon=0x2030323a31383a35,
offset=0x32340a2020202065
[page ownership tracking disabled] vm_page_md 0xf708ea
ddb{3}> show uvmexp
Current UVM status:
pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
7890068 VM pages: 220193 active, 623328 inactive, 1 wired, 6023076
free (726057 zero)
freemin=263002, free-target=350669, inactive-target=350670, wired-max=2630022
faults=646752174, traps=823614522, intrs=225630053,
ctxswitch=257158930 fpuswitch=0
softint=84868373, syscalls=493843057, kmapent=31
fault counts:
noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
ok relocks(total)=3874988(3937104), anget(retries)=202347794(0),
amapcopy=44517770
neighbor anon/obj pg=14281629/248150553, gets(lock/unlock)=87584800/3937305
cases: anon=191542501, anoncow=10805293, obj=75240848,
prcopy=12281635, przero=356881834
daemon and swap counts:
woke=40, revs=0, scans=0, obscans=0, anscans=0
busy=0, freed=0, reactivate=0, deactivate=0
pageouts=0, pending=0, nswget=0
nswapdev=1
swpages=8454143, swpginuse=0, swpgonly=0 paging=0
kernel pointers:
objs(kern)=0x106cd70
ddb{3}> show registers
r0 0xb8df08 uvm_pmr_get1page+0x134
r1 0xc00000013e7841d0
r2 0x1054000 .TOC.
r3 0
r4 0xc000000014684e98
r5 0xc000000014684ea0
r6 0xc0000000148c5608
r7 0
r8 0
r9 0x9000000000001032
r10 0x1032900000000000
r11 0xb8df08 uvm_pmr_get1page+0x134
r12 0
r13 0x4af56bb08
r14 0
r15 0xffffffffffffffff
r16 0
r17 0xffffffffffffffff
r18 0
r19 0xc00000013e7845c0
r20 0
r21 0xc000000014683c00
r22 0xfd4ce0 uvm_pmr_addr_RBT_INFO
r23 0xc000000014683c10
r24 0x1
r25 0
r26 0xc000000014683a10
r27 0xc000000014684490
r28 0xc000000014683c10
r29 0xc0000000000a1000
r30 0xfd4ce0 uvm_pmr_addr_RBT_INFO
r31 0
lr 0xb8df08 uvm_pmr_get1page+0x134
cr 0x22222032
xer 0x20040000
ctr 0xb9bb0c generic_space_write_1
iar 0xadd398 _rb_remove+0x36c
msr 0x9000000000009032
dar 0x8
dsisr 0x40000000
_rb_remove+0x36c: ld r4,8(r3)
(I tried a few commands I'd seen in past emails that seemed relevant,
but I gather they are now obsolete.
I'll leave the machine at this point in ddb for a while in case there
is something else I should print.)