On Wed, Oct 20 2021, Jeremie Courreges-Anglas <j...@wxcvbn.org> wrote:
> On Fri, Oct 08 2021, Mark Kettenis <mark.kette...@xs4all.nl> wrote:
>>> From: Jeremie Courreges-Anglas <j...@wxcvbn.org>
>>> Date: Fri, 08 Oct 2021 18:19:47 +0200
>>>
>>> riscv64.ports was running dpb(1) with two other members in the build
>>> cluster. A few minutes ago I found it in ddb(4). The report is short,
>>> sadly, as the machine doesn't return from the 'bt' command.
>>>
>>> The machine is acting both as an NFS server and an NFS client.
>>>
>>> OpenBSD/riscv64 (riscv64.ports.openbsd.org) (console)
>>>
>>> login: panic: pool_anic:t: pol_ free l: p mod fiee liat m oxifief:c a2e
>>> 07ff0ff fte21ade0 00f ifem c0d
>>> 1 07f1f0ffcf2177 010=0 c16ce6 7x090xc52c !
>>> 0x9066d21 919 xc1521
>>> Stopped at panic+0xfe: addi a0,zero,256
>>> TID PID UID PRFLAGS PFLAGS CPU COMMAND
>>> 24243 43192 55 0x2 0 0 cc
>>> *480349 52543 0 0x11 0 1 perl
>>> 480803 72746 55 0x2 0 3 c++
>>> 366351 3003 55 0x2 0 2K c++
>>> panic() at panic+0xfa
>>> panic() at pool_do_get+0x29a
>>> pool_do_get() at pool_get+0x76
>>> pool_get() at pmap_enter+0x128
>>> pmap_enter() at uvm_fault_upper+0x1c2
>>> uvm_fault_upper() at uvm_fault+0xb2
>>> uvm_fault() at do_trap_user+0x120
>>> https://www.openbsd.org/ddb.html describes the minimum info required in bug
>>> reports. Insufficient info makes it difficult to find and fix bugs.
>>> ddb{1}> bt
>>> panic() at panic+0xfa
>>> panic() at pool_do_get+0x29a
>>> pool_do_get() at pool_get+0x76
>>> pool_get() at pmap_enter+0x128
>>> pmap_enter() at uvm_fault_upper+0x1c2
>>> uvm_fault_upper() at uvm_fault+0xb2
>>> uvm_fault() at do_trap_user+0x120
>>> do_trap_user() at cpu_exception_handler_user+0x7a
>>> <hangs>
>>>
>>> The conserver logs for this console provide a hint about when it
>>> happened:
>>>
>>> --8<--
>>> [-- MARK -- Fri Oct 8 08:00:00 2021]
>>> [-- MARK -- Fri Oct 8 09:00:00 2021]
>>> [-- MARK -- Fri Oct 8 10:00:00 2021]
>>> bt
>>> ^Mpanic() at panic+0xfa
>>> ^Mpanic() at pool_do_get+0x29a
>>> ...
>>> -->8--
>>>
>>> It seems that Theo was plugging/unplugging usb cables at that time.
>>> I asked Theo to reboot the machine as I couldn't get more useful output.
>>
>> Thanks for the heads up. Some sort of memory corruption, but no real
>> clues what caused it.
>
> Another one, maybe a similar cause, maybe not. :-/
>
> [...]
> OpenBSD 7.0-current (GENERIC.MP) #84: Mon Oct 18 01:23:24 MDT 2021
> dera...@riscv64.openbsd.org:/usr/src/sys/arch/riscv64/compile/GENERIC.MP
> [...]
> OpenBSD/riscv64 (riscv64.ports.openbsd.org) (console)
>
> login: t[0] == 0x0000000000000000
> t[1] == 0xffffffc00034feb2
> t[2] == 0xffffffc227cd9630
> t[3] == 0xffffffc0008cf1b0
> t[4] == 0x0000000000000022
> t[5] == 0x0000000000000000
> t[6] == 0x000000007c9bd777
> s[0] == 0xffffffc227cd9680
> s[1] == 0xffffffc2229d8d28
> s[2] == 0xffffffc000a5e9a8
> s[3] == 0xffffffc2229ad6a0
> s[4] == 0xffffffc0008ff1f8
> s[5] == 0x0000000000000000
> s[6] == 0xffffffc22a21a050
> s[7] == 0x0000000000000000
> s[8] == 0xffffffc000a20d30
> s[9] == 0xffffffc000a25718
> s[10] == 0xffffffc000a8f6a0
> s[11] == 0xffffffc000a25714
> a[0] == 0x95b8040228044314
> a[1] == 0x95b8040228044314
> a[2] == 0xffffffc2229d8d28
> a[3] == 0x0000000000000001
> a[4] == 0xffffffc023027800
> a[5] == 0x0000000000000000
> a[6] == 0x0000000000000003
> a[7] == 0xffffffc0008cf1a0
> sepc == 0xffffffc0002f043e
> sstatus == 0x0000000200000120
> stval == 0xffffff822804431c
> scause == 0x000000000000000d
> panic: Fatal page fault at 0xffffffc0002f043e: 0xffffff822804431c
> Stopped at panic+0xfe: addi a0,zero,256
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> 469890 40238 55 0x100002 0 2 sh
> *380568 80457 55 0x100002 0 3K touch
> 299235 73888 55 0x2 0 1 perl
> 15710 89701 56 0x100002 0 0 ftp
> panic() at panic+0xfa
> panic() at do_trap_supervisor+0x232
> dump_regs() at cpu_exception_handler_supervisor+0x78
> cpu_exception_handler_supervisor() at pool_put+0x30
> pool_put() at ffs_reclaim+0x5c
> ffs_reclaim() at VOP_RECLAIM+0x32
> VOP_RECLAIM() at vclean+0x122
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
> ddb{3}>
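
For anyone reading the register dump above, here is a rough Python sketch that decodes scause and compares stval with a[0]. The exception-code table follows the RISC-V privileged specification. The helper name decode_scause and the Sv39 mask are my own assumptions for illustration; nothing here comes from the kernel sources, and this is only arithmetic on the dumped values, not a diagnosis.

```python
# Exception codes for scause when the interrupt bit (bit 63) is clear,
# as listed in the RISC-V privileged specification.
EXCEPTIONS = {
    0: "Instruction address misaligned",
    1: "Instruction access fault",
    2: "Illegal instruction",
    3: "Breakpoint",
    4: "Load address misaligned",
    5: "Load access fault",
    6: "Store/AMO address misaligned",
    7: "Store/AMO access fault",
    8: "Environment call from U-mode",
    9: "Environment call from S-mode",
    12: "Instruction page fault",
    13: "Load page fault",
    15: "Store/AMO page fault",
}


def decode_scause(scause: int) -> str:
    """Hypothetical helper: turn a 64-bit scause value into a short label."""
    is_interrupt = (scause >> 63) & 1
    code = scause & ((1 << 63) - 1)
    if is_interrupt:
        return f"interrupt {code}"
    return EXCEPTIONS.get(code, f"reserved/unknown exception {code}")


# Values copied from the register dump above.
scause = 0x000000000000000D
stval = 0xFFFFFF822804431C
a0 = 0x95B8040228044314

# Assumption: Sv39 paging, so only the low 39 bits take part in translation.
SV39_MASK = (1 << 39) - 1

print(decode_scause(scause))
print(hex(stval & SV39_MASK), hex((a0 + 8) & SV39_MASK))
```

With these values scause decodes to a load page fault, and the low 39 bits of stval equal those of a[0] + 8. That would be consistent with a load through whatever a[0] held at the time, but the dump alone doesn't show where that value came from.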

The machine is waiting in ddb(4). In past panics I didn't get control
back after typing a ddb command, so better choose wisely. ;)

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE