On Fri, 29 Dec 2023 at 16:29, BERTRAND Joël <joel.bertr...@systella.fr> wrote:
>
> David Brownlee wrote:
> > On Wed, 27 Dec 2023 at 08:25, BERTRAND Joël <joel.bertr...@systella.fr> wrote:
> >>
> >> Hello,
> >>
> >> Yesterday, I have changed my system disk (raid0). Thus, system has
> >> rebuilt a 1 To raid1 volume and system has crashed three or four times.
> >>
> >> First time :
> >>
> >> [ 5235.028358] uvm_fault(0xffffffff8190fbc0, 0xfffff6fc5a75b000, 2) -> e
> >> [ 5235.028358] fatal page fault in supervisor mode
> >> [ 5235.028358] trap type 6 code 0x2 rip 0xffffffff80ea1063 cs 0x8
> >> rflags 0x10246 cr2 0xfffff6fc5a75bb98 ilevel 0 rsp 0xffffac04372f5e98
> >> [ 5235.028358] curlwp 0xfffff8b686f8a180 pid 0.17 lowest kstack
> >> 0xffffac04372f12c0
> >> [ 5235.028358] panic: trap
> >> [ 5235.028358] cpu2: Begin traceback...
> >> [ 5235.028358] vpanic() at netbsd:vpanic+0x183
> >> [ 5235.028358] panic() at netbsd:panic+0x3c
> >> [ 5235.028358] trap() at netbsd:trap+0xbaf
> >> [ 5235.028358] --- trap (number 6) ---
> >> [ 5235.028358] _atomic_swap_64() at netbsd:_atomic_swap_64+0x3
> >> [ 5235.028358] uvm_km_pgremove_intrsafe() at
> >> netbsd:uvm_km_pgremove_intrsafe+0x6d
> >> [ 5235.028358] uvm_km_kmem_free() at netbsd:uvm_km_kmem_free+0x3b
> >> [ 5235.028358] gc_thread() at netbsd:gc_thread+0x7c
> >> [ 5235.028358] cpu2: End traceback...
> >>
> >> [ 5235.038351] dumping to dev 18,1 (offset=251919, size=4162814):
> >> [ 5235.038351] dump
> >>
> >> Of course, no crash dump was written. I have tried to remove swap
> >> in a
> >> first time, and system randomly enters in a lock and reboots (maybe with
> >> help of watchdog). After rebuild was completed, system seems to be stable.
> >>
> >> Sorry, I'm unable to obtain more information.
> >
> > Could I ask what controller this was using, and what vintage NetBSD-10
> > (BETA or RC_1)? There was an issue with the mfi controller which
> > showed up on earlier netbsd-10 BETA versions
>
> You can ;-)
>
> It's a RC_1:

OK, so pretty damn recent :-p

> legendre# uname -a
> NetBSD legendre.systella.fr 10.0_RC1 NetBSD 10.0_RC1 (CUSTOM) #10: Thu
> Dec 21 09:51:28 CET 2023
> r...@legendre.systella.fr:/usr/src/netbsd-10/obj/sys/arch/amd64/compile/CUSTOM
> amd64
> and tree was updated juste before building system.
>
> I have to add that after raid is successfully rebuilt, system is
> stable. I don't remember last time I have rebuilt this volume.

So, depending on when it was originally installed (netbsd-8?), could
whatever issue this is also have been present in netbsd-9?

> I only use raidframe on regular sata interface. Motherboard is an Asus
> Z97Q (if I remember):
>
> legendre# lspci | grep SATA
> 00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA
> Controller [AHCI Mode]
> 08:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9120 SATA
> 6Gb/s Controller (rev 12)
>
> Raid0 and raid1 are connected to 00:1f.2 (wd2 to wd6).

OK, so definitely not mfi related...

I don't have any directly useful suggestions. Depending on hardware
availability and your tolerance for downtime, I'd be tempted to try to
reproduce this on different hardware - cloning the data to another box, or
temporarily moving the disks across, and then triggering a raid rebuild
to rule out any general problem with the base hardware.

Could the power supply be marginal - could rebuilding all the disks plus
some other activity be pushing it a little too hard? Running a few
copies of sysutils/cpuburn might flush out some issues if that were the
case.

David