Reverting external/cddl/osnet/dist/uts/common/fs/zfs/vdev_disk.c to rev. 1.16 resolved the panic. I don't know whether the change to src/external/cddl/osnet/dist/uts/common/fs/zfs/zio.c is also involved: this build still had zio.c reverted to rev. 1.6.
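
For anyone who wants to repeat the test, the revert can be sketched roughly as below. This is only a sketch under assumptions that are not stated in the thread: that the tree is a CVS checkout under /usr/src (SRCDIR is a hypothetical variable), and that pinning the files with 'cvs update -r' is an acceptable way to back the changes out. The run and revert helpers and the DRYRUN switch are made up for illustration; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: pin the two files discussed in this thread to older revisions.
# Assumptions (not from the thread): CVS checkout at $SRCDIR; DRYRUN=1
# (the default) prints the commands instead of running them.
SRCDIR=${SRCDIR:-/usr/src}
DRYRUN=${DRYRUN:-1}
ZFSDIR=external/cddl/osnet/dist/uts/common/fs/zfs

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRYRUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = file name under $ZFSDIR, $2 = revision to pin (sticky) via cvs update -r
revert() {
    run cvs -q update -r "$2" "$ZFSDIR/$1"
}

# A missing source directory is tolerated only in dry-run mode.
cd "$SRCDIR" 2>/dev/null || [ "$DRYRUN" = 1 ] || exit 1

revert vdev_disk.c 1.16
revert zio.c 1.6
```

Run it with DRYRUN=0 to actually execute the commands, then rebuild and reinstall the zfs module the way you normally do. Note that 'cvs update -r' leaves a sticky tag on the file; 'cvs update -A' on the same file clears it once a fix is committed.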

On Wed, 24 Jun 2020 at 13:20, Chavdar Ivanov <[email protected]> wrote:
>
> On Wed, 24 Jun 2020 at 14:12, Jaromír Doleček <[email protected]> wrote:
> >
> > OK nvm, mlelsv@ claims it's the unrelated change to vdev_disk.c - so
> > perhaps try with that backed off, i.e. rev. 1.16 of:
> >
> > external/cddl/osnet/dist/uts/common/fs/zfs/vdev_disk.c
> >
> > On Wed, 24 Jun 2020 at 14:55, Jaromír Doleček <[email protected]> wrote:
> > >
> > > Can you please check if it still panics the same way if you revert
> > > sources to rev. 1.6 file:
> > > src/external/cddl/osnet/dist/uts/common/fs/zfs/zio.c
>
> Backing the change in this one didn't sort the problem, identical
> panic on 'zpool import'
>
> I'll try to back vdev_disk.c now.
>
> > >
> > > and rebuild the zfs module?
> > >
> > > Jaromir
> > >
> > >
> > > On Wed, 24 Jun 2020 at 14:39, Jaromír Doleček <[email protected]> wrote:
> > > >
> > > > What is 'the test' ? Just modload zfs?
> > > >
> > > > Jaromir
> > > >
> > > > On Wed, 24 Jun 2020 at 14:05, Chavdar Ivanov <[email protected]> wrote:
> > > > >
> > > > > Yes, I do. Should I be looking for something specific?
> > > > >
> > > > > I've uploaded it here, if it is of interest -
> > > > > https://send.firefox.com/download/74761aa43c6c54b3/#PbaGxtDN81Hzk2VUjefozw
> > > > > .
> > > > >
> > > > > BTW I repeated the test on a pvh guest of XCP-NG, it panics the same way.
> > > > >
> > > > > Chavdar
> > > > >
> > > > > On Wed, 24 Jun 2020 at 12:07, Jaromír Doleček <[email protected]> wrote:
> > > > > >
> > > > > > By chance, do you have the kernel crash dump from the original panic
> > > > > > which happened yesterday? The subsequent ones might be a result of the
> > > > > > first one.
> > > > > >
> > > > > > The messages about redzone don't mean anything beyond that there is no
> > > > > > overflow protection for items on the pool.
> > > > > >
> > > > > > Jaromir
> > > > > >
> > > > > > On Wed, 24 Jun 2020 at 11:34, Chavdar Ivanov <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > On
> > > > > > >
> > > > > > > NetBSD ymir 9.99.68 NetBSD 9.99.68 (GENERIC) #1: Tue Jun 23 22:53:46 BST 2020
> > > > > > > sysbuild@ymir:/home/sysbuild/amd64/obj/home/sysbuild/src/sys/arch/amd64/compile/GENERIC
> > > > > > > amd64
> > > > > > >
> > > > > > > I suddenly got a panic with ZFS; it took place with the previous
> > > > > > > kernel, so it was something with the module. In single user I disabled
> > > > > > > zfs in /etc/rc.conf and was able to complete boot, but obviously
> > > > > > > without my two pools.
> > > > > > >
> > > > > > > 'modload solaris' didn't show any problem.
> > > > > > >
> > > > > > > I set aside the contents of /etc/zfs and did 'modload zfs', which
> > > > > > > resulted in:
> > > > > > >
> > > > > > > .....
> > > > > > >
> > > > > > > WARNING: ZFS on NetBSD is under development
> > > > > > > pool redzone disabled for 'zio_buf_4096'
> > > > > > > pool redzone disabled for 'zio_data_buf_4096'
> > > > > > > pool redzone disabled for 'zio_buf_8192'
> > > > > > > pool redzone disabled for 'zio_data_buf_8192'
> > > > > > > pool redzone disabled for 'zio_buf_16384'
> > > > > > > pool redzone disabled for 'zio_data_buf_16384'
> > > > > > > pool redzone disabled for 'zio_buf_32768'
> > > > > > > pool redzone disabled for 'zio_data_buf_32768'
> > > > > > > pool redzone disabled for 'zio_buf_65536'
> > > > > > > pool redzone disabled for 'zio_data_buf_65536'
> > > > > > > pool redzone disabled for 'zio_buf_131072'
> > > > > > > pool redzone disabled for 'zio_data_buf_131072'
> > > > > > > pool redzone disabled for 'zio_buf_262144'
> > > > > > > pool redzone disabled for 'zio_data_buf_262144'
> > > > > > > pool redzone disabled for 'zio_buf_524288'
> > > > > > > pool redzone disabled for 'zio_data_buf_524288'
> > > > > > > pool redzone disabled for 'zio_buf_1048576'
> > > > > > > pool redzone disabled for 'zio_data_buf_1048576'
> > > > > > > pool redzone disabled for 'zio_buf_2097152'
> > > > > > > pool redzone disabled for 'zio_data_buf_2097152'
> > > > > > > pool redzone disabled for 'zio_buf_4194304'
> > > > > > > pool redzone disabled for 'zio_data_buf_4194304'
> > > > > > > pool redzone disabled for 'zio_buf_8388608'
> > > > > > > pool redzone disabled for 'zio_data_buf_8388608'
> > > > > > > pool redzone disabled for 'zio_buf_16777216'
> > > > > > > pool redzone disabled for 'zio_data_buf_16777216'
> > > > > > >
> > > > > > > I have no idea what that means, it is a first for me, ZFS otherwise
> > > > > > > has been very reliable on this hardware so far, inasmuch as I have the
> > > > > > > mercurial repo on a zfs and build from it from time to time (the panic
> > > > > > > is from the last cvs update from yesterday, though).
> > > > > > >
> > > > > > > Subsequent 'zpool import' repeated the panic (without getting me into
> > > > > > > the debugger, though):
> > > > > > >
> > > > > > >
> > > > > > > ZFS filesystem version: 5
> > > > > > > uvm_fault(0xffffa97e4c3e1610, 0x0, 1) -> e
> > > > > > > fatal page fault in supervisor mode
> > > > > > > trap type 6 code 0 rip 0xffffffff81d49882 cs 0x8 rflags 0x10286 cr2 0xa0 ilevel 0 rsp 0xffffde819c16d760
> > > > > > > curlwp 0xffffa97e3a41e140 pid 17394.17394 lowest kstack 0xffffde819c16a2c0
> > > > > > > panic: trap
> > > > > > > cpu0: Begin traceback...
> > > > > > > vpanic() at netbsd:vpanic+0x152
> > > > > > > snprintf() at netbsd:snprintf
> > > > > > > startlwp() at netbsd:startlwp
> > > > > > > alltraps() at netbsd:alltraps+0xc3
> > > > > > > vdev_open() at zfs:vdev_open+0x9e
> > > > > > > vdev_open_children() at zfs:vdev_open_children+0x39
> > > > > > > vdev_root_open() at zfs:vdev_root_open+0x33
> > > > > > > vdev_open() at zfs:vdev_open+0x9e
> > > > > > > spa_load() at zfs:spa_load+0x38e
> > > > > > > spa_tryimport() at zfs:spa_tryimport+0x86
> > > > > > > zfs_ioc_pool_tryimport() at zfs:zfs_ioc_pool_tryimport+0x41
> > > > > > > zfsdev_ioctl() at zfs:zfsdev_ioctl+0x8c1
> > > > > > > nb_zfsdev_ioctl() at zfs:nb_zfsdev_ioctl+0x38
> > > > > > > VOP_IOCTL() at netbsd:VOP_IOCTL+0x44
> > > > > > > vn_ioctl() at netbsd:vn_ioctl+0xa5
> > > > > > > sys_ioctl() at netbsd:sys_ioctl+0x550
> > > > > > > syscall() at netbsd:syscall+0x26e
> > > > > > > --- syscall (number 54) ---
> > > > > > > netbsd:syscall+0x26e:
> > > > > > > cpu0: End traceback...
> > > > > > >
> > > > > > > The above panic did not leave a crash dump.
> > > > > > >
> > > > > > > When I had /etc/zfs populated before, I also got a crash dump (with
> > > > > > > 'reboot 0x104'), as follows:
> > > > > > >
> > > > > > > # crash -M netbsd.18.core -N netbsd.18
> > > > > > > Crash version 9.99.68, image version 9.99.68.
> > > > > > > crash: _kvm_kvatop(0)
> > > > > > > Kernel compiled without options LOCKDEBUG.
> > > > > > > System panicked: reboot forced via kernel debugger
> > > > > > > Backtrace from time of crash is available.
> > > > > > > crash> bt
> > > > > > > _KERNEL_OPT_NARCNET() at 0
> > > > > > > _KERNEL_OPT_NARCNET() at 0
> > > > > > > sys_reboot() at sys_reboot
> > > > > > > db_fncall() at db_fncall
> > > > > > > db_command() at db_command+0x127
> > > > > > > db_command_loop() at db_command_loop+0xa6
> > > > > > > db_trap() at db_trap+0xe6
> > > > > > > kdb_trap() at kdb_trap+0xe1
> > > > > > > trap() at trap+0x2b7
> > > > > > > --- trap (number 6) ---
> > > > > > > vdev_disk_open.part.4() at vdev_disk_open.part.4+0x49a
> > > > > > > vdev_open() at vdev_open+0x9e
> > > > > > > vdev_open_children() at vdev_open_children+0x39
> > > > > > > vdev_root_open() at vdev_root_open+0x33
> > > > > > > vdev_open() at vdev_open+0x9e
> > > > > > > spa_load() at spa_load+0x38e
> > > > > > > spa_load_best() at spa_load_best+0x58
> > > > > > > spa_open_common() at spa_open_common+0xc2
> > > > > > > pool_status_check.part.25() at pool_status_check.part.25+0x1e
> > > > > > > zfsdev_ioctl() at zfsdev_ioctl+0x80e
> > > > > > > nb_zfsdev_ioctl() at nb_zfsdev_ioctl+0x38
> > > > > > > VOP_IOCTL() at VOP_IOCTL+0x44
> > > > > > > vn_ioctl() at vn_ioctl+0xa5
> > > > > > > sys_ioctl() at sys_ioctl+0x550
> > > > > > > syscall() at syscall+0x26e
> > > > > > > --- syscall (number 54) ---
> > > > > > > syscall+0x26e:
> > > > > > > .....
> > > > > > >
> > > > > > > Any idea what is going on? I've restarted a build, but the cvs log
> > > > > > > doesn't show anything relevant as far as I can see.
> > > > > > >
> > > > > > >
> > > > > > > Chavdar
