Re: The 11.1-RC3 can only boot and attach disks in "Safe mode", otherwise gets stuck attaching
2017-07-18 01:24, Mark Johnston wrote:
> Are you able to break into the debugger at this point? Try setting
> debug.kdb.break_to_debugger=1 and debug.kdb.alt_break_to_debugger=1 at
> the loader prompt, and hit the break key, or the key sequence
> ~ ctrl-b once the hang occurs. At the debugger prompt, try "bt" and
> "show allpcpu" to start.

Thank you for a prompt and good suggestion!

I spent an afternoon fiddling with the machine, with mixed results.

Your suggestion to break into the debugger did not work: there was no reaction to the break key or to ~ ctrl-b. So I embarked on rebuilding the RC3 kernel with:

  options KDB
  options DDB
  options BREAK_TO_DEBUGGER
  options ALT_BREAK_TO_DEBUGGER
  options INVARIANTS
  options INVARIANT_SUPPORT
  options WITNESS
  options WITNESS_SKIPSPIN

but then I realized the key is mapped-to by: alt ctrl , which now does break into the debugger, but not as early as the point where the holdup occurs.

WITNESS produced some LOR warnings, but that is probably ok. I came across a trace just before the problem area, but it scrolls by so fast on a vt console that only the last 40 or so lines remain on the screen (I have a photo), and these do not seem to reveal much. Unfortunately this machine does not have a serial interface.

So in my last attempt I rebuilt a kernel with INVARIANTS but without WITNESS, and now I cannot reproduce the problem, with or without "safe mode".

What is interesting here is that the da0..da3 disks are now attached first, and only then the ada disks. Even within the group of disks on the same controller the order has been shuffled. I have no idea what could have caused this, and it may have avoided the problem by doing so.

Will play some more with this tomorrow...

  Mark

> On Tue, Jul 18, 2017 at 01:01:16AM +0200, Mark Martinec wrote:
>> Upgrading 11.0-RELEASE-p11 to 11.1-RC3 using the usual freebsd-update
>> upgrade method I ended up with a system which gets stuck while trying
>> to attach the second set of disks.
>>
>> This happened already after the first phase of the upgrade procedure
>> (installing and re-booting with a new kernel).
>>
>> The first set of disks (ada0 .. ada2) are attached successfully, also
>> a cd0, but then when the first of the set of four (a regular spinning
>> disk) on an LSI controller is to be attached, the boot procedure just
>> gets stuck there:
>>
>> kernel: ada1: 300.000MB/s transfers (SATA 2.x, PIO4, PIO 8192bytes)
>> kernel: ada1: Command Queueing enabled
>> kernel: ada1: 305245MB (625142448 512 byte sectors)
>> kernel: ada2 at ahcich6 bus 0 scbus8 target 0 lun 0
>> kernel: ada2: ATA8-ACS SATA 3.x device
>> kernel: ada2: Serial Number OCZ-O1L6RF591R09Z5C8
>> kernel: ada2: 300.000MB/s transfers (SATA 2.x, PIO4, PIO 8192bytes)
>> kernel: ada2: Command Queueing enabled
>> kernel: ada2: 114473MB (234441648 512 byte sectors)
>> kernel: ada2: quirks=0x1<4K>
>> kernel: da0 at mps0 bus 0 scbus0 target 2 lun 0
>>
>> (stuck here, keyboard not responding, fans raising their pitch,
>> presumably CPU is spinning)
> [...]

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
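Since the attach order of the disks seems to differ between the boot that hangs and the one that does not, it may help to reduce each boot log to just the order in which devices were attached and compare the two. Below is a minimal sh sketch of that idea. It assumes the boot messages are available as text in the format quoted in this message ("da0 at mps0 bus 0 ..."), with or without the "kernel: " prefix. The function name attach_order is made up for this example and is not a FreeBSD tool.

```sh
#!/bin/sh
# attach_order: hypothetical helper, not part of FreeBSD.
# Reads boot messages on stdin and prints "device controller" for every
# attach line of the form "<dev> at <controller> bus <n> ...", in the
# order the lines were logged. All other lines are dropped.
attach_order() {
    sed -n \
        -e 's/^.*kernel: //' \
        -e 's/^\([a-z][a-z]*[0-9][0-9]*\) at \([a-z][a-z]*[0-9][0-9]*\) bus [0-9].*$/\1 \2/p'
}
```

Saving the output for each boot to a file and running diff on the two files would then show whether the order really changed, and where.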
Re: Kernel Panic of 10.2-RELEASE
> On 18 Jul 2017, at 18:47, Daniel Genis wrote:
>
> Hello,

Hello,

> Take a look at this commit: https://github.com/freebsd/freebsd/commit/d99ba5c
> It might be the issue you're encountering.

Yes, it is. Here's the corresponding PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=207464

If I understand the comments correctly, we have the same issue in 10.3 as well. So the solution is to avoid destroying snapshots, or to upgrade to 11.0. Or to patch the kernel myself.

Thanks.
Re: Kernel Panic of 10.2-RELEASE
Hello,

Take a look at this commit: https://github.com/freebsd/freebsd/commit/d99ba5c
It might be the issue you're encountering.

With kind regards,

Daniel

On 18 July 2017 18:33:02 CEST, "Stéphane Dupille via freebsd-stable" wrote:
> [...]
Kernel Panic of 10.2-RELEASE
Hello,

My server is running 10.2-RELEASE (yes, I need to upgrade it, but it works like a charm). Today, I ran this command as root:

# zfs destroy -r zroot@attic

and the machine crashed:

Jul 18 18:09:40 penitencier syslogd: kernel boot file is /boot/kernel/kernel
Jul 18 18:09:40 penitencier kernel: vputx: negative ref count
Jul 18 18:09:40 penitencier kernel: 0xf8023037f000: tag zfs, type VDIR
Jul 18 18:09:40 penitencier kernel: usecount 0, writecount 0, refcount 0 mountedhere 0
Jul 18 18:09:40 penitencier kernel: flags (VI_FREE)
Jul 18 18:09:40 penitencier kernel: VI_LOCKedlock type zfs: EXCL by thread 0xf8014f242940 (pid 60698, zfs, tid 100747)
Jul 18 18:09:40 penitencier kernel: panic: vputx: negative ref cnt
Jul 18 18:09:40 penitencier kernel: cpuid = 1
Jul 18 18:09:40 penitencier kernel: KDB: stack backtrace:
Jul 18 18:09:40 penitencier kernel: #0 0x80984ef0 at kdb_backtrace+0x60
Jul 18 18:09:40 penitencier kernel: #1 0x80948aa6 at vpanic+0x126
Jul 18 18:09:40 penitencier kernel: #2 0x80948973 at panic+0x43
Jul 18 18:09:40 penitencier kernel: #3 0x809eb7d5 at vputx+0x2d5
Jul 18 18:09:40 penitencier kernel: #4 0x809e4f59 at dounmount+0x689
Jul 18 18:09:40 penitencier kernel: #5 0x81a5fdd4 at zfs_unmount_snap+0x114
Jul 18 18:09:40 penitencier kernel: #6 0x81a62fc1 at zfs_ioc_destroy_snaps+0xc1
Jul 18 18:09:40 penitencier kernel: #7 0x81a61ae0 at zfsdev_ioctl+0x5f0
Jul 18 18:09:40 penitencier kernel: #8 0x80830019 at devfs_ioctl_f+0x139
Jul 18 18:09:40 penitencier kernel: #9 0x8099cde5 at kern_ioctl+0x255
Jul 18 18:09:40 penitencier kernel: #10 0x8099cae0 at sys_ioctl+0x140
Jul 18 18:09:40 penitencier kernel: #11 0x80d4b3e7 at amd64_syscall+0x357
Jul 18 18:09:40 penitencier kernel: #12 0x80d30acb at Xfast_syscall+0xfb
Jul 18 18:09:40 penitencier kernel: Uptime: 5d6h0m11s

This is all I found in the logs. I have only remote access to this machine, so I have no idea what was printed on the console.

I use ZFS on top of geom_eli.

Here is a uname -v:

FreeBSD penitencier.dalton-brothers.org 10.2-RELEASE-p9 FreeBSD 10.2-RELEASE-p9 #0: Thu Jan 14 01:32:46 UTC 2016 r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

After rebooting, the machine works well, as far as I can see:

root@penitencier:/var/log # zpool status
  pool: zboot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Wed Nov 12 11:20:33 2014
config:

        NAME           STATE     READ WRITE CKSUM
        zboot          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            gpt/boot0  ONLINE       0     0     0
            gpt/boot1  ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: resilvered 6,56M in 0h0m with 0 errors on Tue Jul 18 18:13:23 2017
config:

        NAME           STATE     READ WRITE CKSUM
        zroot          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            da0p4.eli  ONLINE       0     0     0
            da1p4.eli  ONLINE       0     0     0

errors: No known data errors

(The pool was resilvered because I booted once with a wrong geli passphrase for one of the two drives, so it came up with only one disk.)

What should I do now? Launch a ZFS scrub? I'm a bit afraid of making it panic again. Should I assume that I just got unlucky once? (Please don't tell me to upgrade it: I'm currently trying to install a new server, and I will migrate to it very soon.)

Thanks.
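When matching a panic like this against a commit or PR, the useful part of the log is usually the list of function names in the KDB backtrace. Here is a minimal sh sketch that pulls those out of syslog output. It assumes the line format shown in this message ("kernel: #N 0xADDR at function+0xOFFSET", one frame per line). The function name kdb_frames is made up for this example and is not a FreeBSD tool.

```sh
#!/bin/sh
# kdb_frames: hypothetical helper, not part of FreeBSD.
# Reads syslog lines on stdin and prints "frame-number function+offset"
# for each KDB stack backtrace frame. Lines that are not frames are dropped.
# In a POSIX basic regular expression the "+" below is a literal plus sign.
kdb_frames() {
    sed -n 's/^.*kernel: #\([0-9][0-9]*\) 0x[0-9a-f][0-9a-f]* at \([A-Za-z_][A-Za-z0-9_]*+0x[0-9a-f][0-9a-f]*\).*$/\1 \2/p'
}
```

Run against the log excerpt above, frames 3 to 6 (vputx, dounmount, zfs_unmount_snap, zfs_ioc_destroy_snaps) are the ones worth quoting in a search or a bug report.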