AMNESIA:33 and FreeBSD TCP/IP stack involvement
Hello,

I have a question about the recently discovered serious vulnerabilities in certain TCP/IP stack implementations, designated AMNESIA:33 (as far as I could follow the recent announcements and statements; see, for instance, https://www.zdnet.com/article/amnesia33-vulnerabilities-impact-millions-of-smart-and-industrial-devices/).

None of the open-source TCP stacks mentioned seems to be related in any way to FreeBSD or any derivative of the FreeBSD project, but I do not dare to make a statement about that.

My question is very simple and aims at calming down my employees' requests: is FreeBSD potentially vulnerable to this newly discovered flaw? (We mainly use 12.1-RELENG, 12.2-RELENG, 12-STABLE and 13-CURRENT; the latest incarnations, of course, should be least vulnerable ...)

Thanks in advance,
O. Hartmann
Re: KLD zfs.ko: depends on kernel - not available or version mismatch
On Tue, Dec 08, 2020 at 07:10:26PM +0100, Alban Hertroys wrote:
> > You didn't say that you've installed the new kernel, which at least starts
> > you down the road towards a driver/kernel mismatch. You presumably have a
> > non-ZFS boot+root.
>
> I’m fairly sure I did, actually.
>
> Last time I checked, "make buildworld buildkernel" was equivalent to "make
> buildworld && make buildkernel", while “make kernel” is a shorthand for
> “make buildkernel && make installkernel”
>
> So, unless I’m mistaken, “make buildworld kernel” should be equivalent
> to your first two lines.
>
> Nevertheless, I retried without these assumptions, the result was the same. I
> forgot to “make delete-old” though, I rarely remember to do that…

Ah, the dangers of command syntax being close to human syntax. You're trying to do the right thing, so maybe we can sanity check that.

> I had to copy over several files from /etc and /usr/local/etc and
> re-installed the most important packages. This was admittedly a bit messy, it
> is possible that I forgot to copy something over.
> (Originally my intention was to dd the contents of the spinning disk over,
> but apparently that disk has a few wonky sectors, dd failed after a few
> device timeouts)

... so, no guarantee that things are totally sane. The "sane" we're looking for is how you can presumably be booting a kernel located at /boot/kernel/kernel and not have it match the kernel modules found under /boot/kernel. The fact that it is happy with the old kernel modules (presumably found in /boot/kernel.old) may be a red herring if they're just compatible enough.
I can see what I'm expecting to boot here:

  # grep -E 'boot\/kernel|f7b0aedd1c50' /var/log/messages | tail -2
  Dec 6 08:59:04 ouroboros syslogd: kernel boot file is /boot/kernel/kernel
  Dec 6 08:59:04 ouroboros kernel: FreeBSD 13.0-CURRENT #237 r368388+f7b0aedd1c50-c273383(master): Sun Dec 6 08:27:47 PST 2020

So, I build my system with WITHOUT_REPRODUCIBLE_BUILD=YES in /etc/src.conf, so I can easily see my build version with uname -v:

  FreeBSD 13.0-CURRENT #237 r368388+f7b0aedd1c50-c273383(master): ...

That matches my source tree:

  # git log -n1 /usr/src | grep revision
  svn path=/head/; revision=368388

(I've always used git for my sources, but I'm sure there is a svn equivalent.)

The version I'm running is what and where I'd expect it to be:

  # strings -a < /boot/kernel/kernel | grep 'FreeBSD 13' | tail -1
  FreeBSD 13.0-CURRENT #237 r368388+f7b0aedd1c50-c273383(master): Sun Dec 6 08:27:47 PST 2020

It certainly isn't the previous kernel:

  # strings -a < /boot/kernel.old/kernel | grep 'FreeBSD 13' | tail -1
  FreeBSD 13.0-CURRENT #236 r368353+0252bfaea893-c273359(master): Fri Dec 4 16:55:41 PST 2020

Not sure what that'll look like with reproducible builds. The hash-check below is a decent stamp, in case the timestamps in /boot/kernel are misleading.

What I have built in my source tree is the kernel/zfs module I'd expect:

  # md5 -r /usr/obj/usr/src/amd64.amd64/sys/GENERIC/kernel /usr/obj/usr/src/amd64.amd64/sys/GENERIC/modules/usr/src/sys/modules/zfs/zfs.ko /boot/kernel/kernel /boot/kernel/zfs.ko | sort
  941ab52d075e444da6eea7fb56213e10 /boot/kernel/kernel
  941ab52d075e444da6eea7fb56213e10 /usr/obj/usr/src/amd64.amd64/sys/GENERIC/kernel
  97d4e0c8ffed1f75e924bf8768a95ff1 /boot/kernel/zfs.ko
  97d4e0c8ffed1f75e924bf8768a95ff1 /usr/obj/usr/src/amd64.amd64/sys/GENERIC/modules/usr/src/sys/modules/zfs/zfs.ko

What are you seeing after your installkernel equivalent? Your hashes won't match mine due to non-reproducible build.

I'd make sure you don't have anything in /boot/modules or otherwise load any extra modules until sanity is restored (just to reduce random variables).
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
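The revision comparison described in this message (kernel version string against the source tree revision) could be scripted roughly as follows. This is only a sketch: the function names kernel_rev, tree_rev and revs_match are made up for illustration, and it assumes strings in the two formats quoted in the message, which only carry the revision when the build is not reproducible (WITHOUT_REPRODUCIBLE_BUILD set).

```sh
#!/bin/sh
# Hypothetical helpers; they assume the string formats quoted in this thread.

# kernel_rev STRING: print the svn revision embedded in a version string such as
# "FreeBSD 13.0-CURRENT #237 r368388+f7b0aedd1c50-c273383(master): ..."
kernel_rev() {
    printf '%s\n' "$1" | sed -n 's/.* r\([0-9][0-9]*\).*/\1/p'
}

# tree_rev STRING: print the revision from a line such as
# "svn path=/head/; revision=368388" (taken from the git log output).
tree_rev() {
    printf '%s\n' "$1" | sed -n 's/.*revision=\([0-9][0-9]*\).*/\1/p'
}

# revs_match VERSION_STRING TREE_STRING: succeed only if both strings carry
# a revision and the two revisions are equal.
revs_match() {
    k=$(kernel_rev "$1")
    t=$(tree_rev "$2")
    [ -n "$k" ] && [ "$k" = "$t" ]
}
```

On a live system one might call it as revs_match "$(uname -v)" "$(git log -n1 /usr/src | grep revision)"; a failure means the running kernel was not built from the checked-out tree, or that the version string carries no revision at all.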
Re: KLD zfs.ko: depends on kernel - not available or version mismatch
On Dec 8, 2020, at 10:10 AM, Alban Hertroys wrote:
>
> So I tried again to move to HEAD:
>
> cd /usr/src
> svn up
> make buildworld -j12
> make buildkernel -j12
> make installkernel
> shutdown -r now
>
> mount -u /
> zpool import -Nf system (my /usr FS)
>
> KLD zfs.ko: depends on kernel - not available or version mismatch
> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>

  cd /usr/obj/usr/src/amd64.amd64/sys/GENERIC
  ls -l kernel /boot/kernel/kernel
  ls -l modules/usr/src/sys/modules/zfs/zfs.ko /boot/kernel/zfs.ko
  cmp modules/usr/src/sys/modules/zfs/zfs.ko /boot/kernel/zfs.ko

The kernel should be *older* than zfs.ko and both instances of zfs.ko should be identical.

One other thought is to manually do

  kldload opensolaris.ko zfs.ko

in single user mode before using zpool. I have

  opensolaris_load="YES"
  zfs_load="YES"

in /boot/loader.conf but you may not want to add those until you know zfs works.
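The two checks suggested in this message (the kernel is older than zfs.ko, and the built and installed zfs.ko are identical) could be wrapped in a small script. The following is a sketch only; same_file, is_newer and check_module are illustrative names, and the paths in the usage note are the ones quoted in the message.

```sh
#!/bin/sh
# Hypothetical helpers for the ls -l / cmp checks suggested above.

# same_file A B: succeed only if both are regular files with identical contents.
same_file() {
    [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}

# is_newer A B: succeed only if A was modified strictly after B.
is_newer() {
    [ -f "$1" ] && [ -f "$2" ] && [ -n "$(find "$1" -newer "$2")" ]
}

# check_module BUILT_KERNEL BUILT_MODULE INSTALLED_MODULE
# Warn if the built module is not newer than the built kernel, or if the
# installed module differs from the built one. Returns non-zero on any warning.
check_module() {
    rc=0
    if ! is_newer "$2" "$1"; then
        echo "WARN: $2 is not newer than $1"
        rc=1
    fi
    if ! same_file "$2" "$3"; then
        echo "WARN: $3 differs from $2"
        rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        echo "OK: $3"
    fi
    return "$rc"
}
```

From /usr/obj/usr/src/amd64.amd64/sys/GENERIC this would be called as check_module kernel modules/usr/src/sys/modules/zfs/zfs.ko /boot/kernel/zfs.ko. The script only reports; it does not try to load the module.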
Re: KLD zfs.ko: depends on kernel - not available or version mismatch
> On 8 Dec 2020, at 16:40, John Kennedy wrote:
>
> On Tue, Dec 08, 2020 at 08:56:25AM +0100, Alban Hertroys wrote:
>> This seems to have gotten lost in the moderation queue, but after a week I am
>> no closer to a solution, so here’s a resend:
>>
>> I’ve been trying to get a fresh world running (for the eventual purpose of
>> running amdgpu against my recent graphics adapter), but I run into trouble
>> with core loadable kernel modules, such as zfs.ko from the subject. It also
>> happens with other modules that I tried randomly, for example,
>> geom_mirror.ko.
>>
>> I updated to the latest current using svn up in /usr/src, then:
>> make clean
>> make buildworld kernel -j12
>> shutdown -r now
>>
>> boot to single user mode
>>
>> kldload zfs
>
> I'm not sure you've provided enough information for a one-shot armchair
> diagnosis, but some things seem factually wrong. For example, my normal
> rebuild procedure is:
>
> cd /usr/src && make buildworld && make buildkernel
> make installkernel
> shutdown -r now
>
> cd /usr/src && mergemaster -pi
> make installworld
> mergemaster -Fi
> make -DBATCH_DELETE_OLD_FILES delete-old

Aha! So that’s how to prevent having to press ‘y’ for every deprecated file!

> shutdown -r now
>
> cd /usr/src && make -DBATCH_DELETE_OLD_FILES delete-old-libs
>
> (I'm on a desktop system here. You haven't described your setup.)

This is also a desktop system.

> You didn't say that you've installed the new kernel, which at least starts
> you down the road towards a driver/kernel mismatch. You presumably have a
> non-ZFS boot+root.

I’m fairly sure I did, actually.

Last time I checked, "make buildworld buildkernel" was equivalent to "make buildworld && make buildkernel", while “make kernel” is a shorthand for “make buildkernel && make installkernel”

So, unless I’m mistaken, “make buildworld kernel” should be equivalent to your first two lines.

Nevertheless, I retried without these assumptions, the result was the same. I forgot to “make delete-old” though, I rarely remember to do that…

> Did you mess around with the ZFS from ports (ZoL -> ZoF)
> at some point so you're not using the kernel's ZFS drivers? What ZFS
> entries do you have in /etc/loader.conf, /etc/rc.conf, and some of the
> variants that may get dragged in? (see rc.conf(5) for possibilities)

Nope, stock modules here.

> At the bottom of your email, you say / is UFS and /usr is ZFS, but I guess we
> have the extra fun of wondering what is under /usr on your /? If you have a
> pre-ZFS /usr that is populated by something now presumably very old (because
> all the new, current stuff went onto ZFS /usr, now unavailable).

There is no populated directory /usr on the UFS file-system. This install was created on a fresh NVME disk based on an existing install on a spinning platter. The install happened with /usr mounted at the ZFS file-system.

I had to copy over several files from /etc and /usr/local/etc and re-installed the most important packages. This was admittedly a bit messy, it is possible that I forgot to copy something over.
(Originally my intention was to dd the contents of the spinning disk over, but apparently that disk has a few wonky sectors, dd failed after a few device timeouts)

I did sort-of manage to fix things, but recent kernels keep causing the same issue:

I noticed that uname -a said I was at revision 366335, while I had the source tree up-to-date. For a test, I reverted back to that revision and went through:

  make buildworld
  make buildkernel

Which broke on /usr/local/sys/drm-current-kmod, which I turned out to have installed through pkg. There have been changes to the linux_kpi shortly after above revision - probably what broke compatibility between HEAD and r366335.

After removing that pkg, the kernel built and installed, world installed fine too and I have a working system again, with kernel and world in sync.

So I tried again to move to HEAD:

  cd /usr/src
  svn up
  make buildworld -j12
  make buildkernel -j12
  make installkernel
  shutdown -r now

  mount -u /
  zpool import -Nf system (my /usr FS)

  KLD zfs.ko: depends on kernel - not available or version mismatch
  linker_load_file: /boot/kernel/zfs.ko - unsupported file type

>> Which results in dmesg messages:
>>
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>
> Be sure to check
Re: panic: general protection fault from uipc_sockaddr+0x4c
On 12/8/20, Mark Johnston wrote:
> On Tue, Dec 08, 2020 at 04:40:16PM +0100, Mateusz Guzik wrote:
>> I think this is a long standing bug against exiting processes.
>>
>> filedesc_out only increments *hold* count, but that does not prevent
>> fdescfree_fds from progressing and freeing everything without any
>> locks held.
>
> I think it is fallout from r36: before that, fdescfree() acquired
> and released the exclusive fd table lock between decrementing
> fdp->fd_refcount and calling fdescfree_fds(). This would serialize with
> the loop in kern_proc_filedesc_out(), which checks fdp->fd_refcount > 0
> at the beginning of each iteration. Now there is no serialization and
> they can race.
>

Oh, I forgot that consumers keep checking fd_refcount. In that case it would probably be best to add sx_wait_unlocked.

>> A hotfix (for mfc) would add locking around it, but a long term fix
>> should wait for hold count to drain. By that point there can't be any
>> new arrivals due to:
>>
>> PROC_LOCK(p);
>> p->p_fd = NULL;
>> PROC_UNLOCK(p);
>>
>> I'll code both later today.
>

--
Mateusz Guzik
Re: panic: general protection fault from uipc_sockaddr+0x4c
On Tue, Dec 08, 2020 at 04:40:16PM +0100, Mateusz Guzik wrote:
> I think this is a long standing bug against exiting processes.
>
> filedesc_out only increments *hold* count, but that does not prevent
> fdescfree_fds from progressing and freeing everything without any
> locks held.

I think it is fallout from r36: before that, fdescfree() acquired and released the exclusive fd table lock between decrementing fdp->fd_refcount and calling fdescfree_fds(). This would serialize with the loop in kern_proc_filedesc_out(), which checks fdp->fd_refcount > 0 at the beginning of each iteration. Now there is no serialization and they can race.

> A hotfix (for mfc) would add locking around it, but a long term fix
> should wait for hold count to drain. By that point there can't be any
> new arrivals due to:
>
> PROC_LOCK(p);
> p->p_fd = NULL;
> PROC_UNLOCK(p);
>
> I'll code both later today.
Re: panic: general protection fault from uipc_sockaddr+0x4c
On Tue, Dec 08, 2020 at 10:30:41AM -0500, Mark Johnston wrote: > On Tue, Dec 08, 2020 at 12:47:18PM +0100, Peter Holm wrote: > > I just got this panic: > > > > Fatal trap 9: general protection fault while in kernel mode > > cpuid = 9; apic id = 09 > > instruction pointer = 0x20:0x80bc6e22 > > stack pointer = 0x28:0xfe0698887630 > > frame pointer = 0x28:0xfe06988876b0 > > code segment = base 0x0, limit 0xf, type 0x1b > >= DPL 0, pres 1, long 1, def32 0, gran 1 > > processor eflags = interrupt enabled, resume, IOPL = 0 > > current process = 45966 (fstat) > > trap number = 9 > > panic: general protection fault > > cpuid = 9 > > time = 1607416693 > > KDB: stack backtrace: > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > > 0xfe0698887340 > > vpanic() at vpanic+0x181/frame 0xfe0698887390 > > panic() at panic+0x43/frame 0xfe06988873f0 > > trap_fatal() at trap_fatal+0x387/frame 0xfe0698887450 > > trap() at trap+0xa4/frame 0xfe0698887560 > > calltrap() at calltrap+0x8/frame 0xfe0698887560 > > --- trap 0x9, rip = 0x80bc6e22, rsp = 0xfe0698887630, rbp = > > 0xfe06988876b0 --- > > __mtx_lock_sleep() at __mtx_lock_sleep+0xd2/frame 0xfe06988876b0 > > __mtx_lock_flags() at __mtx_lock_flags+0xe5/frame 0xfe0698887700 > > uipc_sockaddr() at uipc_sockaddr+0x4c/frame 0xfe0698887730 > > soo_fill_kinfo() at soo_fill_kinfo+0x11e/frame 0xfe0698887770 > > kern_proc_filedesc_out() at kern_proc_filedesc_out+0xb57/frame > > 0xfe0698887810 > > sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x7d/frame > > 0xfe0698887890 > > sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame > > 0xfe06988878e0 > > sysctl_root() at sysctl_root+0x20d/frame 0xfe0698887960 > > userland_sysctl() at userland_sysctl+0x180/frame 0xfe0698887a10 > > sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0698887ac0 > > amd64_syscall() at amd64_syscall+0x147/frame 0xfe0698887bf0 > > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe0698887bf0 > > --- syscall (202, FreeBSD 
ELF64, sys___sysctl), rip = 0x8003948ea, rsp =
> > 0x7fffc138, rbp = 0x7fffc170 ---
> >
> > https://people.freebsd.org/~pho/stress/log/log0004.txt
>
> So here the unpcb is freed, and indeed the file itself has been closed:
>
> $3 = {f_flag = 0x3, f_count = 0x0, f_data = 0x0, f_ops = 0x81901f50
> ,
> f_vnode = 0x0, f_cred = 0xf80248beb600, f_type = 0x2,
> f_vnread_flags = 0x0,
> {f_seqcount = {0x0, 0x0}, f_pipegen = 0x0}, f_nextoff = {0x0, 0x0},
> f_vnun = {fvn_cdevpriv = 0x0, fvn_advice = 0x0}, f_offset = 0x0}
>
> However, it must have happened very recently because soo_fill_kinfo()
> dereferences fp->f_data and yet we did not panic due to a null
> dereference.
>
> kern_proc_filedesc_out() holds the fdtable shared lock throughout all of
> this, which is supposed to prevent the table entry from being freed
> since that requires the exclusive lock.
>
> Could you show fdp->fd_ofiles[3] and fdp->fd_map[0] from frame 26?

Sure:

(kgdb) p fdp->fd_files->fdt_ofiles[3]
$1 = {fde_file = 0xf807306fd0f0, fde_caps = {fc_rights = {cr_rights = {0x0, 0x0}}, fc_ioctls = 0x0, fc_nioctls = 0x0, fc_fcntls = 0x0}, fde_flags = 0x0, fde_seqc = 0x2}
(kgdb) p fdp->fd_map[0]
$2 = 0x1f
(kgdb)

- Peter
Re: panic: general protection fault from uipc_sockaddr+0x4c
On Tue, Dec 8, 2020 at 9:48 AM Mark Johnston wrote:
>
> On Tue, Dec 08, 2020 at 09:39:05AM -0600, Kyle Evans wrote:
> > On Tue, Dec 8, 2020 at 9:30 AM Mark Johnston wrote:
> > > kern_proc_filedesc_out() holds the fdtable shared lock throughout all of
> > > this, which is supposed to prevent the table entry from being freed
> > > since that requires the exclusive lock.
> > >
> >
> > export_file_to_sb drops the lock without it or kern_proc_filedesc_out
> > holding the file it's about to look at, though.
>
> Yes, but that's after it calls fo_fill_kinfo(). At that point it has
> already collected the to-be-exported info in an sbuf.

Whoops, indeed, sorry.
Re: panic: general protection fault from uipc_sockaddr+0x4c
On Tue, Dec 08, 2020 at 09:39:05AM -0600, Kyle Evans wrote:
> On Tue, Dec 8, 2020 at 9:30 AM Mark Johnston wrote:
> > kern_proc_filedesc_out() holds the fdtable shared lock throughout all of
> > this, which is supposed to prevent the table entry from being freed
> > since that requires the exclusive lock.
> >
>
> export_file_to_sb drops the lock without it or kern_proc_filedesc_out
> holding the file it's about to look at, though.

Yes, but that's after it calls fo_fill_kinfo(). At that point it has already collected the to-be-exported info in an sbuf.
Re: KLD zfs.ko: depends on kernel - not available or version mismatch
On Tue, Dec 08, 2020 at 08:56:25AM +0100, Alban Hertroys wrote:
> This seems to have gotten lost in the moderation queue, but after a week I am
> no closer to a solution, so here’s a resend:
>
> I’ve been trying to get a fresh world running (for the eventual purpose of
> running amdgpu against my recent graphics adapter), but I run into trouble
> with core loadable kernel modules, such as zfs.ko from the subject. It also
> happens with other modules that I tried randomly, for example, geom_mirror.ko.
>
> I updated to the latest current using svn up in /usr/src, then:
> make clean
> make buildworld kernel -j12
> shutdown -r now
>
> boot to single user mode
>
> kldload zfs

I'm not sure you've provided enough information for a one-shot armchair diagnosis, but some things seem factually wrong. For example, my normal rebuild procedure is:

  cd /usr/src && make buildworld && make buildkernel
  make installkernel
  shutdown -r now

  cd /usr/src && mergemaster -pi
  make installworld
  mergemaster -Fi
  make -DBATCH_DELETE_OLD_FILES delete-old
  shutdown -r now

  cd /usr/src && make -DBATCH_DELETE_OLD_FILES delete-old-libs

(I'm on a desktop system here. You haven't described your setup.)

You didn't say that you've installed the new kernel, which at least starts you down the road towards a driver/kernel mismatch. You presumably have a non-ZFS boot+root. Did you mess around with the ZFS from ports (ZoL -> ZoF) at some point so you're not using the kernel's ZFS drivers? What ZFS entries do you have in /etc/loader.conf, /etc/rc.conf, and some of the variants that may get dragged in? (see rc.conf(5) for possibilities)

At the bottom of your email, you say / is UFS and /usr is ZFS, but I guess we have the extra fun of wondering what is under /usr on your /? If you have a pre-ZFS /usr that is populated by something now presumably very old (because all the new, current stuff went onto ZFS /usr, now unavailable).

> Which results in dmesg messages:
>
> KLD zfs.ko: depends on kernel - not available or version mismatch
> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> KLD zfs.ko: depends on kernel - not available or version mismatch
> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> KLD zfs.ko: depends on kernel - not available or version mismatch
> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> KLD zfs.ko: depends on kernel - not available or version mismatch
> linker_load_file: /boot/kernel/zfs.ko - unsupported file type

Be sure to check out /var/log/messages for extra issues. For example, with the bug I mentioned below, I couldn't load my nvidia driver and that manifested as one driver having issues because it depended on another, which had the root of the problem.

> I can load the zfs kernel module from kernel.old just fine:
>
> ZFS filesystem version: 5
> ZFS storage pool version: features support (5000)

I kicked my more bleeding-edge system over from 12.2-rel (r366954) up into 13.0-current (r367044, 1300123) on 2020/10/26. OpenZFS kicked in 2020/8/24? I think the CFT was ~2018/8/21, not sure when we had the OpenZFS ports. Current bumps the ABI version pretty frequently so I'd think you'd have tripped across versioning issues a long time ago if you had some drivers not being rebuilt.

> This happens with any kernel module I’ve tried, such as geom_mirror and
> amdgpu (from ports/graphics/drm-current-kmod - the latter causes a kernel
> panic with kernel.old BTW).
>
> I’ve gone back as far as Oct 7 (before changes to kern/elf_load_obj.c off
> the top of my head), looked at mailing list archives and forums etc, all to
> no avail.
>
> I have / on UFS+J and /usr on ZFS and nothing in /etc/src.conf. I had
> /etc/malloc.conf with the recommended symlink from UPDATING, but the same
> happens with that moved out of the way. Nothing seems to help.
>
> Do I need to go back further to get into a usable state or is there something
> else I should be doing?

With very few exceptions (bug 250897, 2020/11/6), I've found 13-current bootable since 10/26 (up through my current system, 13.0 r368388 (2020/12/6)). You obviously need to make sure that any extra drivers you add in are compiled against the kernel, but ZFS is typically one of those.
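Since the message mentions that -CURRENT bumps its version number often (1300123 in the example above), one quick check is whether the running kernel's version number agrees with the one in the source tree the modules were built from. The sketch below assumes that the tree defines __FreeBSD_version in sys/sys/param.h and that the running value is available from sysctl kern.osreldate; the function names are made up for illustration, and a mismatch is only one plausible cause of the "version mismatch" message, not a confirmed diagnosis.

```sh
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# src_osreldate PARAM_H: print the number from the line
# "#define __FreeBSD_version NNNNNNN" in the given header file.
src_osreldate() {
    awk '$1 == "#define" && $2 == "__FreeBSD_version" { print $3; exit }' "$1"
}

# versions_agree RUNNING PARAM_H: succeed only if the header defines a
# version and it equals the RUNNING value.
versions_agree() {
    src=$(src_osreldate "$2")
    [ -n "$src" ] && [ "$1" = "$src" ]
}
```

A possible invocation is versions_agree "$(sysctl -n kern.osreldate)" /usr/src/sys/sys/param.h. If it fails right after installkernel and a reboot, the kernel that booted is probably not the one that was just built.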
Re: panic: general protection fault from uipc_sockaddr+0x4c
I think this is a long standing bug against exiting processes. filedesc_out only increments *hold* count, but that does not prevent fdescfree_fds from progressing and freeing everything without any locks held. A hotfix (for mfc) would add locking around it, but a long term fix should wait for hold count to drain. By that point there can't be any new arrivals due to: PROC_LOCK(p); p->p_fd = NULL; PROC_UNLOCK(p); I'll code both later today. On 12/8/20, Mark Johnston wrote: > On Tue, Dec 08, 2020 at 12:47:18PM +0100, Peter Holm wrote: >> I just got this panic: >> >> Fatal trap 9: general protection fault while in kernel mode >> cpuid = 9; apic id = 09 >> instruction pointer = 0x20:0x80bc6e22 >> stack pointer = 0x28:0xfe0698887630 >> frame pointer = 0x28:0xfe06988876b0 >> code segment = base 0x0, limit 0xf, type 0x1b >>= DPL 0, pres 1, long 1, def32 0, gran 1 >> processor eflags = interrupt enabled, resume, IOPL = 0 >> current process = 45966 (fstat) >> trap number = 9 >> panic: general protection fault >> cpuid = 9 >> time = 1607416693 >> KDB: stack backtrace: >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame >> 0xfe0698887340 >> vpanic() at vpanic+0x181/frame 0xfe0698887390 >> panic() at panic+0x43/frame 0xfe06988873f0 >> trap_fatal() at trap_fatal+0x387/frame 0xfe0698887450 >> trap() at trap+0xa4/frame 0xfe0698887560 >> calltrap() at calltrap+0x8/frame 0xfe0698887560 >> --- trap 0x9, rip = 0x80bc6e22, rsp = 0xfe0698887630, rbp = >> 0xfe06988876b0 --- >> __mtx_lock_sleep() at __mtx_lock_sleep+0xd2/frame 0xfe06988876b0 >> __mtx_lock_flags() at __mtx_lock_flags+0xe5/frame 0xfe0698887700 >> uipc_sockaddr() at uipc_sockaddr+0x4c/frame 0xfe0698887730 >> soo_fill_kinfo() at soo_fill_kinfo+0x11e/frame 0xfe0698887770 >> kern_proc_filedesc_out() at kern_proc_filedesc_out+0xb57/frame >> 0xfe0698887810 >> sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x7d/frame >> 0xfe0698887890 >> sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame >> 
0xfe06988878e0
>> sysctl_root() at sysctl_root+0x20d/frame 0xfe0698887960
>> userland_sysctl() at userland_sysctl+0x180/frame 0xfe0698887a10
>> sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0698887ac0
>> amd64_syscall() at amd64_syscall+0x147/frame 0xfe0698887bf0
>> fast_syscall_common() at fast_syscall_common+0xf8/frame
>> 0xfe0698887bf0
>> --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x8003948ea, rsp =
>> 0x7fffc138, rbp = 0x7fffc170 ---
>>
>> https://people.freebsd.org/~pho/stress/log/log0004.txt
>
> So here the unpcb is freed, and indeed the file itself has been closed:
>
> $3 = {f_flag = 0x3, f_count = 0x0, f_data = 0x0, f_ops = 0x81901f50
> ,
> f_vnode = 0x0, f_cred = 0xf80248beb600, f_type = 0x2,
> f_vnread_flags = 0x0,
> {f_seqcount = {0x0, 0x0}, f_pipegen = 0x0}, f_nextoff = {0x0, 0x0},
> f_vnun = {fvn_cdevpriv = 0x0, fvn_advice = 0x0}, f_offset = 0x0}
>
> However, it must have happened very recently because soo_fill_kinfo()
> dereferences fp->f_data and yet we did not panic due to a null
> dereference.
>
> kern_proc_filedesc_out() holds the fdtable shared lock throughout all of
> this, which is supposed to prevent the table entry from being freed
> since that requires the exclusive lock.
>
> Could you show fdp->fd_ofiles[3] and fdp->fd_map[0] from frame 26?

--
Mateusz Guzik
Re: panic: general protection fault from uipc_sockaddr+0x4c
On Tue, Dec 8, 2020 at 9:30 AM Mark Johnston wrote:
>
> On Tue, Dec 08, 2020 at 12:47:18PM +0100, Peter Holm wrote:
> > I just got this panic:
> >
> > Fatal trap 9: general protection fault while in kernel mode
> > cpuid = 9; apic id = 09
> > instruction pointer = 0x20:0x80bc6e22
> > stack pointer = 0x28:0xfe0698887630
> > frame pointer = 0x28:0xfe06988876b0
> > code segment = base 0x0, limit 0xf, type 0x1b
> >              = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 45966 (fstat)
> > trap number = 9
> > panic: general protection fault
> > cpuid = 9
> > time = 1607416693
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0698887340
> > vpanic() at vpanic+0x181/frame 0xfe0698887390
> > panic() at panic+0x43/frame 0xfe06988873f0
> > trap_fatal() at trap_fatal+0x387/frame 0xfe0698887450
> > trap() at trap+0xa4/frame 0xfe0698887560
> > calltrap() at calltrap+0x8/frame 0xfe0698887560
> > --- trap 0x9, rip = 0x80bc6e22, rsp = 0xfe0698887630, rbp = 0xfe06988876b0 ---
> > __mtx_lock_sleep() at __mtx_lock_sleep+0xd2/frame 0xfe06988876b0
> > __mtx_lock_flags() at __mtx_lock_flags+0xe5/frame 0xfe0698887700
> > uipc_sockaddr() at uipc_sockaddr+0x4c/frame 0xfe0698887730
> > soo_fill_kinfo() at soo_fill_kinfo+0x11e/frame 0xfe0698887770
> > kern_proc_filedesc_out() at kern_proc_filedesc_out+0xb57/frame 0xfe0698887810
> > sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x7d/frame 0xfe0698887890
> > sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame 0xfe06988878e0
> > sysctl_root() at sysctl_root+0x20d/frame 0xfe0698887960
> > userland_sysctl() at userland_sysctl+0x180/frame 0xfe0698887a10
> > sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0698887ac0
> > amd64_syscall() at amd64_syscall+0x147/frame 0xfe0698887bf0
> > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe0698887bf0
> > --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x8003948ea, rsp = 0x7fffc138, rbp = 0x7fffc170 ---
> >
> > https://people.freebsd.org/~pho/stress/log/log0004.txt
>
> So here the unpcb is freed, and indeed the file itself has been closed:
>
> $3 = {f_flag = 0x3, f_count = 0x0, f_data = 0x0, f_ops = 0x81901f50 ,
>   f_vnode = 0x0, f_cred = 0xf80248beb600, f_type = 0x2,
>   f_vnread_flags = 0x0,
>   {f_seqcount = {0x0, 0x0}, f_pipegen = 0x0}, f_nextoff = {0x0, 0x0},
>   f_vnun = {fvn_cdevpriv = 0x0, fvn_advice = 0x0}, f_offset = 0x0}
>
> However, it must have happened very recently because soo_fill_kinfo()
> dereferences fp->f_data and yet we did not panic due to a null
> dereference.
>
> kern_proc_filedesc_out() holds the fdtable shared lock throughout all of
> this, which is supposed to prevent the table entry from being freed
> since that requires the exclusive lock.
>

export_file_to_sb() drops the lock, though, and neither it nor
kern_proc_filedesc_out() holds a reference to the file it is about to
look at.

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
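To make the suspected window concrete, here is a minimal user-space model in Python. It is only a sketch, not kernel code and not the fix that was committed: `File`, `FdTable`, `export_unsafe`, `export_safe` and the `between` callback are all hypothetical names, a plain mutex stands in for the shared/exclusive fdtable lock, and the string "unpcb" is just a placeholder payload. It shows why a pointer fetched under the lock is only safe to use after the lock is dropped if a reference was taken first.

```python
import threading


class File:
    """Toy stand-in for a struct file with a reference count."""

    def __init__(self, data):
        self.refcount = 1   # the reference owned by the descriptor table slot
        self.data = data    # models f_data

    def hold(self):
        # Take an extra reference; only legal while the file is still alive.
        assert self.refcount > 0
        self.refcount += 1

    def drop(self):
        # Release one reference; the last release tears the file down.
        self.refcount -= 1
        if self.refcount == 0:
            self.data = None


class FdTable:
    """Toy descriptor table guarded by a single lock (assumption: a plain
    mutex instead of the kernel's shared/exclusive lock)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.files = {}

    def install(self, fd, fp):
        with self.lock:
            self.files[fd] = fp

    def close(self, fd):
        with self.lock:
            fp = self.files.pop(fd)
        fp.drop()


def export_unsafe(table, fd, between):
    """Look up the file, drop the lock, then use the file without a reference."""
    with table.lock:
        fp = table.files[fd]
    between()               # lock is dropped here; a close may run
    return fp.data


def export_safe(table, fd, between):
    """Same lookup, but take a reference before the lock is dropped."""
    with table.lock:
        fp = table.files[fd]
        fp.hold()
    try:
        between()
        return fp.data
    finally:
        fp.drop()
```

Passing `lambda: table.close(3)` as `between` plays the role of another thread closing descriptor 3 inside the window. With `export_unsafe` the payload is already gone by the time it is read; with `export_safe` it survives until the exporter drops its own reference.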
Re: panic: general protection fault from uipc_sockaddr+0x4c
On Tue, Dec 08, 2020 at 12:47:18PM +0100, Peter Holm wrote:
> I just got this panic:
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 9; apic id = 09
> instruction pointer = 0x20:0x80bc6e22
> stack pointer = 0x28:0xfe0698887630
> frame pointer = 0x28:0xfe06988876b0
> code segment = base 0x0, limit 0xf, type 0x1b
>              = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 45966 (fstat)
> trap number = 9
> panic: general protection fault
> cpuid = 9
> time = 1607416693
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0698887340
> vpanic() at vpanic+0x181/frame 0xfe0698887390
> panic() at panic+0x43/frame 0xfe06988873f0
> trap_fatal() at trap_fatal+0x387/frame 0xfe0698887450
> trap() at trap+0xa4/frame 0xfe0698887560
> calltrap() at calltrap+0x8/frame 0xfe0698887560
> --- trap 0x9, rip = 0x80bc6e22, rsp = 0xfe0698887630, rbp = 0xfe06988876b0 ---
> __mtx_lock_sleep() at __mtx_lock_sleep+0xd2/frame 0xfe06988876b0
> __mtx_lock_flags() at __mtx_lock_flags+0xe5/frame 0xfe0698887700
> uipc_sockaddr() at uipc_sockaddr+0x4c/frame 0xfe0698887730
> soo_fill_kinfo() at soo_fill_kinfo+0x11e/frame 0xfe0698887770
> kern_proc_filedesc_out() at kern_proc_filedesc_out+0xb57/frame 0xfe0698887810
> sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x7d/frame 0xfe0698887890
> sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame 0xfe06988878e0
> sysctl_root() at sysctl_root+0x20d/frame 0xfe0698887960
> userland_sysctl() at userland_sysctl+0x180/frame 0xfe0698887a10
> sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0698887ac0
> amd64_syscall() at amd64_syscall+0x147/frame 0xfe0698887bf0
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe0698887bf0
> --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x8003948ea, rsp = 0x7fffc138, rbp = 0x7fffc170 ---
>
> https://people.freebsd.org/~pho/stress/log/log0004.txt

So here the unpcb is freed, and indeed the file itself has been closed:

$3 = {f_flag = 0x3, f_count = 0x0, f_data = 0x0, f_ops = 0x81901f50 ,
  f_vnode = 0x0, f_cred = 0xf80248beb600, f_type = 0x2,
  f_vnread_flags = 0x0,
  {f_seqcount = {0x0, 0x0}, f_pipegen = 0x0}, f_nextoff = {0x0, 0x0},
  f_vnun = {fvn_cdevpriv = 0x0, fvn_advice = 0x0}, f_offset = 0x0}

However, it must have happened very recently because soo_fill_kinfo()
dereferences fp->f_data and yet we did not panic due to a null
dereference.

kern_proc_filedesc_out() holds the fdtable shared lock throughout all of
this, which is supposed to prevent the table entry from being freed
since that requires the exclusive lock.

Could you show fdp->fd_ofiles[3] and fdp->fd_map[0] from frame 26?
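For anyone wondering why fd_map[0] is useful next to fd_ofiles[3]: as far as I understand it, fd_map is a bitmap with one bit per descriptor, so word 0 tells you whether descriptor 3 is still marked as allocated. The sketch below is Python, not kernel code, and rests on two assumptions of mine: that a bitmap word is 64 bits wide on amd64, and that the slot/bit arithmetic is a plain divide and modulo. The helper names and the sample bitmap values are hypothetical and not taken from the crash dump.

```python
# Assumption: one bitmap word is an unsigned long, 64 bits on amd64.
NDENTRIES = 64


def fd_slot(fd):
    """Index of the bitmap word that covers descriptor fd."""
    return fd // NDENTRIES


def fd_bit(fd):
    """Mask selecting descriptor fd inside its bitmap word."""
    return 1 << (fd % NDENTRIES)


def fd_is_used(fd_map, fd):
    """True if descriptor fd is marked allocated in the bitmap."""
    return (fd_map[fd_slot(fd)] & fd_bit(fd)) != 0
```

So if fd_map[0] printed as 0xf, descriptors 0 through 3 would all still be marked in use; a value of 0x7 would mean descriptor 3 had already been released from the table.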
panic: general protection fault from uipc_sockaddr+0x4c
I just got this panic:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 9; apic id = 09
instruction pointer = 0x20:0x80bc6e22
stack pointer = 0x28:0xfe0698887630
frame pointer = 0x28:0xfe06988876b0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 45966 (fstat)
trap number = 9
panic: general protection fault
cpuid = 9
time = 1607416693
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0698887340
vpanic() at vpanic+0x181/frame 0xfe0698887390
panic() at panic+0x43/frame 0xfe06988873f0
trap_fatal() at trap_fatal+0x387/frame 0xfe0698887450
trap() at trap+0xa4/frame 0xfe0698887560
calltrap() at calltrap+0x8/frame 0xfe0698887560
--- trap 0x9, rip = 0x80bc6e22, rsp = 0xfe0698887630, rbp = 0xfe06988876b0 ---
__mtx_lock_sleep() at __mtx_lock_sleep+0xd2/frame 0xfe06988876b0
__mtx_lock_flags() at __mtx_lock_flags+0xe5/frame 0xfe0698887700
uipc_sockaddr() at uipc_sockaddr+0x4c/frame 0xfe0698887730
soo_fill_kinfo() at soo_fill_kinfo+0x11e/frame 0xfe0698887770
kern_proc_filedesc_out() at kern_proc_filedesc_out+0xb57/frame 0xfe0698887810
sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x7d/frame 0xfe0698887890
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame 0xfe06988878e0
sysctl_root() at sysctl_root+0x20d/frame 0xfe0698887960
userland_sysctl() at userland_sysctl+0x180/frame 0xfe0698887a10
sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0698887ac0
amd64_syscall() at amd64_syscall+0x147/frame 0xfe0698887bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe0698887bf0
--- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x8003948ea, rsp = 0x7fffc138, rbp = 0x7fffc170 ---

https://people.freebsd.org/~pho/stress/log/log0004.txt

- Peter