Kristof,

With a GENERIC kernel it does NOT happen. I do have a different iflib related panic at reboot, but I’ll report that separately.

I brought the two config files closer together and found out that if I remove if_bridge from the config file and have it loaded dynamically when the bridge is created, the problem no longer happens and everything works ok.

Peter

> On 20 Nov 2020, at 15:53, Kristof Provost <k...@freebsd.org> wrote:
>
> I still can’t reproduce that panic.
>
> Does it happen immediately after you start a vnet jail?
>
> Does it also happen with a GENERIC kernel?
>
> Regards,
> Kristof
>
> On 20 Nov 2020, at 14:53, Peter Blok wrote:
>
>> The panic with ipsec code in the backtrace was already very strange. I was
>> using IPsec, but only on one interface totally separate from the members of
>> the bridge as well as the bridge itself. The jails were not doing any ipsec
>> as well. Note that panic was a while ago and it was after the 1st bridge
>> epochification was done on stable-12 which was later backed out
>>
>> Today the system is no longer using ipsec, but it is still compiled in. I
>> can remove it if need be for a test
>>
>>
>> src.conf
>> WITHOUT_KERBEROS=yes
>> WITHOUT_GSSAPI=yes
>> WITHOUT_SENDMAIL=true
>> WITHOUT_MAILWRAPPER=true
>> WITHOUT_DMAGENT=true
>> WITHOUT_GAMES=true
>> WITHOUT_IPFILTER=true
>> WITHOUT_UNBOUND=true
>> WITHOUT_PROFILE=true
>> WITHOUT_ATM=true
>> WITHOUT_BSNMP=true
>> #WITHOUT_CROSS_COMPILER=true
>> WITHOUT_DEBUG_FILES=true
>> WITHOUT_DICT=true
>> WITHOUT_FLOPPY=true
>> WITHOUT_HTML=true
>> WITHOUT_HYPERV=true
>> WITHOUT_NDIS=true
>> WITHOUT_NIS=true
>> WITHOUT_PPP=true
>> WITHOUT_TALK=true
>> WITHOUT_TESTS=true
>> WITHOUT_WIRELESS=true
>> #WITHOUT_LIB32=true
>> WITHOUT_LPR=true
>>
>> make.conf
>> KERNCONF=BHYVE
>> MODULES_OVERRIDE=opensolaris dtrace zfs vmm nmdm if_bridge bridgestp if_vxlan pflog libmchain libiconv smbfs linux linux64 linux_common linuxkpi linprocfs linsysfs ext2fs
>> DEFAULT_VERSIONS+=perl5=5.30 mysql=5.7 python=3.8 python3=3.8
>> OPTIONS_UNSET=DOCS NLS MANPAGES
>>
>> BHYVE
>> cpu HAMMER
>> ident BHYVE
>>
>> makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
>> makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support
>>
>> options CAMDEBUG
>>
>> options SCHED_ULE # ULE scheduler
>> options PREEMPTION # Enable kernel thread preemption
>> options INET # InterNETworking
>> options INET6 # IPv6 communications protocols
>> options IPSEC
>> options TCP_OFFLOAD # TCP offload
>> options TCP_RFC7413 # TCP FASTOPEN
>> options SCTP # Stream Control Transmission Protocol
>> options FFS # Berkeley Fast Filesystem
>> options SOFTUPDATES # Enable FFS soft updates support
>> options UFS_ACL # Support for access control lists
>> options UFS_DIRHASH # Improve performance on big directories
>> options UFS_GJOURNAL # Enable gjournal-based UFS journaling
>> options QUOTA # Enable disk quotas for UFS
>> options SUIDDIR
>> options NFSCL # Network Filesystem Client
>> options NFSD # Network Filesystem Server
>> options NFSLOCKD # Network Lock Manager
>> options MSDOSFS # MSDOS Filesystem
>> options CD9660 # ISO 9660 Filesystem
>> options FUSEFS
>> options NULLFS # NULL filesystem
>> options UNIONFS
>> options FDESCFS # File descriptor filesystem
>> options PROCFS # Process filesystem (requires PSEUDOFS)
>> options PSEUDOFS # Pseudo-filesystem framework
>> options GEOM_PART_GPT # GUID Partition Tables.
>> options GEOM_RAID # Soft RAID functionality.
>> options GEOM_LABEL # Provides labelization
>> options GEOM_ELI # Disk encryption.
>> options COMPAT_FREEBSD32 # Compatible with i386 binaries
>> options COMPAT_FREEBSD4 # Compatible with FreeBSD4
>> options COMPAT_FREEBSD5 # Compatible with FreeBSD5
>> options COMPAT_FREEBSD6 # Compatible with FreeBSD6
>> options COMPAT_FREEBSD7 # Compatible with FreeBSD7
>> options COMPAT_FREEBSD9 # Compatible with FreeBSD9
>> options COMPAT_FREEBSD10 # Compatible with FreeBSD10
>> options COMPAT_FREEBSD11 # Compatible with FreeBSD11
>> options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
>> options KTRACE # ktrace(1) support
>> options STACK # stack(9) support
>> options SYSVSHM # SYSV-style shared memory
>> options SYSVMSG # SYSV-style message queues
>> options SYSVSEM # SYSV-style semaphores
>> options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
>> options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed.
>> options KBD_INSTALL_CDEV # install a CDEV entry in /dev
>> options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
>> options AUDIT # Security event auditing
>> options CAPABILITY_MODE # Capsicum capability mode
>> options CAPABILITIES # Capsicum capabilities
>> options MAC # TrustedBSD MAC Framework
>> options MAC_PORTACL
>> options MAC_NTPD
>> options KDTRACE_FRAME # Ensure frames are compiled in
>> options KDTRACE_HOOKS # Kernel DTrace hooks
>> options DDB_CTF # Kernel ELF linker loads CTF data
>> options INCLUDE_CONFIG_FILE # Include this file in kernel
>>
>> # Debugging support. Always need this:
>> options KDB # Enable kernel debugger support.
>> options KDB_TRACE # Print a stack trace for a panic.
>> options KDB_UNATTENDED
>>
>> # Make an SMP-capable kernel by default
>> options SMP # Symmetric MultiProcessor Kernel
>> options EARLY_AP_STARTUP
>>
>> # CPU frequency control
>> device cpufreq
>> device cpuctl
>> device coretemp
>>
>> # Bus support.
>> device acpi
>> options ACPI_DMAR
>> device pci
>> options PCI_IOV # PCI SR-IOV support
>>
>> device iicbus
>> device iicbb
>>
>> device iic
>> device ic
>> device iicsmb
>>
>> device ichsmb
>> device smbus
>> device smb
>>
>> #device jedec_dimm
>>
>> # ATA controllers
>> device ahci # AHCI-compatible SATA controllers
>> device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
>>
>> # SCSI Controllers
>> device mps # LSI-Logic MPT-Fusion 2
>>
>> # ATA/SCSI peripherals
>> device scbus # SCSI bus (required for ATA/SCSI)
>> device da # Direct Access (disks)
>> device cd # CD
>> device pass # Passthrough device (direct ATA/SCSI access)
>> device ses # Enclosure Services (SES and SAF-TE)
>> device sg
>>
>> device cfiscsi
>> device ctl # CAM Target Layer
>> device iscsi
>>
>> # atkbdc0 controls both the keyboard and the PS/2 mouse
>> device atkbdc # AT keyboard controller
>> device atkbd # AT keyboard
>> device psm # PS/2 mouse
>>
>> device kbdmux # keyboard multiplexer
>>
>> # vt is the new video console driver
>> device vt
>> device vt_vga
>> device vt_efifb
>>
>> # Serial (COM) ports
>> device uart # Generic UART driver
>>
>> # PCI/PCI-X/PCIe Ethernet NICs that use iflib infrastructure
>> device iflib
>> device em # Intel PRO/1000 Gigabit Ethernet Family
>> device ix # Intel PRO/10GbE PCIE PF Ethernet
>>
>> # Network stack virtualization.
>> options VIMAGE
>>
>> # Pseudo devices.
>> device crypto
>> device cryptodev
>> device loop # Network loopback
>> device random # Entropy device
>> device padlock_rng # VIA Padlock RNG
>> device rdrand_rng # Intel Bull Mountain RNG
>> device ipmi
>> device smbios
>> device vpd
>> device aesni # AES-NI OpenCrypto module
>> device ether # Ethernet support
>> device lagg
>> device vlan # 802.1Q VLAN support
>> device tuntap # Packet tunnel.
>> device md # Memory "disks"
>> device gif # IPv6 and IPv4 tunneling
>> device firmware # firmware assist module
>>
>> device pf
>> #device pflog
>> #device pfsync
>>
>> # The `bpf' device enables the Berkeley Packet Filter.
>> # Be aware of the administrative consequences of enabling this!
>> # Note that 'bpf' is required for DHCP.
>> device bpf # Berkeley packet filter
>>
>> # The `epair' device implements a virtual back-to-back connected Ethernet
>> # like interface pair.
>> device epair
>>
>> # USB support
>> options USB_DEBUG # enable debug msgs
>> device uhci # UHCI PCI->USB interface
>> device ohci # OHCI PCI->USB interface
>> device ehci # EHCI PCI->USB interface (USB 2.0)
>> device xhci # XHCI PCI->USB interface (USB 3.0)
>> device usb # USB Bus (required)
>> device uhid
>> device ukbd # Keyboard
>> device umass # Disks/Mass storage - Requires scbus and da
>> device ums
>>
>> device filemon
>>
>> device if_bridge
>>
>>> On 20 Nov 2020, at 12:53, Kristof Provost <k...@freebsd.org> wrote:
>>>
>>> Can you share your kernel config file (and src.conf / make.conf if they
>>> exist)?
>>>
>>> This second panic is in the IPSec code. My current thinking is that your
>>> kernel config is triggering a bug that’s manifesting in multiple places,
>>> but not actually caused by those places.
>>>
>>> I’d like to be able to reproduce it so we can debug it.
>>>
>>> Best regards,
>>> Kristof
>>>
>>> On 20 Nov 2020, at 12:02, Peter Blok wrote:
>>>> Hi Kristof,
>>>>
>>>> This is 12-stable. With the previous bridge epochification that was backed
>>>> out my config had a panic too.
>>>>
>>>> I don’t have any local modifications. I did a clean rebuild after removing
>>>> /usr/obj/usr
>>>>
>>>> My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko and
>>>> nmdm.ko as modules. Everything else is statically linked. I have removed
>>>> all drivers not needed for the hardware at hand.
>>>>
>>>> My bridge is between two vlans from the same trunk and the jail epair
>>>> devices as well as the bhyve tap devices.
>>>>
>>>> The panic happens when the jails are starting.
>>>>
>>>> I can try to narrow it down over the weekend and make the crash dump
>>>> available for analysis.
>>>>
>>>> Previously I had the following crash with 363492
>>>>
>>>> kernel trap 12 with interrupts disabled
>>>>
>>>>
>>>> Fatal trap 12: page fault while in kernel mode
>>>> cpuid = 2; apic id = 02
>>>> fault virtual address = 0xffffffff00000410
>>>> fault code = supervisor read data, page not present
>>>> instruction pointer = 0x20:0xffffffff80692326
>>>> stack pointer = 0x28:0xfffffe00c06097b0
>>>> frame pointer = 0x28:0xfffffe00c06097f0
>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>> processor eflags = resume, IOPL = 0
>>>> current process = 2030 (ifconfig)
>>>> trap number = 12
>>>> panic: page fault
>>>> cpuid = 2
>>>> time = 1595683412
>>>> KDB: stack backtrace:
>>>> #0 0xffffffff80698165 at kdb_backtrace+0x65
>>>> #1 0xffffffff8064d67b at vpanic+0x17b
>>>> #2 0xffffffff8064d4f3 at panic+0x43
>>>> #3 0xffffffff809cc311 at trap_fatal+0x391
>>>> #4 0xffffffff809cc36f at trap_pfault+0x4f
>>>> #5 0xffffffff809cb9b6 at trap+0x286
>>>> #6 0xffffffff809a5b28 at calltrap+0x8
>>>> #7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
>>>> #8 0xffffffff8069213a at epoch_wait_preempt+0xaa
>>>> #9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
>>>> #10 0xffffffff8075274f at ifioctl+0x47f
>>>> #11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
>>>> #12 0xffffffff806b5b4a at sys_ioctl+0xfa
>>>> #13 0xffffffff809ccec7 at amd64_syscall+0x387
>>>> #14 0xffffffff809a6450 at fast_syscall_common+0x101
>>>>
>>>>
>>>>
>>>>
>>>>> On 20 Nov 2020, at 11:30, Kristof Provost <k...@freebsd.org> wrote:
>>>>>
>>>>> On 20 Nov 2020, at 11:18, peter.b...@bsd4all.org
>>>>> <mailto:peter.b...@bsd4all.org> wrote:
>>>>>> I’m afraid the last Epoch fix for bridge is not solving the problem ( or
>>>>>> perhaps creates a new ).
>>>>>>
>>>>> We’re talking about the stable/12 branch, right?
>>>>>
>>>>>> This seems to happen when the jail epair is added to the bridge.
>>>>>>
>>>>> There must be something more to it than that. I’ve run the bridge tests
>>>>> on stable/12 without issue, and this is a problem we didn’t see when the
>>>>> bridge epochification initially went into stable/12.
>>>>>
>>>>> Do you have a custom kernel config? Other patches? What exact commands do
>>>>> you run to trigger the panic?
>>>>>
>>>>>> kernel trap 12 with interrupts disabled
>>>>>>
>>>>>>
>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>> cpuid = 6; apic id = 06
>>>>>> fault virtual address = 0xc10
>>>>>> fault code = supervisor read data, page not present
>>>>>> instruction pointer = 0x20:0xffffffff80695e76
>>>>>> stack pointer = 0x28:0xfffffe00bf14e6e0
>>>>>> frame pointer = 0x28:0xfffffe00bf14e720
>>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>> processor eflags = resume, IOPL = 0
>>>>>> current process = 1686 (jail)
>>>>>> trap number = 12
>>>>>> panic: page fault
>>>>>> cpuid = 6
>>>>>> time = 1605811310
>>>>>> KDB: stack backtrace:
>>>>>> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
>>>>>> #1 0xffffffff80650a4b at vpanic+0x17b
>>>>>> #2 0xffffffff806508c3 at panic+0x43
>>>>>> #3 0xffffffff809d0351 at trap_fatal+0x391
>>>>>> #4 0xffffffff809d03af at trap_pfault+0x4f
>>>>>> #5 0xffffffff809cf9f6 at trap+0x286
>>>>>> #6 0xffffffff809a98c8 at calltrap+0x8
>>>>>> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
>>>>>> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
>>>>>> #9 0xffffffff80757d40 at vnet_if_init+0x120
>>>>>> #10 0xffffffff8078c994 at vnet_alloc+0x114
>>>>>> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
>>>>>> #12 0xffffffff80620190 at sys_jail_set+0x40
>>>>>> #13 0xffffffff809d0f07 at amd64_syscall+0x387
>>>>>> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8
>>>>>
>>>>> This panic is rather odd. This isn’t even the bridge code. This is during
>>>>> initial creation of the vnet. I don’t really see how this could even
>>>>> trigger panics.
>>>>> That panic looks as if something corrupted the net_epoch_preempt, by
>>>>> overwriting the epoch->e_epoch. The bridge patches only access this
>>>>> variable through the well-established functions and macros. I see no
>>>>> obvious way that they could corrupt it.
>>>>>
>>>>> Best regards,
>>>>> Kristof
>>>
>>>
>>> _______________________________________________
>>> freebsd-stable@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
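For reference, the workaround reported at the top of this message (dropping the static if_bridge driver and letting the module load when the bridge is created) might look roughly like the fragment below. This is only a sketch under stated assumptions: the file path and the interface name bridge0 are hypothetical, the thread confirms only that if_bridge was removed from the BHYVE config, and the on-demand loading note relies on ifconfig(8) loading if_bridge.ko when a bridge interface is created.

```
# Kernel config BHYVE (path assumed: /usr/src/sys/amd64/conf/BHYVE).
# Comment out the statically linked bridge driver; the rest stays as posted.
#device if_bridge

# make.conf needs no change for this: if_bridge and bridgestp are already
# listed in the MODULES_OVERRIDE line quoted in the thread, so if_bridge.ko
# is still built and installed as a loadable module.

# After rebuilding and installing the kernel, creating the bridge should
# load the module on demand (bridge0 is a hypothetical name):
#   ifconfig bridge0 create
#   kldstat -m if_bridge    # check whether the module is now loaded
```

This does not explain the panic; it only sidesteps it, so a crash dump from the statically linked configuration would still be the useful thing to analyse.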