Kristof,

With a GENERIC kernel it does NOT happen. I do have a different iflib-related
panic at reboot, but I’ll report that separately.

I brought the two config files closer together and found that if I remove
if_bridge from the kernel config file and have it loaded dynamically when the
bridge is created, the problem no longer happens and everything works OK.
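
The change described above could look roughly like this. It is only a sketch, assuming the BHYVE kernel config and make.conf quoted below; bridge0 is a placeholder interface name and the check at the end is illustrative, not taken from the affected system:

```
# Kernel config BHYVE: no longer link the bridge statically
#device         if_bridge

# make.conf: unchanged, if_bridge and bridgestp are already in MODULES_OVERRIDE

# After rebuilding and rebooting (bridge0 is a placeholder name):
ifconfig bridge0 create      # ifconfig(8) tries to load if_bridge.ko on demand
kldstat | grep if_bridge     # check that the module is now loaded
```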

Peter

> On 20 Nov 2020, at 15:53, Kristof Provost <k...@freebsd.org> wrote:
> 
> I still can’t reproduce that panic.
> 
> Does it happen immediately after you start a vnet jail?
> 
> Does it also happen with a GENERIC kernel?
> 
> Regards,
> Kristof
> 
> On 20 Nov 2020, at 14:53, Peter Blok wrote:
> 
>> The panic with IPsec code in the backtrace was already very strange. I was 
>> using IPsec, but only on one interface totally separate from the members of 
>> the bridge as well as the bridge itself. The jails were not doing any IPsec 
>> either. Note that the panic was a while ago, after the first bridge 
>> epochification was done on stable-12, which was later backed out.
>> 
>> Today the system is no longer using IPsec, but it is still compiled in. I 
>> can remove it for a test if need be.
>> 
>> 
>> src.conf
>> WITHOUT_KERBEROS=yes
>> WITHOUT_GSSAPI=yes
>> WITHOUT_SENDMAIL=true
>> WITHOUT_MAILWRAPPER=true
>> WITHOUT_DMAGENT=true
>> WITHOUT_GAMES=true
>> WITHOUT_IPFILTER=true
>> WITHOUT_UNBOUND=true
>> WITHOUT_PROFILE=true
>> WITHOUT_ATM=true
>> WITHOUT_BSNMP=true
>> #WITHOUT_CROSS_COMPILER=true
>> WITHOUT_DEBUG_FILES=true
>> WITHOUT_DICT=true
>> WITHOUT_FLOPPY=true
>> WITHOUT_HTML=true
>> WITHOUT_HYPERV=true
>> WITHOUT_NDIS=true
>> WITHOUT_NIS=true
>> WITHOUT_PPP=true
>> WITHOUT_TALK=true
>> WITHOUT_TESTS=true
>> WITHOUT_WIRELESS=true
>> #WITHOUT_LIB32=true
>> WITHOUT_LPR=true
>> 
>> make.conf
>> KERNCONF=BHYVE
>> MODULES_OVERRIDE=opensolaris dtrace zfs vmm nmdm if_bridge bridgestp if_vxlan pflog libmchain libiconv smbfs linux linux64 linux_common linuxkpi linprocfs linsysfs ext2fs
>> DEFAULT_VERSIONS+=perl5=5.30 mysql=5.7 python=3.8 python3=3.8
>> OPTIONS_UNSET=DOCS NLS MANPAGES
>> 
>> BHYVE
>> cpu          HAMMER
>> ident                BHYVE
>> 
>> makeoptions  DEBUG=-g                # Build kernel with gdb(1) debug symbols
>> makeoptions  WITH_CTF=1              # Run ctfconvert(1) for DTrace support
>> 
>> options              CAMDEBUG
>> 
>> options      SCHED_ULE               # ULE scheduler
>> options      PREEMPTION              # Enable kernel thread preemption
>> options      INET                    # InterNETworking
>> options      INET6                   # IPv6 communications protocols
>> options              IPSEC
>> options      TCP_OFFLOAD             # TCP offload
>> options              TCP_RFC7413             # TCP FASTOPEN
>> options      SCTP                    # Stream Control Transmission Protocol
>> options      FFS                     # Berkeley Fast Filesystem
>> options      SOFTUPDATES             # Enable FFS soft updates support
>> options      UFS_ACL                 # Support for access control lists
>> options      UFS_DIRHASH             # Improve performance on big directories
>> options      UFS_GJOURNAL            # Enable gjournal-based UFS journaling
>> options      QUOTA                   # Enable disk quotas for UFS
>> options              SUIDDIR
>> options      NFSCL                   # Network Filesystem Client
>> options      NFSD                    # Network Filesystem Server
>> options      NFSLOCKD                # Network Lock Manager
>> options      MSDOSFS                 # MSDOS Filesystem
>> options      CD9660                  # ISO 9660 Filesystem
>> options      FUSEFS
>> options              NULLFS                  # NULL filesystem
>> options              UNIONFS
>> options              FDESCFS                 # File descriptor filesystem
>> options      PROCFS                  # Process filesystem (requires PSEUDOFS)
>> options      PSEUDOFS                # Pseudo-filesystem framework
>> options      GEOM_PART_GPT           # GUID Partition Tables.
>> options      GEOM_RAID               # Soft RAID functionality.
>> options      GEOM_LABEL              # Provides labelization
>> options      GEOM_ELI                # Disk encryption.
>> options      COMPAT_FREEBSD32        # Compatible with i386 binaries
>> options      COMPAT_FREEBSD4         # Compatible with FreeBSD4
>> options      COMPAT_FREEBSD5         # Compatible with FreeBSD5
>> options      COMPAT_FREEBSD6         # Compatible with FreeBSD6
>> options      COMPAT_FREEBSD7         # Compatible with FreeBSD7
>> options      COMPAT_FREEBSD9         # Compatible with FreeBSD9
>> options      COMPAT_FREEBSD10        # Compatible with FreeBSD10
>> options      COMPAT_FREEBSD11        # Compatible with FreeBSD11
>> options      SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
>> options      KTRACE                  # ktrace(1) support
>> options      STACK                   # stack(9) support
>> options      SYSVSHM                 # SYSV-style shared memory
>> options      SYSVMSG                 # SYSV-style message queues
>> options      SYSVSEM                 # SYSV-style semaphores
>> options      _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
>> options      PRINTF_BUFR_SIZE=128    # Prevent printf output being interspersed.
>> options      KBD_INSTALL_CDEV        # install a CDEV entry in /dev
>> options      HWPMC_HOOKS             # Necessary kernel hooks for hwpmc(4)
>> options      AUDIT                   # Security event auditing
>> options      CAPABILITY_MODE         # Capsicum capability mode
>> options      CAPABILITIES            # Capsicum capabilities
>> options      MAC                     # TrustedBSD MAC Framework
>> options      MAC_PORTACL
>> options      MAC_NTPD
>> options      KDTRACE_FRAME           # Ensure frames are compiled in
>> options      KDTRACE_HOOKS           # Kernel DTrace hooks
>> options      DDB_CTF                 # Kernel ELF linker loads CTF data
>> options      INCLUDE_CONFIG_FILE     # Include this file in kernel
>> 
>> # Debugging support.  Always need this:
>> options      KDB                     # Enable kernel debugger support.
>> options      KDB_TRACE               # Print a stack trace for a panic.
>> options      KDB_UNATTENDED
>> 
>> # Make an SMP-capable kernel by default
>> options      SMP                     # Symmetric MultiProcessor Kernel
>> options      EARLY_AP_STARTUP
>> 
>> # CPU frequency control
>> device               cpufreq
>> device               cpuctl
>> device               coretemp
>> 
>> # Bus support.
>> device               acpi
>> options      ACPI_DMAR
>> device               pci
>> options              PCI_IOV                 # PCI SR-IOV support
>> 
>> device               iicbus
>> device               iicbb
>> 
>> device               iic
>> device               ic
>> device               iicsmb
>> 
>> device               ichsmb
>> device               smbus
>> device               smb
>> 
>> #device              jedec_dimm
>> 
>> # ATA controllers
>> device               ahci                    # AHCI-compatible SATA controllers
>> device               mvs                     # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
>> 
>> # SCSI Controllers
>> device               mps                     # LSI-Logic MPT-Fusion 2
>> 
>> # ATA/SCSI peripherals
>> device               scbus                   # SCSI bus (required for ATA/SCSI)
>> device               da                      # Direct Access (disks)
>> device               cd                      # CD
>> device               pass                    # Passthrough device (direct ATA/SCSI access)
>> device               ses                     # Enclosure Services (SES and SAF-TE)
>> device               sg
>> 
>> device               cfiscsi
>> device               ctl                     # CAM Target Layer
>> device               iscsi
>> 
>> # atkbdc0 controls both the keyboard and the PS/2 mouse
>> device               atkbdc                  # AT keyboard controller
>> device               atkbd                   # AT keyboard
>> device               psm                     # PS/2 mouse
>> 
>> device               kbdmux                  # keyboard multiplexer
>> 
>> # vt is the new video console driver
>> device               vt
>> device               vt_vga
>> device               vt_efifb
>> 
>> # Serial (COM) ports
>> device               uart                    # Generic UART driver
>> 
>> # PCI/PCI-X/PCIe Ethernet NICs that use iflib infrastructure
>> device               iflib
>> device               em                      # Intel PRO/1000 Gigabit Ethernet Family
>> device               ix                      # Intel PRO/10GbE PCIE PF Ethernet
>> 
>> # Network stack virtualization.
>> options              VIMAGE
>> 
>> # Pseudo devices.
>> device               crypto
>> device               cryptodev
>> device               loop                    # Network loopback
>> device               random                  # Entropy device
>> device               padlock_rng             # VIA Padlock RNG
>> device               rdrand_rng              # Intel Bull Mountain RNG
>> device               ipmi
>> device               smbios
>> device               vpd
>> device               aesni                   # AES-NI OpenCrypto module
>> device               ether                   # Ethernet support
>> device               lagg
>> device               vlan                    # 802.1Q VLAN support
>> device               tuntap                  # Packet tunnel.
>> device               md                      # Memory "disks"
>> device               gif                     # IPv6 and IPv4 tunneling
>> device               firmware                # firmware assist module
>> 
>> device               pf
>> #device              pflog
>> #device              pfsync
>> 
>> # The `bpf' device enables the Berkeley Packet Filter.
>> # Be aware of the administrative consequences of enabling this!
>> # Note that 'bpf' is required for DHCP.
>> device               bpf                     # Berkeley packet filter
>> 
>> # The `epair' device implements a virtual back-to-back connected Ethernet
>> # like interface pair.
>> device               epair
>> 
>> # USB support
>> options      USB_DEBUG               # enable debug msgs
>> device               uhci                    # UHCI PCI->USB interface
>> device               ohci                    # OHCI PCI->USB interface
>> device               ehci                    # EHCI PCI->USB interface (USB 2.0)
>> device               xhci                    # XHCI PCI->USB interface (USB 3.0)
>> device               usb                     # USB Bus (required)
>> device               uhid
>> device               ukbd                    # Keyboard
>> device               umass                   # Disks/Mass storage - Requires scbus and da
>> device               ums
>> 
>> device               filemon
>> 
>> device               if_bridge
>> 
>>> On 20 Nov 2020, at 12:53, Kristof Provost <k...@freebsd.org> wrote:
>>> 
>>> Can you share your kernel config file (and src.conf / make.conf if they 
>>> exist)?
>>> 
>>> This second panic is in the IPSec code. My current thinking is that your 
>>> kernel config is triggering a bug that’s manifesting in multiple places, 
>>> but not actually caused by those places.
>>> 
>>> I’d like to be able to reproduce it so we can debug it.
>>> 
>>> Best regards,
>>> Kristof
>>> 
>>> On 20 Nov 2020, at 12:02, Peter Blok wrote:
>>>> Hi Kristof,
>>>> 
>>>> This is 12-stable. With the previous bridge epochification that was backed 
>>>> out my config had a panic too.
>>>> 
>>>> I don’t have any local modifications. I did a clean rebuild after removing 
>>>> /usr/obj/usr
>>>> 
>>>> My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko and 
>>>> nmdm.ko as modules. Everything else is statically linked. I have removed 
>>>> all drivers not needed for the hardware at hand.
>>>> 
>>>> My bridge is between two vlans from the same trunk and the jail epair 
>>>> devices as well as the bhyve tap devices.
>>>> 
>>>> The panic happens when the jails are starting.
>>>> 
>>>> I can try to narrow it down over the weekend and make the crash dump 
>>>> available for analysis.
>>>> 
>>>> Previously I had the following crash with 363492
>>>> 
>>>> kernel trap 12 with interrupts disabled
>>>> 
>>>> 
>>>> Fatal trap 12: page fault while in kernel mode
>>>> cpuid = 2; apic id = 02
>>>> fault virtual address      = 0xffffffff00000410
>>>> fault code         = supervisor read data, page not present
>>>> instruction pointer        = 0x20:0xffffffff80692326
>>>> stack pointer              = 0x28:0xfffffe00c06097b0
>>>> frame pointer              = 0x28:0xfffffe00c06097f0
>>>> code segment               = base 0x0, limit 0xfffff, type 0x1b
>>>>                    = DPL 0, pres 1, long 1, def32 0, gran 1
>>>> processor eflags   = resume, IOPL = 0
>>>> current process            = 2030 (ifconfig)
>>>> trap number                = 12
>>>> panic: page fault
>>>> cpuid = 2
>>>> time = 1595683412
>>>> KDB: stack backtrace:
>>>> #0 0xffffffff80698165 at kdb_backtrace+0x65
>>>> #1 0xffffffff8064d67b at vpanic+0x17b
>>>> #2 0xffffffff8064d4f3 at panic+0x43
>>>> #3 0xffffffff809cc311 at trap_fatal+0x391
>>>> #4 0xffffffff809cc36f at trap_pfault+0x4f
>>>> #5 0xffffffff809cb9b6 at trap+0x286
>>>> #6 0xffffffff809a5b28 at calltrap+0x8
>>>> #7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
>>>> #8 0xffffffff8069213a at epoch_wait_preempt+0xaa
>>>> #9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
>>>> #10 0xffffffff8075274f at ifioctl+0x47f
>>>> #11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
>>>> #12 0xffffffff806b5b4a at sys_ioctl+0xfa
>>>> #13 0xffffffff809ccec7 at amd64_syscall+0x387
>>>> #14 0xffffffff809a6450 at fast_syscall_common+0x101
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 20 Nov 2020, at 11:30, Kristof Provost <k...@freebsd.org> wrote:
>>>>> 
>>>>> On 20 Nov 2020, at 11:18, peter.b...@bsd4all.org wrote:
>>>>>> I’m afraid the last Epoch fix for bridge is not solving the problem (or 
>>>>>> perhaps creates a new one).
>>>>>> 
>>>>> We’re talking about the stable/12 branch, right?
>>>>> 
>>>>>> This seems to happen when the jail epair is added to the bridge.
>>>>>> 
>>>>> There must be something more to it than that. I’ve run the bridge tests 
>>>>> on stable/12 without issue, and this is a problem we didn’t see when the 
>>>>> bridge epochification initially went into stable/12.
>>>>> 
>>>>> Do you have a custom kernel config? Other patches? What exact commands do 
>>>>> you run to trigger the panic?
>>>>> 
>>>>>> kernel trap 12 with interrupts disabled
>>>>>> 
>>>>>> 
>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>> cpuid = 6; apic id = 06
>>>>>> fault virtual address    = 0xc10
>>>>>> fault code               = supervisor read data, page not present
>>>>>> instruction pointer      = 0x20:0xffffffff80695e76
>>>>>> stack pointer            = 0x28:0xfffffe00bf14e6e0
>>>>>> frame pointer            = 0x28:0xfffffe00bf14e720
>>>>>> code segment             = base 0x0, limit 0xfffff, type 0x1b
>>>>>>                  = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>> processor eflags = resume, IOPL = 0
>>>>>> current process          = 1686 (jail)
>>>>>> trap number              = 12
>>>>>> panic: page fault
>>>>>> cpuid = 6
>>>>>> time = 1605811310
>>>>>> KDB: stack backtrace:
>>>>>> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
>>>>>> #1 0xffffffff80650a4b at vpanic+0x17b
>>>>>> #2 0xffffffff806508c3 at panic+0x43
>>>>>> #3 0xffffffff809d0351 at trap_fatal+0x391
>>>>>> #4 0xffffffff809d03af at trap_pfault+0x4f
>>>>>> #5 0xffffffff809cf9f6 at trap+0x286
>>>>>> #6 0xffffffff809a98c8 at calltrap+0x8
>>>>>> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
>>>>>> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
>>>>>> #9 0xffffffff80757d40 at vnet_if_init+0x120
>>>>>> #10 0xffffffff8078c994 at vnet_alloc+0x114
>>>>>> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
>>>>>> #12 0xffffffff80620190 at sys_jail_set+0x40
>>>>>> #13 0xffffffff809d0f07 at amd64_syscall+0x387
>>>>>> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8
>>>>> 
>>>>> This panic is rather odd. This isn’t even the bridge code. This is during 
>>>>> initial creation of the vnet. I don’t really see how this could even 
>>>>> trigger panics.
>>>>> That panic looks as if something corrupted the net_epoch_preempt, by 
>>>>> overwriting the epoch->e_epoch. The bridge patches only access this 
>>>>> variable through the well-established functions and macros. I see no 
>>>>> obvious way that they could corrupt it.
>>>>> 
>>>>> Best regards,
>>>>> Kristof
>>> 
>>> 
>>> _______________________________________________
>>> freebsd-stable@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"