Re: panic: kernel diagnostic assertion in nd6.c on 6.6-stable amd64

2020-01-04 Thread Ben Lee

On 1/4/2020 5:46 PM, Ben Lee wrote:

Hi,

I have an amd64 system that I am using as a router/firewall for my home 
network running OpenBSD 6.6-stable with the latest syspatches. I have 
been running it without problems in an IPv4-only configuration. 
Recently, I decided I wanted to experiment with running a dual stack 
IPv4 and IPv6 network. I think I have successfully added IPv6 to my 
configuration which seemed to be functioning correctly to the best of my 
knowledge, but I have started to encounter kernel panics that crash to 
the ddb debugger after about 12 hours of uptime. I believe the panics 
might be specifically occurring when the DHCPv6 lease expires or renews 
on my external interface and the IPv6 routing table is updated. The 
panics went away when I reverted to my IPv4-only configuration, so I 
believe the panics are specific to IPv6. I have been troubleshooting and 
researching for the past week or so, and I thought I would check to 
see if some more experienced OpenBSD users might have some advice on how 
to proceed with troubleshooting.


At the end of this email, I have included the relevant section from 
/var/run/dmesg.boot related to one instance of the panic. It includes 
the boot messages, panic message, trace, and the output of ps I ran in 
ddb after the panic.


I did find a similar bug report in the mail archive for openbsd-bugs 
that looks like it was unfortunately not resolved:


https://marc.info/?l=openbsd-bugs&m=152587044611044&w=2

Some more info about my configuration that I think may be relevant:

My ISP uses DHCPv6 prefix delegation to distribute IPv6 addresses so I 
used dhcpcd from ports installed using pkg_add. I used rad for router 
advertisements on my internal interfaces and I am using SLAAC to 
auto-configure IPv6 addresses.


My system has 6 Intel NICs, em[0-5], and I also have 4 VLANs vlan[0-3] 
on em5 that I am using for my wireless AP. em0 is the external interface 
connected to my cable modem. em[1-5] are my internal interfaces. em[1-3] 
are each connected to separate computers, em4 is unconnected, and em5 
and vlan[0-3] are connected to my wireless AP.


The only other observation I had that might be related to the panic is 
regarding the "ndp info overwritten" and "cannot forward src" lines in 
my dmesg. The "ndp info overwritten" lines correspond to interfaces that 
were connected to devices/hosts that were powered on at the time of the 
panic, while I think that the "cannot forward src" messages correspond to 
em2 and em3, which were connected to computers that were powered off at 
the time of the panic but had previously been powered on to test whether 
they were receiving IPv6 addresses. I am wondering if the computers being 
powered off at the time of the DHCPv6 lease update is triggering the 
panic, and I think I might test this by only enabling IPv6 on em5 and 
vlan[0-3], which are connected to my wireless AP, which is always on.
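
One way to check that correlation is to tally the two messages in a saved 
copy of the dmesg. The following is only a minimal sketch in Python. It 
assumes nothing about the log format beyond each message appearing as a 
literal substring of a line; the function name tally_messages is 
hypothetical and not part of any OpenBSD tool.

```python
# The two message fragments quoted in the paragraph above.
MESSAGES = ("ndp info overwritten", "cannot forward src")


def tally_messages(lines):
    """Return a dict mapping each message fragment to the number of
    lines that contain it. Lines matching neither are ignored."""
    counts = {message: 0 for message in MESSAGES}
    for line in lines:
        for message in MESSAGES:
            if message in line:
                counts[message] += 1
    return counts


# Example use against a saved dmesg (path taken from the email above):
#   with open("/var/run/dmesg.boot", errors="replace") as fh:
#       print(tally_messages(fh))
```

Comparing the counts from a boot that panicked against one that did not 
would show whether either message is actually tied to the panic.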


I would be happy to provide any other relevant info that I may have 
omitted. Thanks for reading.



Ben

##

OpenBSD 6.6 (GENERIC.MP) #3: Thu Nov 21 03:20:01 MST 2019
r...@syspatch-66-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4192743424 (3998MB)
avail mem = 4052963328 (3865MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x8d317000 (85 entries)
bios0: vendor American Megatrends Inc. version "5.12" date 07/08/2019
bios0: Protectli FW6
acpi0 at bios0: ACPI 6.1
acpi0: sleep states S0 S5
acpi0: tables DSDT FACP APIC FPDT MCFG SSDT FIDT SSDT HPET SSDT SSDT UEFI SSDT LPIT WSMT SSDT SSDT SSDT SSDT DBGP DBG2 BGRT DMAR ASF!
acpi0: wakeup devices PS2K(S0) PS2M(S0) RP09(S0) PXSX(S0) RP10(S0) PXSX(S0) RP11(S0) PXSX(S0) RP12(S0) PXSX(S0) RP13(S0) PXSX(S0) RP01(S0) PXSX(S0) RP02(S0) PXSX(S0) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Celeron(R) CPU 3865U @ 1.80GHz, 1696.58 MHz, 06-8e-09
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,SMEP,ERMS,INVPCID,MPX,RDSEED,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Celeron(R) CPU 3865U @ 1.80GHz, 1696.06 MHz, 06-8e-09
cpu1: 


