Re: snmpd in 7.2 dies with too many parse errors

2022-10-31 Thread Ryan Freeman
On Mon, Oct 31, 2022 at 11:05:07AM -0700, Ryan Freeman wrote:
> On Sun, Oct 30, 2022 at 09:21:00AM +0100, Martijn van Duren wrote:
> > On Fri, 2022-10-28 at 13:10 -0700, Ryan Freeman wrote:
> > > On Fri, Oct 28, 2022 at 01:22:57PM +0200, Martijn van Duren wrote:
> > > > I wondered that as well, but I tried to simulate the not found and
> > > > error code-paths, but I couldn't trigger it. So I'm not ruling it
> > > > out, I just can't reproduce it.
> > > > 
> > > > Another thing that's weird is that it looks like the index has been
> > > > stripped from sensorStatus, which might be an indication that
> > > > something weird is going on inside libagentx. But like I said: without a
> > > > reproducer I haven't been able to pin it down.
> > > > 
> > > > So the additional verbose information should be useful.
> > > > Come to think of it: The `sysctl hw.sensors` output might be
> > > > helpful as well, both on a succeeding run, as well as at the time
> > > > of the crash (maybe something like:
> > > > `while true; do date; sysctl hw.sensors; sleep 1; done > \
> > > > /path/to/output`)
> > > 
> > > As the offending machines are VMs, hw.sensors actually returns
> > > nothing.  I will send you the output for all of 'hw' key, and
> > > log output for snmpd -vv when the issue arrives.
> > > 
> > > It does seem to coincide with librenms's discovery process, which
> > > comes from librenms upstream as this cron job (on a linux machine):
> > > 33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
> > > 
> > > So, it is the one job running every ~6 hours which would match up with
> > > when snmpd is dying on these OpenBSD 7.2 VMs.  I still have 30+ VMs
> > > on <7.2 that are OK.  Any physical machines I've upgraded to 7.2 are
> > > only at home, not $WORKPLACE where librenms lives.  Not trying to be
> > > noisy, just hopefully narrow down the actual cause :)  Thanks for
> > > the hints!
> > > 
> > > Regards,
> > > -Ryan
> > > 
> > > 
> > I managed to reproduce it with an empty sensors table and doing a
> > getnext request on sensorNumber.0.
> > 
> > The problem was that the internal OID was incremented from
> > sensorNumber.0 to sensorStatus, which then triggers an endOfMibView.
> > When returning a response this incremented value is then sent back to
> > snmpd, while in the case of an endOfMibView it must be the value
> > requested by snmpd (at least for the getnext case, which is what is
> > being used here).
> > 
> > Diff below resets this key on endOfMibView and fixes the problem for
> > me. Can you confirm this?
> > 
> > Assuming this also fixes things for Ryan: OK?
> > 
> > martijn@
> > 
> > Index: agentx.c
> > ===
> > RCS file: /cvs/src/lib/libagentx/agentx.c,v
> > retrieving revision 1.19
> > diff -u -p -r1.19 agentx.c
> > --- agentx.c	14 Oct 2022 15:26:58 -  1.19
> > +++ agentx.c	30 Oct 2022 08:19:29 -
> > @@ -3426,6 +3426,8 @@ agentx_varbind_endofmibview(struct agent
> >  		return;
> >  	}
> >  
> > +	bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> > +	    sizeof(axv->axv_start));
> >  	axv->axv_vb.avb_type = AX_DATA_TYPE_ENDOFMIBVIEW;
> >  
> >  	if (axv->axv_axo != NULL)
> > 
> 
> Thanks Martijn,
> 
> I applied a slightly offset patch** to a 7.2-stable tree, rebuilt libagentx
> and installed the new libagentx.so.1.0 on an affected host.  snmpd has been
> running for just about 12 hours now; I think this might have solved it.  I
> am going to copy this adjusted libagentx to another host in the meantime,
> and continue watching.
> 
> -Ryan
> 
> **Patch to 7.2-stable:
> 
> Index: agentx.c
> ===
> RCS file: /cvs/src/lib/libagentx/agentx.c,v
> retrieving revision 1.17
> diff -u -p -r1.17 agentx.c
> --- agentx.c  13 Sep 2022 10:20:22 -  1.17
> +++ agentx.c  31 Oct 2022 06:29:45 -
> @@ -3342,6 +3342,8 @@ agentx_varbind_endofmibview(struct agent
>   return;
>   }
>  
> + bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> + sizeof(axv->axv_start));
>   axv->axv_vb.avb_type = AX_DATA_TYPE_ENDOFMIBVIEW;
>  
>   if (axv->axv_axo != NULL)
> 

I can confirm the snmpd process is no longer disappearing with this
patch.  Almost 24 hours on one VM and 16 hours on another. Thanks!

-Ryan



After upgrade 7.1 -> 7.2 on octeon, anything from ports/packages segfaults

2022-10-31 Thread Sebastian Oswald
>Synopsis:  After upgrading from 7.1 -> 7.2 on octeon, anything from ports/packages segfaults
>Environment:
System  : OpenBSD 7.2
Details : OpenBSD 7.2-current (GENERIC.MP) #1094: Fri Oct 28 18:46:47 MDT 2022
dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/GENERIC.MP

Architecture: OpenBSD.octeon
Machine : octeon
>Description:
After upgrading via 'sysupgrade -n' from 7.1 to latest 7.2
snapshot and afterwards running 'sysmerge' and 'pkg_add -u',
any binary installed via pkg segfaults. (e.g. vnstatd,
zabbix_agentd, vim, git, curl)

Packages have been successfully updated by 'pkg_add -u' and show
current versions. E.g. vim-no_x11 was updated to 9.0.0192

When trying to run any binary from /usr/local/[s]bin/ I only get:
# zabbix_agentd
Segmentation fault (core dumped)
# vim
Segmentation fault (core dumped)


example output from gdb for a core file from vim:

# gdb /usr/local/bin/vim vim.core 
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are welcome to change it and/or distribute copies of it under
certain conditions. Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "mips64-unknown-openbsd7.2"...(no debugging symbols found)

Core was generated by `vim'.
Program terminated with signal 11, Segmentation fault.
#0  0x000500489fec in ?? ()
(gdb) run
Starting program: /usr/local/bin/vim 

Program received signal SIGSEGV, Segmentation fault.
0x00358e4f9fec in ?? ()


Output looks identical (except for the address) for any
program/corefile ('0x[...]fec in ?? ()'). I've never used gdb, so if I
can provide any other useful information with it, please let me know.

I remember reading something about a new libc version (but can't find
it in the changelogs?), so I suspect there's something foul with the
version on that system vs the one packages are built against?
These versions can be found on that system:
# ls /usr/lib | grep ^libc.so
libc.so.95.1
libc.so.96.0
libc.so.96.1
libc.so.96.3
libc.so.96.4


dmesg output:
[ using 763256 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2022 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 7.2-current (GENERIC.MP) #1094: Fri Oct 28 18:46:47 MDT 2022
dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/GENERIC.MP
real mem = 1073741824 (1024MB)
avail mem = 1035550720 (987MB)
random: good seed from bootblocks
mainbus0 at root: board 20300 rev 0.15, model cavium,ubnt_e300
cpu0 at mainbus0: CN70xx/CN71xx CPU rev 0.2 1000 MHz, CN70xx/CN71xx FPU rev 0.0
cpu0: cache L1-I 78KB 39 way D 32KB 32 way, L2 1024KB 8 way
cpu1 at mainbus0: CN70xx/CN71xx CPU rev 0.2 1000 MHz, CN70xx/CN71xx FPU rev 0.0
cpu1: cache L1-I 78KB 39 way D 32KB 32 way, L2 1024KB 8 way
cpu2 at mainbus0: CN70xx/CN71xx CPU rev 0.2 1000 MHz, CN70xx/CN71xx FPU rev 0.0
cpu2: cache L1-I 78KB 39 way D 32KB 32 way, L2 1024KB 8 way
cpu3 at mainbus0: CN70xx/CN71xx CPU rev 0.2 1000 MHz, CN70xx/CN71xx FPU rev 0.0
cpu3: cache L1-I 78KB 39 way D 32KB 32 way, L2 1024KB 8 way
clock0 at mainbus0: int 5
octcrypto0 at mainbus0
iobus0 at mainbus0
simplebus0 at iobus0: "soc"
"bootbus" at simplebus0 not configured
octciu0 at simplebus0
octcib0 at simplebus0: max-bits 23
octcib1 at simplebus0: max-bits 12
octcib2 at simplebus0: max-bits 6
octcib3 at simplebus0: max-bits 15
octcib4 at simplebus0: max-bits 4
octcib5 at simplebus0: max-bits 11
octcib6 at simplebus0: max-bits 11
octgpio0 at simplebus0: 20 pins, xbit 16
octsmi0 at simplebus0
octsmi1 at simplebus0
octpip0 at simplebus0
octgmx0 at octpip0 interface 0
cnmac0 at octgmx0: port 0 SGMII, address 74:83:c2:10:cd:57
ukphy0 at cnmac0 phy 4: Generic IEEE 802.3u media interface, rev. 2: OUI 0x0001c1, model 0x000c
cnmac1 at octgmx0: port 1 SGMII, address 74:83:c2:10:cd:58
ukphy1 at cnmac1 phy 5: Generic IEEE 802.3u media interface, rev. 2: OUI 0x0001c1, model 0x000c
cnmac2 at octgmx0: port 2 SGMII, address 74:83:c2:10:cd:59
ukphy2 at cnmac2 phy 6: Generic IEEE 802.3u media interface, rev. 2: OUI 0x0001c1, model 0x000c
cnmac3 at octgmx0: port 3 SGMII, address 74:83:c2:10:cd:5a
ukphy3 at cnmac3 phy 7: Generic IEEE 802.3u media interface, rev. 2: OUI 0x0001c1, model 0x000c
octsctl0 at simplebus0: disabled
octxctl0 at simplebus0: DWC3 rev 0x250a
xhci0 at octxctl0, xHCI 1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
octxctl1 at simplebus0: DWC3 rev 0x250a
xhci1 at octxctl1, xHCI 1.0
usb1 at xhci1: USB revision 3.0
uhub1 at usb1 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
"i2c" at simplebus0 not configured
"i2c" at simplebus0 not configured
com0 at simplebus0: ns16550a, 64 

Re: Uninterruptible D State after ifconfig wg0 destroy

2022-10-31 Thread Sonic
Same thing happened to me.
Found that "reboot -q" worked to get out of it.

On Mon, Oct 31, 2022 at 4:45 PM  wrote:
>
> >Synopsis: MacBook enters uninterruptible D state after command 'ifconfig wg0 destroy'
> >Category: amd64
> >Environment:
> System  : OpenBSD 7.2
> Details : OpenBSD 7.2-current (GENERIC.MP) #509: Wed Oct 26 10:22:18 CDT 2022
> leomor...@leomacbsd.my.domain:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Architecture: OpenBSD.amd64
> Machine : amd64
> >Description:
> I use a wg0 interface to connect to a VPN server. When playing with the
> ifconfig commands I found that when I enter the command 'ifconfig wg0
> destroy' the computer enters an uninterruptible D state. It's then
> impossible to shut down, reboot, or restart the interface.
> >How-To-Repeat:
> 1. A working wg0 interface.
> 2. Command 'ifconfig wg0 down'
> 3. Command 'ifconfig wg0 destroy'
> >Fix:
>
>
> dmesg:
> OpenBSD 7.2-current (GENERIC.MP) #509: Wed Oct 26 10:22:18 CDT 2022
> leomor...@leomacbsd.my.domain:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 17032024064 (16243MB)
> avail mem = 16498384896 (15734MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x78f8a000 (34 entries)
> bios0: vendor Apple Inc. version "427.140.8.0.0" date 06/13/2021
> bios0: Apple Inc. MacBookPro11,5
> efi0 at bios0: UEFI 1.1
> acpi0 at bios0: ACPI 5.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP HPET APIC SBST ECDT SSDT SSDT SSDT SSDT SSDT SSDT 
> SSDT SSDT SSDT SSDT DMAR MCFG VFCT
> acpi0: wakeup devices PEG0(S3) GFX0(S3) PEG1(S3) PEG2(S3) EC__(S3) GMUX(S3) 
> HDEF(S3) RP03(S4) ARPT(S4) RP04(S4) XHC1(S3) ADP1(S3) LID0(S3)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpihpet0 at acpi0: 14318179 Hz
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, 3491.86 MHz, 06-46-01
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 
> 64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache, 128MB 64b/line 16-way 
> L4 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, 3491.94 MHz, 06-46-01
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 
> 64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache, 128MB 64b/line 16-way 
> L4 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, 3491.94 MHz, 06-46-01
> cpu2: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu2: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 
> 64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache, 128MB 64b/line 16-way 
> L4 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, 3491.94 MHz, 06-46-01
> cpu3: 
> 

Uninterruptible D State after ifconfig wg0 destroy

2022-10-31 Thread leonardo . moreno . urbieta
>Synopsis: MacBook enters uninterruptible D state after command 'ifconfig wg0 destroy'
>Category: amd64
>Environment:
System  : OpenBSD 7.2
Details : OpenBSD 7.2-current (GENERIC.MP) #509: Wed Oct 26 10:22:18 CDT 2022
leomor...@leomacbsd.my.domain:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64
>Description:
I use a wg0 interface to connect to a VPN server. When playing with the
ifconfig commands I found that when I enter the command 'ifconfig wg0
destroy' the computer enters an uninterruptible D state. It's then
impossible to shut down, reboot, or restart the interface.
>How-To-Repeat:
1. A working wg0 interface.
2. Command 'ifconfig wg0 down'
3. Command 'ifconfig wg0 destroy'
>Fix:


dmesg:
OpenBSD 7.2-current (GENERIC.MP) #509: Wed Oct 26 10:22:18 CDT 2022
leomor...@leomacbsd.my.domain:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17032024064 (16243MB)
avail mem = 16498384896 (15734MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x78f8a000 (34 entries)
bios0: vendor Apple Inc. version "427.140.8.0.0" date 06/13/2021
bios0: Apple Inc. MacBookPro11,5
efi0 at bios0: UEFI 1.1
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP HPET APIC SBST ECDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT 
SSDT SSDT SSDT DMAR MCFG VFCT
acpi0: wakeup devices PEG0(S3) GFX0(S3) PEG1(S3) PEG2(S3) EC__(S3) GMUX(S3) 
HDEF(S3) RP03(S4) ARPT(S4) RP04(S4) XHC1(S3) ADP1(S3) LID0(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, 3491.86 MHz, 06-46-01
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
8-way L2 cache, 6MB 64b/line 12-way L3 cache, 128MB 64b/line 16-way L4 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, 3491.94 MHz, 06-46-01
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
8-way L2 cache, 6MB 64b/line 12-way L3 cache, 128MB 64b/line 16-way L4 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, 3491.94 MHz, 06-46-01
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
8-way L2 cache, 6MB 64b/line 12-way L3 cache, 128MB 64b/line 16-way L4 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, 3491.94 MHz, 06-46-01
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
8-way L2 cache, 6MB 64b/line 12-way L3 cache, 128MB 64b/line 16-way L4 cache
cpu3: smt 0, core 3, package 0
cpu4 at mainbus0: apid 1 

Re: kernel protection fault during boot on vmm(4) VM running on AMD EPYC cpu with tsc_identify in trace

2022-10-31 Thread Mike Larkin
On Mon, Oct 31, 2022 at 07:39:01AM -0500, Scott Cheloha wrote:
> On Mon, Oct 31, 2022 at 12:43:50PM +0100, Paul de Weerd wrote:
> > Hi folks,
> >
> > I just upgraded a VM on my AMD EPYC host.  I get the following
> > protection fault during boot:
> >
> > ddb> bo re
> > rebooting...
> > Using drive 0, partition 3.
> > Loading..
> > probing: pc0 com0 mem[638K 3838M 256M a20=on]
> > disk: hd0+
> > >> OpenBSD/amd64 BOOT 3.55
> > \
> > com0: 115200 baud
> > switching console to com0
> > >> OpenBSD/amd64 BOOT 3.55
> > boot>
> > NOTE: random seed is being reused.
> > booting hd0a:/bsd: 15615256+3781640+298464+0+1171456 
> > [1143945+128+1225080+928182]=0x170d440
> > entry point at 0x81001000
> > [ using 3298368 bytes of bsd ELF symbol table ]
> > Copyright (c) 1982, 1986, 1989, 1991, 1993
> > The Regents of the University of California.  All rights reserved.
> > Copyright (c) 1995-2022 OpenBSD. All rights reserved.  
> > https://www.OpenBSD.org
> >
> > OpenBSD 7.2-current (GENERIC) #784: Fri Oct 28 21:50:59 MDT 2022
> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> > real mem = 4278177792 (4079MB)
> > avail mem = 4131221504 (3939MB)
> > random: good seed from bootblocks
> > mpath0 at root
> > scsibus0 at mpath0: 256 targets
> > mainbus0 at root
> > bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf36b0 (12 entries)
> > bios0: vendor SeaBIOS version "1.14.0p0-OpenBSD-vmm" date 01/01/2011
> > bios0: OpenBSD VMM
> > acpi at bios0 not configured
> > cpu0 at mainbus0: (uniprocessor)
> > kernel: protection fault trap, code=0
> > Stopped at  tsc_identify+0xcd:  rdmsr
> > ddb> ps
> >    PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
> > *    0       0     -1      0  7     0x10200                swapper
> > ddb> trace
> > tsc_identify(822c7ff0,822c7ff0,68a34bffd15c67e6,822c7ff0,10,82714c10)
> >  at tsc_identify+0xcd
> > identifycpu(822c7ff0,822c7ff0,bca189629b3de454,8002c400,822c7ff0,8002c424)
> >  at identifycpu+0x2e4
> > cpu_attach(8002c300,8002c400,82714d98,8002c300,980a70616799eafd,8002c300)
> >  at cpu_attach+0x16f
> > config_attach(8002c300,82289250,82714d98,8138d1b0,6c550c45866795b6,82714db8)
> >  at config_attach+0x1f4
> > mainbus_attach(0,8002c300,0,0,819b798732a62156,0) at 
> > mainbus_attach+0x151
> > config_attach(0,822891a8,0,0,6c550c4586f4e2c4,0) at 
> > config_attach+0x1f4
> > cpu_configure(f588b7541b8b8d14,0,0,8002e000,81abb8d3,82714f00)
> >  at cpu_configure+0x33
> > main(0,0,0,0,0,1) at main+0x379
> > end trace frame: 0x0, count: -8
> > ddb> show reg
> > rdi   0x822a3035    cpu_vendor+0xd
> > rsi   0x81f04410    cmd0646_9_tim_udma+0x170f5
> > rbp   0x82714c30    end+0x314c30
> > rbx   0x20202020
> > rdx   0
> > rcx   0xc0010015
> > rax   0
> > r8    0
> > r9    0x40
> > r10   0x2bc299b68ee7cba5
> > r11   0x75a3a544d54dd7b9
> > r12   0x1
> > r13   0x8002c424
> > r14   0x822c7ff0    cpu_info_full_primary+0x1ff0
> > r15   0x82714c40    end+0x314c40
> > rip   0x819e1f4d    tsc_identify+0xcd
> > cs    0x8
> > rflags   0x10202    __ALIGN_SIZE+0xf202
> > rsp   0x82714c10    end+0x314c10
> > ss    0x10
> > tsc_identify+0xcd:  rdmsr
> > ddb>
>
> You get a #GP in your VM when trying to rdmsr(MSR_HWCR).  My guess is
> we need to expand the MSR read bitmap for SVM.
>
> This patch compiles, but I can't test it.  Does it fix the panic?
>
> CC dv@ mlarkin@
>
> Index: vmm.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
> retrieving revision 1.323
> diff -u -p -r1.323 vmm.c
> --- vmm.c 7 Sep 2022 18:44:09 -   1.323
> +++ vmm.c 31 Oct 2022 12:38:30 -
> @@ -2705,6 +2705,10 @@ vcpu_reset_regs_svm(struct vcpu *vcpu, s
>   /* allow reading TSC */
>   svm_setmsrbr(vcpu, MSR_TSC);
>
> + /* allow reading HWCR and PSTATEDEF for TSC calibration */
> + svm_setmsrbr(vcpu, MSR_HWCR);
> + svm_setmsrbr(vcpu, MSR_PSTATEDEF(0));
> +
>   /* Guest VCPU ASID */
>   if (vmm_alloc_vpid()) {
>   DPRINTF("%s: could not allocate asid\n", __func__);
>

This is the same diff I would have come up with myself, and since it is reported
to fix the issue, ok mlarkin@ on this. Thanks Scott.

-ml



Re: snmpd in 7.2 dies with too many parse errors

2022-10-31 Thread Ryan Freeman
On Sun, Oct 30, 2022 at 09:21:00AM +0100, Martijn van Duren wrote:
> On Fri, 2022-10-28 at 13:10 -0700, Ryan Freeman wrote:
> > On Fri, Oct 28, 2022 at 01:22:57PM +0200, Martijn van Duren wrote:
> > > I wondered that as well, but I tried to simulate the not found and
> > > error code-paths, but I couldn't trigger it. So I'm not ruling it
> > > out, I just can't reproduce it.
> > > 
> > > Another thing that's weird is that it looks like the index has been
> > > stripped from sensorStatus, which might be an indication that
> > > something weird is going on inside libagentx. But like I said: without a
> > > reproducer I haven't been able to pin it down.
> > > 
> > > So the additional verbose information should be useful.
> > > Come to think of it: The `sysctl hw.sensors` output might be
> > > helpful as well, both on a succeeding run, as well as at the time
> > > of the crash (maybe something like:
> > > `while true; do date; sysctl hw.sensors; sleep 1; done > \
> > > /path/to/output`)
> > 
> > As the offending machines are VMs, hw.sensors actually returns
> > nothing.  I will send you the output for all of 'hw' key, and
> > log output for snmpd -vv when the issue arrives.
> > 
> > It does seem to coincide with librenms's discovery process, which
> > comes from librenms upstream as this cron job (on a linux machine):
> > 33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
> > 
> > So, it is the one job running every ~6 hours which would match up with
> > when snmpd is dying on these OpenBSD 7.2 VMs.  I still have 30+ VMs
> > on <7.2 that are OK.  Any physical machines I've upgraded to 7.2 are
> > only at home, not $WORKPLACE where librenms lives.  Not trying to be
> > noisy, just hopefully narrow down the actual cause :)  Thanks for
> > the hints!
> > 
> > Regards,
> > -Ryan
> > 
> > 
> I managed to reproduce it with an empty sensors table and doing a
> getnext request on sensorNumber.0.
> 
> The problem was that the internal OID was incremented from
> sensorNumber.0 to sensorStatus, which then triggers an endOfMibView.
> When returning a response this incremented value is then sent back to
> snmpd, while in the case of an endOfMibView it must be the value
> requested by snmpd (at least for the getnext case, which is what is
> being used here).
> 
> Diff below resets this key on endOfMibView and fixes the problem for
> me. Can you confirm this?
> 
> Assuming this also fixes things for Ryan: OK?
> 
> martijn@
> 
> Index: agentx.c
> ===
> RCS file: /cvs/src/lib/libagentx/agentx.c,v
> retrieving revision 1.19
> diff -u -p -r1.19 agentx.c
> --- agentx.c  14 Oct 2022 15:26:58 -  1.19
> +++ agentx.c  30 Oct 2022 08:19:29 -
> @@ -3426,6 +3426,8 @@ agentx_varbind_endofmibview(struct agent
>   return;
>   }
>  
> + bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> + sizeof(axv->axv_start));
>   axv->axv_vb.avb_type = AX_DATA_TYPE_ENDOFMIBVIEW;
>  
>   if (axv->axv_axo != NULL)
> 

Thanks Martijn,

I applied a slightly offset patch** to a 7.2-stable tree, rebuilt libagentx
and installed the new libagentx.so.1.0 on an affected host.  snmpd has been
running for just about 12 hours now; I think this might have solved it.  I
am going to copy this adjusted libagentx to another host in the meantime,
and continue watching.

-Ryan

**Patch to 7.2-stable:

Index: agentx.c
===
RCS file: /cvs/src/lib/libagentx/agentx.c,v
retrieving revision 1.17
diff -u -p -r1.17 agentx.c
--- agentx.c	13 Sep 2022 10:20:22 -  1.17
+++ agentx.c	31 Oct 2022 06:29:45 -
@@ -3342,6 +3342,8 @@ agentx_varbind_endofmibview(struct agent
 		return;
 	}
 
+	bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
+	    sizeof(axv->axv_start));
 	axv->axv_vb.avb_type = AX_DATA_TYPE_ENDOFMIBVIEW;
 
 	if (axv->axv_axo != NULL)



Re: kernel protection fault during boot on vmm(4) VM running on AMD EPYC cpu with tsc_identify in trace

2022-10-31 Thread Jesper Wallin
Hi

I had the same issue on my laptop (AMD Ryzen 7 PRO 3700U) and the patch
solved it on my machine at least.


Jesper Wallin

On Mon, Oct 31, 2022 at 02:15:00PM +0100, Paul de Weerd wrote:
> On Mon, Oct 31, 2022 at 07:39:01AM -0500, Scott Cheloha wrote:
> | You get a #GP in your VM when trying to rdmsr(MSR_HWCR).  My guess is
> | we need to expand the MSR read bitmap for SVM.
> | 
> | This patch compiles, but I can't test it.  Does it fix the panic?
> 
> To test this patch, I'd have to upgrade the hypervisor.  That's a bit
> more involved; I'll plan it ASAP and report back, but it may be a few
> days.
> 
> Thank you Scott and Mike!
> 
> Paul
> 
> | CC dv@ mlarkin@
> | 
> | Index: vmm.c
> | ===
> | RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
> | retrieving revision 1.323
> | diff -u -p -r1.323 vmm.c
> | --- vmm.c   7 Sep 2022 18:44:09 -   1.323
> | +++ vmm.c   31 Oct 2022 12:38:30 -
> | @@ -2705,6 +2705,10 @@ vcpu_reset_regs_svm(struct vcpu *vcpu, s
> |  	/* allow reading TSC */
> |  	svm_setmsrbr(vcpu, MSR_TSC);
> |  
> | +	/* allow reading HWCR and PSTATEDEF for TSC calibration */
> | +	svm_setmsrbr(vcpu, MSR_HWCR);
> | +	svm_setmsrbr(vcpu, MSR_PSTATEDEF(0));
> | +
> |  	/* Guest VCPU ASID */
> |  	if (vmm_alloc_vpid()) {
> |  		DPRINTF("%s: could not allocate asid\n", __func__);
> | 
> 
> -- 
> >[<++>-]<+++.>+++[<-->-]<.>+++[<+
> +++>-]<.>++[<>-]<+.--.[-]
>  http://www.weirdnet.nl/ 
> 



Re: riscv64 OpenBSD 7.2 packages are not found at expected URL (typo?)

2022-10-31 Thread Jeremie Courreges-Anglas
On Mon, Oct 31 2022, Miguel Landaeta  wrote:
>>Synopsis: riscv64 OpenBSD 7.2 packages are not found at expected URL
>>Category: riscv64
>>Environment:
> System  : OpenBSD 7.2
> Details : OpenBSD 7.2 (GENERIC.MP) #188: Wed Sep 28 04:06:11 MDT 2022
> dera...@riscv64.openbsd.org:/usr/src/sys/arch/riscv64/compile/GENERIC.MP
>
> Architecture: OpenBSD.riscv64
> Machine : riscv64
>>Description:
> pkg_add fails with 404 on riscv64 systems running OpenBSD 7.2
>>How-To-Repeat:
> Just try to install any package, e.g.:
> florence$ doas pkg_add -v -v -v rsync
> https://cdn.openbsd.org/pub/OpenBSD/7.2/packages/riscv64/: no such dir
> Can't find rsync
> Can't load quirk: Can't locate OpenBSD/Quirks.pm in @INC (you may need
> to install the OpenBSD::Quirks module) (@INC contains:
> /usr/local/libdata/perl5/site_perl
> /usr/local/libdata/perl5/site_perl/riscv64-openbsd
> /usr/libdata/perl5/riscv64-openbsd /usr/libdata/perl5) at
> /usr/libdata/perl5/OpenBSD/AddDelete.pm line 347.
>
>
>>Fix:
> I guess the proper fix should be to fix the URL in the mirrors; for
> now you have to work around the issue by indicating the URL that is
> currently available in the mirrors
> (https://cdn.openbsd.org/pub/OpenBSD/7.2/packages/risvc64/), e.g.:

Some attempts to fix this have paid off; for
example you might fetch from the correct URL at
https://ftp.hostserver.de/pub/OpenBSD/7.2/packages/riscv64/ or
https://ftp.fr.openbsd.org/pub/OpenBSD/7.2/packages/riscv64/

Some more repairs may be needed.  Thanks for the heads-up!

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: kernel protection fault during boot on vmm(4) VM running on AMD EPYC cpu with tsc_identify in trace

2022-10-31 Thread Paul de Weerd
On Mon, Oct 31, 2022 at 07:39:01AM -0500, Scott Cheloha wrote:
| You get a #GP in your VM when trying to rdmsr(MSR_HWCR).  My guess is
| we need to expand the MSR read bitmap for SVM.
| 
| This patch compiles, but I can't test it.  Does it fix the panic?

To test this patch, I'd have to upgrade the hypervisor.  That's a bit
more involved; I'll plan it ASAP and report back, but it may be a few
days.

Thank you Scott and Mike!

Paul

| CC dv@ mlarkin@
| 
| Index: vmm.c
| ===
| RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
| retrieving revision 1.323
| diff -u -p -r1.323 vmm.c
| --- vmm.c 7 Sep 2022 18:44:09 -   1.323
| +++ vmm.c 31 Oct 2022 12:38:30 -
| @@ -2705,6 +2705,10 @@ vcpu_reset_regs_svm(struct vcpu *vcpu, s
|   /* allow reading TSC */
|   svm_setmsrbr(vcpu, MSR_TSC);
|  
| + /* allow reading HWCR and PSTATEDEF for TSC calibration */
| + svm_setmsrbr(vcpu, MSR_HWCR);
| + svm_setmsrbr(vcpu, MSR_PSTATEDEF(0));
| +
|   /* Guest VCPU ASID */
|   if (vmm_alloc_vpid()) {
|   DPRINTF("%s: could not allocate asid\n", __func__);
| 

-- 
>[<++>-]<+++.>+++[<-->-]<.>+++[<+
+++>-]<.>++[<>-]<+.--.[-]
 http://www.weirdnet.nl/ 



Re: kernel protection fault during boot on vmm(4) VM running on AMD EPYC cpu with tsc_identify in trace

2022-10-31 Thread Scott Cheloha
On Mon, Oct 31, 2022 at 12:43:50PM +0100, Paul de Weerd wrote:
> Hi folks,
> 
> I just upgraded a VM on my AMD EPYC host.  I get the following
> protection fault during boot:
> 
> ddb> bo re
> rebooting...
> Using drive 0, partition 3.
> Loading..
> probing: pc0 com0 mem[638K 3838M 256M a20=on] 
> disk: hd0+
> >> OpenBSD/amd64 BOOT 3.55
> \
> com0: 115200 baud
> switching console to com0
> >> OpenBSD/amd64 BOOT 3.55
> boot> 
> NOTE: random seed is being reused.
> booting hd0a:/bsd: 15615256+3781640+298464+0+1171456 
> [1143945+128+1225080+928182]=0x170d440
> entry point at 0x81001000
> [ using 3298368 bytes of bsd ELF symbol table ]
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California.  All rights reserved.
> Copyright (c) 1995-2022 OpenBSD. All rights reserved.  https://www.OpenBSD.org
> 
> OpenBSD 7.2-current (GENERIC) #784: Fri Oct 28 21:50:59 MDT 2022
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> real mem = 4278177792 (4079MB)
> avail mem = 4131221504 (3939MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf36b0 (12 entries)
> bios0: vendor SeaBIOS version "1.14.0p0-OpenBSD-vmm" date 01/01/2011
> bios0: OpenBSD VMM
> acpi at bios0 not configured
> cpu0 at mainbus0: (uniprocessor)
> kernel: protection fault trap, code=0
> Stopped at  tsc_identify+0xcd:  rdmsr
> ddb> ps
>PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
> *0   0 -1  0  7 0x10200swapper
> ddb> trace
> tsc_identify(822c7ff0,822c7ff0,68a34bffd15c67e6,822c7ff0,10,82714c10)
>  at tsc_identify+0xcd
> identifycpu(822c7ff0,822c7ff0,bca189629b3de454,8002c400,822c7ff0,8002c424)
>  at identifycpu+0x2e4
> cpu_attach(8002c300,8002c400,82714d98,8002c300,980a70616799eafd,8002c300)
>  at cpu_attach+0x16f
> config_attach(8002c300,82289250,82714d98,8138d1b0,6c550c45866795b6,82714db8)
>  at config_attach+0x1f4
> mainbus_attach(0,8002c300,0,0,819b798732a62156,0) at 
> mainbus_attach+0x151
> config_attach(0,822891a8,0,0,6c550c4586f4e2c4,0) at 
> config_attach+0x1f4
> cpu_configure(f588b7541b8b8d14,0,0,8002e000,81abb8d3,82714f00)
>  at cpu_configure+0x33
> main(0,0,0,0,0,1) at main+0x379
> end trace frame: 0x0, count: -8
> ddb> show reg
> rdi   0x822a3035cpu_vendor+0xd
> rsi   0x81f04410cmd0646_9_tim_udma+0x170f5
> rbp   0x82714c30end+0x314c30
> rbx   0x20202020
> rdx0
> rcx   0xc0010015
> rax0
> r8 0
> r9  0x40
> r10   0x2bc299b68ee7cba5
> r11   0x75a3a544d54dd7b9
> r12  0x1
> r13   0x8002c424
> r14   0x822c7ff0cpu_info_full_primary+0x1ff0
> r15   0x82714c40end+0x314c40
> rip   0x819e1f4dtsc_identify+0xcd
> cs   0x8
> rflags   0x10202__ALIGN_SIZE+0xf202
> rsp   0x82714c10end+0x314c10
> ss  0x10
> tsc_identify+0xcd:  rdmsr
> ddb> 

You get a #GP in your VM when trying to rdmsr(MSR_HWCR).  My guess is
we need to expand the MSR read bitmap for SVM.

This patch compiles, but I can't test it.  Does it fix the panic?

CC dv@ mlarkin@

Index: vmm.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
retrieving revision 1.323
diff -u -p -r1.323 vmm.c
--- vmm.c   7 Sep 2022 18:44:09 -   1.323
+++ vmm.c   31 Oct 2022 12:38:30 -
@@ -2705,6 +2705,10 @@ vcpu_reset_regs_svm(struct vcpu *vcpu, s
/* allow reading TSC */
svm_setmsrbr(vcpu, MSR_TSC);
 
+   /* allow reading HWCR and PSTATEDEF for TSC calibration */
+   svm_setmsrbr(vcpu, MSR_HWCR);
+   svm_setmsrbr(vcpu, MSR_PSTATEDEF(0));
+
/* Guest VCPU ASID */
if (vmm_alloc_vpid()) {
DPRINTF("%s: could not allocate asid\n", __func__);
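
For readers unfamiliar with the mechanism behind the patch above: under AMD SVM the hypervisor gives the CPU an MSR permission map with two bits per MSR (read intercept in the low bit, write intercept in the high bit), covering three MSR ranges. Below is a minimal standalone sketch of that layout, not the vmm.c code. The names msrpm_init, msrpm_read_bit, msrpm_allow_read and msrpm_read_intercepted are hypothetical, and MSR_HWCR is assumed to be 0xc0010015, which matches rcx in the register dump.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSRPM_SIZE	0x2000		/* 8 KB permission map */
#define MSR_HWCR	0xc0010015u	/* assumed; matches rcx in the ddb dump */

/* Start with every bit set, i.e. every MSR access intercepted. */
static void
msrpm_init(uint8_t *map)
{
	memset(map, 0xff, MSRPM_SIZE);
}

/*
 * Find the byte and bit mask of the read-intercept bit for an MSR.
 * Returns 0 on success, -1 if the MSR lies outside the three ranges
 * the map covers.
 */
static int
msrpm_read_bit(uint32_t msr, size_t *byte, uint8_t *mask)
{
	uint32_t base, off;

	if (msr <= 0x1fffu) {
		base = 0x0000;
		off = msr;
	} else if (msr >= 0xc0000000u && msr <= 0xc0001fffu) {
		base = 0x0800;
		off = msr - 0xc0000000u;
	} else if (msr >= 0xc0010000u && msr <= 0xc0011fffu) {
		base = 0x1000;
		off = msr - 0xc0010000u;
	} else
		return -1;

	*byte = base + off / 4;				/* four MSRs per byte */
	*mask = (uint8_t)(1u << ((off % 4) * 2));	/* low bit of the pair */
	return 0;
}

/* Clear the read-intercept bit so guest reads of the MSR pass through. */
static int
msrpm_allow_read(uint8_t *map, uint32_t msr)
{
	size_t byte;
	uint8_t mask;

	if (msrpm_read_bit(msr, &byte, &mask) != 0)
		return -1;
	map[byte] &= (uint8_t)~mask;
	return 0;
}

/* Returns 1 if a guest read of the MSR would be intercepted. */
static int
msrpm_read_intercepted(const uint8_t *map, uint32_t msr)
{
	size_t byte;
	uint8_t mask;

	if (msrpm_read_bit(msr, &byte, &mask) != 0)
		return 1;	/* not covered by the map: treated as intercepted */
	return (map[byte] & mask) != 0;
}
```

In this sketch, HWCR falls in the third range at byte 0x1005, mask 0x04. Until msrpm_allow_read() clears that bit, a guest rdmsr of HWCR is intercepted rather than executed directly, which would be consistent with the fault in tsc_identify.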



Re: kernel protection fault during boot on vmm(4) VM running on AMD EPYC cpu with tsc_identify in trace

2022-10-31 Thread Mike Larkin
On Mon, Oct 31, 2022 at 01:14:59PM +0100, Paul de Weerd wrote:
> Had some time over lunch.  Disabling the code path that calls
> tsc_freq_msr lets me boot into the VM again:
>
> Index: tsc.c
> ===
> RCS file: /home/OpenBSD/cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.30
> diff -u -p -r1.30 tsc.c
> --- tsc.c 24 Oct 2022 00:56:33 -  1.30
> +++ tsc.c 31 Oct 2022 12:12:32 -
> @@ -179,7 +179,7 @@ tsc_identify(struct cpu_info *ci)
>   tsc_is_invariant = 1;
>
>   tsc_frequency = tsc_freq_cpuid(ci);
> - if (tsc_frequency == 0)
> + if (tsc_frequency == 42)
>   tsc_frequency = tsc_freq_msr(ci);
>   if (tsc_frequency > 0)
>   delay_init(tsc_delay, 5000);
>
> Obviously not a fix, but at least a smoking gun.
>
> Paul
>

The tsc freq msr probably needs to be passed through in this case. I'll take
a look.

-ml


Re: kernel protection fault during boot on vmm(4) VM running on AMD EPYC cpu with tsc_identify in trace

2022-10-31 Thread Paul de Weerd
Had some time over lunch.  Disabling the code path that calls
tsc_freq_msr lets me boot into the VM again:

Index: tsc.c
===
RCS file: /home/OpenBSD/cvs/src/sys/arch/amd64/amd64/tsc.c,v
retrieving revision 1.30
diff -u -p -r1.30 tsc.c
--- tsc.c   24 Oct 2022 00:56:33 -  1.30
+++ tsc.c   31 Oct 2022 12:12:32 -
@@ -179,7 +179,7 @@ tsc_identify(struct cpu_info *ci)
tsc_is_invariant = 1;
 
tsc_frequency = tsc_freq_cpuid(ci);
-   if (tsc_frequency == 0)
+   if (tsc_frequency == 42)
tsc_frequency = tsc_freq_msr(ci);
if (tsc_frequency > 0)
delay_init(tsc_delay, 5000);

Obviously not a fix, but at least a smoking gun.
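
To spell out why the one-line change above sidesteps the fault, here is a minimal sketch of the fallback logic. It uses hypothetical stand-ins (cpuid_unknown, msr_probe, pick_tsc_frequency) for tsc_freq_cpuid() and tsc_freq_msr(). The 2 GHz value is made up, and the assumption that the CPUID probe returns 0 in this guest is inferred from the trace, not confirmed.

```c
#include <stdint.h>

typedef uint64_t (*freq_probe)(void);

static int msr_calls;	/* counts how often the MSR path is taken */

/* Stand-in for tsc_freq_cpuid(): frequency not available via CPUID. */
static uint64_t
cpuid_unknown(void)
{
	return 0;
}

/* Stand-in for tsc_freq_msr(): in the real kernel this executes rdmsr. */
static uint64_t
msr_probe(void)
{
	msr_calls++;
	return 2000000000ULL;	/* hypothetical 2 GHz */
}

/*
 * Fall back to the MSR probe only when the CPUID result equals the
 * sentinel.  Sentinel 0 mirrors the original code, sentinel 42 mirrors
 * the debugging diff.
 */
static uint64_t
pick_tsc_frequency(freq_probe from_cpuid, freq_probe from_msr,
    uint64_t sentinel)
{
	uint64_t freq = from_cpuid();

	if (freq == sentinel)
		freq = from_msr();
	return freq;
}
```

With sentinel 0 the MSR probe runs whenever CPUID yields nothing. With sentinel 42 it never runs here, so the frequency stays 0 and the `tsc_frequency > 0` branch that calls delay_init() is skipped as well.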

Paul


kernel protection fault during boot on vmm(4) VM running on AMD EPYC cpu with tsc_identify in trace

2022-10-31 Thread Paul de Weerd
Hi folks,

I just upgraded a VM on my AMD EPYC host.  I get the following
protection fault during boot:

ddb> bo re
rebooting...
Using drive 0, partition 3.
Loading..
probing: pc0 com0 mem[638K 3838M 256M a20=on] 
disk: hd0+
>> OpenBSD/amd64 BOOT 3.55
\
com0: 115200 baud
switching console to com0
>> OpenBSD/amd64 BOOT 3.55
boot> 
NOTE: random seed is being reused.
booting hd0a:/bsd: 15615256+3781640+298464+0+1171456 
[1143945+128+1225080+928182]=0x170d440
entry point at 0x81001000
[ using 3298368 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2022 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 7.2-current (GENERIC) #784: Fri Oct 28 21:50:59 MDT 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 4278177792 (4079MB)
avail mem = 4131221504 (3939MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf36b0 (12 entries)
bios0: vendor SeaBIOS version "1.14.0p0-OpenBSD-vmm" date 01/01/2011
bios0: OpenBSD VMM
acpi at bios0 not configured
cpu0 at mainbus0: (uniprocessor)
kernel: protection fault trap, code=0
Stopped at  tsc_identify+0xcd:  rdmsr
ddb> ps
   PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
*0   0 -1  0  7 0x10200swapper
ddb> trace
tsc_identify(822c7ff0,822c7ff0,68a34bffd15c67e6,822c7ff0,10,82714c10)
 at tsc_identify+0xcd
identifycpu(822c7ff0,822c7ff0,bca189629b3de454,8002c400,822c7ff0,8002c424)
 at identifycpu+0x2e4
cpu_attach(8002c300,8002c400,82714d98,8002c300,980a70616799eafd,8002c300)
 at cpu_attach+0x16f
config_attach(8002c300,82289250,82714d98,8138d1b0,6c550c45866795b6,82714db8)
 at config_attach+0x1f4
mainbus_attach(0,8002c300,0,0,819b798732a62156,0) at 
mainbus_attach+0x151
config_attach(0,822891a8,0,0,6c550c4586f4e2c4,0) at config_attach+0x1f4
cpu_configure(f588b7541b8b8d14,0,0,8002e000,81abb8d3,82714f00)
 at cpu_configure+0x33
main(0,0,0,0,0,1) at main+0x379
end trace frame: 0x0, count: -8
ddb> show reg
rdi   0x822a3035cpu_vendor+0xd
rsi   0x81f04410cmd0646_9_tim_udma+0x170f5
rbp   0x82714c30end+0x314c30
rbx   0x20202020
rdx0
rcx   0xc0010015
rax0
r8 0
r9  0x40
r10   0x2bc299b68ee7cba5
r11   0x75a3a544d54dd7b9
r12  0x1
r13   0x8002c424
r14   0x822c7ff0cpu_info_full_primary+0x1ff0
r15   0x82714c40end+0x314c40
rip   0x819e1f4dtsc_identify+0xcd
cs   0x8
rflags   0x10202__ALIGN_SIZE+0xf202
rsp   0x82714c10end+0x314c10
ss  0x10
tsc_identify+0xcd:  rdmsr
ddb> 

When trying to boot bsd.rd I get:

fatal protection fault in supervisor mode
trap type 4 code  rip 811d5fb2 cs 8 rflags 10202 cr2 0 cpl e 
rsp 81a06d10
gsbase 0x818f5ff0  kgsbase 0x0
panic: trap type 4, code=, pc=811d5fb2

This snapshot works fine in VMs running on my old Intel-based
workstation, so I suspect the AMD CPU may have something to do with
it.  Included below is the dmesg of the hypervisor (yes, that should
also be upgraded at some point...).

I still have an old bsd.rd that I can boot into from the previous
snapshot:

OpenBSD 7.2 (RAMDISK_CD) #715: Thu Sep 22 11:51:48 MDT 2022

Looking at CVS history between Sep 22 and today, this commit from
Scott sticks out (hence the CC: to cheloha@):

https://marc.info/?l=openbsd-cvs&m=166657262528344&w=2

Later tonight I can try reverting this commit to see if it helps
things.  Will follow up when there's something to report.

Cheers,

Paul

--- dmesg (of the hypervisor) 
OpenBSD 7.1 (GENERIC.MP) #465: Mon Apr 11 18:03:57 MDT 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 68567597056 (65391MB)
avail mem = 66472255488 (63392MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xdab19000 (51 entries)
bios0: vendor American Megatrends Inc. version "1.0c" date 06/30/2020
bios0: Supermicro Super Server
acpi0 at bios0: ACPI 6.1
acpi0: sleep states S0 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SSDT SPMI SSDT MCFG SSDT CRAT CDIT BERT 
EINJ HEST HPET SSDT UEFI IVRS SSDT WSMT
acpi0: wakeup devices 

riscv64 OpenBSD 7.2 packages are not found at expected URL (typo?)

2022-10-31 Thread Miguel Landaeta
>Synopsis: riscv64 OpenBSD 7.2 packages are not found at expected URL
>Category: riscv64
>Environment:
System  : OpenBSD 7.2
Details : OpenBSD 7.2 (GENERIC.MP) #188: Wed Sep 28 04:06:11 MDT 2022
dera...@riscv64.openbsd.org:/usr/src/sys/arch/riscv64/compile/GENERIC.MP

Architecture: OpenBSD.riscv64
Machine : riscv64
>Description:
pkg_add fails with 404 on riscv64 systems running OpenBSD 7.2
>How-To-Repeat:
Just try to install any package, e.g.:
florence$ doas pkg_add -v -v -v rsync
https://cdn.openbsd.org/pub/OpenBSD/7.2/packages/riscv64/: no such dir
Can't find rsync
Can't load quirk: Can't locate OpenBSD/Quirks.pm in @INC (you may need
to install the OpenBSD::Quirks module) (@INC contains:
/usr/local/libdata/perl5/site_perl
/usr/local/libdata/perl5/site_perl/riscv64-openbsd
/usr/libdata/perl5/riscv64-openbsd /usr/libdata/perl5) at
/usr/libdata/perl5/OpenBSD/AddDelete.pm line 347.


>Fix:
I guess the proper fix is to correct the directory name on the
mirrors.  For now you have to work around the issue by passing the URL
that is currently available on the mirrors
(https://cdn.openbsd.org/pub/OpenBSD/7.2/packages/risvc64/), e.g.:

florence$ doas pkg_add -v -v -v
https://cdn.openbsd.org/pub/OpenBSD/7.2/packages/risvc64/rsync
https://cdn.openbsd.org/pub/OpenBSD/7.2/packages/riscv64/: no such dir
Ambiguous: choose package for rsync
a   0: 
1: rsync-3.2.5pl0
2: rsync-3.2.5pl0-iconv
Your choice: 1
parsing rsync-3.2.5pl0
Can't load quirk: Can't locate OpenBSD/Quirks.pm in @INC (you may need
to install the OpenBSD::Quirks module) (@INC contains:
/usr/local/libdata/perl5/site_perl
/usr/local/libdata/perl5/site_perl/riscv64-openbsd
/usr/libdata/perl5/riscv64-openbsd /usr/libdata/perl5) at
/usr/libdata/perl5/OpenBSD/AddDelete.pm line 347.
found libspec c.96.2 in /usr/lib
found libspec crypto.50.0 in /usr/lib
extract bin/rsync -> /usr/local/bin/rsync
extract man/man1/rrsync.1 -> /usr/local/man/man1/rrsync.1
extract man/man1/rsync-ssl.1 -> /usr/local/man/man1/rsync-ssl.1
extract man/man1/rsync.1 -> /usr/local/man/man1/rsync.1
extract man/man5/rsyncd.conf.5 -> /usr/local/man/man5/rsyncd.conf.5
extract /etc/rc.d/rsyncd -> /etc/rc.d/rsyncd
extract bin/rrsync -> /usr/local/bin/rrsync
extract bin/rsync-ssl -> /usr/local/bin/rsync-ssl
extract share/doc/rsync/tech_report.tex ->
/usr/local/share/doc/rsync/tech_report.tex
adding group _rsync
Running /usr/sbin/groupadd -v -g 669 -- _rsync
adding user _rsync
Running /usr/sbin/useradd -v -u 669 -g _rsync -L daemon -c rsync
Daemon -d /var/empty -s /sbin/nologin -- _rsync
rsync-3.2.5pl0: ok
The following new rcscripts were installed: /etc/rc.d/rsyncd
See rcctl(8) for details.
Running /usr/sbin/makewhatis -d /usr/local/man --
/usr/local/man/man1/rrsync.1 /usr/local/man/man1/rsync-ssl.1
/usr/local/man/man1/rsync.1 /usr/local/man/man5/rsyncd.conf.5
/dev/sd0a on /: 97 bytes
/dev/sd0h on /usr/local: 2513253 bytes
florence$


-- 
Miguel Landaeta, miguel at miguel.cc
secure email with PGP 0x6E608B637D8967E9 available at http://keyserver.pgp.com/
"Faith means not wanting to know what is true." -- Nietzsche



Re: Asynchronous wait on fence

2022-10-31 Thread Stuart Henderson
On 2022/10/30 18:22, anointedfig wrote:
> Hi,
> 
> I am new to OpenBSD. X crashed with the following output:
> 
> Asynchronous wait on fence :Xorg[30301]:375f timed out 
> (hint:0x81dc8ab0s)

https://www.openbsd.org/report.html shows the sort of information
you need to include in a report for anyone to be able to help.



ahci_get_err_ccb but SACT

2022-10-31 Thread Rafael Sadowski
I have been seeing this error for a very long time (1-2+ years).

How to reproduce this error:

1. Put the system into suspend (deep sleep) state.
2. Wake up (everything works).
3. When I then try to shut down the computer, I get the following error
message:

ahci_get_err_ccb but SACT 0800 != 0?
ahci_put_err_ccb but SACT 0800 != 0?
ahci_get_err_ccb but SACT 0800 != 0?
ahci_put_err_ccb but SACT 0800 != 0?
ahci_get_err_ccb but SACT 0800 != 0?
ahci_put_err_ccb but SACT 0800 != 0?
ahci_get_err_ccb but SACT 0800 != 0?
ahci_put_err_ccb but SACT 0800 != 0?
ahci_get_err_ccb but SACT 0800 != 0?
ahci_put_err_ccb but SACT 0800 != 0?
ahci_get_err_ccb but SACT 1000 != 0?
ahci_put_err_ccb but SACT 1000 != 0?
ahci_get_err_ccb but SACT 1000 != 0?
ahci_put_err_ccb but SACT 1000 != 0?
ahci_get_err_ccb but SACT 1000 != 0?
ahci_put_err_ccb but SACT 1000 != 0?
ahci_get_err_ccb but SACT 1000 != 0?
ahci_put_err_ccb but SACT 1000 != 0?
ahci_get_err_ccb but SACT 1000 != 0?
ahci_put_err_ccb but SACT 1000 != 0?
ahci_get_err_ccb but SACT 2000 != 0?
ahci_put_err_ccb but SACT 2000 != 0?
ahci_get_err_ccb but SACT 2000 != 0?
ahci_put_err_ccb but SACT 2000 != 0?
ahci_get_err_ccb but SACT 2000 != 0?
ahci_put_err_ccb but SACT 2000 != 0?
ahci_get_err_ccb but SACT 2000 != 0?
ahci_put_err_ccb but SACT 2000 != 0?
ahci_get_err_ccb but SACT 2000 != 0?
ahci_put_err_ccb but SACT 2000 != 0?
ahci_get_err_ccb but SACT 4000 != 0?
ahci_put_err_ccb but SACT 4000 != 0?
ahci_get_err_ccb but SACT 4000 != 0?
ahci_put_err_ccb but SACT 4000 != 0?
ahci_get_err_ccb but SACT 4000 != 0?
ahci_put_err_ccb but SACT 4000 != 0?
ahci_get_err_ccb but SACT 4000 != 0?
ahci_put_err_ccb but SACT 4000 != 0?
ahci_get_err_ccb but SACT 4000 != 0?
ahci_put_err_ccb but SACT 4000 != 0?
ahci_get_err_ccb but SACT 8000 != 0?
ahci_put_err_ccb but SACT 8000 != 0?
ahci_get_err_ccb but SACT 8000 != 0?
ahci_put_err_ccb but SACT 8000 != 0?
ahci_get_err_ccb but SACT 8000 != 0?
ahci_put_err_ccb but SACT 8000 != 0?
ahci_get_err_ccb but SACT 8000 != 0?
ahci_put_err_ccb but SACT 8000 != 0?
ahci_get_err_ccb but SACT 8000 != 0?
ahci_put_err_ccb but SACT 8000 != 0?
...
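
As background for reading these messages: in AHCI, the per-port SACT (SATA Active) register is a bitmask with one bit per command slot that still has a native queued command outstanding. The sketch below lists the set bits of such a mask. The helper name sact_slots is hypothetical. It assumes the values above are hexadecimal and complete, which is uncertain because the archive may have dropped digits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store the index of every set bit of a SACT value in slots[] and
 * return how many were found.  slots[] must have room for 32 entries.
 */
static size_t
sact_slots(uint32_t sact, unsigned slots[32])
{
	size_t n = 0;
	unsigned bit;

	for (bit = 0; bit < 32; bit++) {
		if (sact & (UINT32_C(1) << bit))
			slots[n++] = bit;
	}
	return n;
}
```

Read this way, each reported value has exactly one bit set, and the set bit moves up by one position from one group of messages to the next.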

It cannot be shut down.  It seems to have problems unmounting the hard
disks (sd3a, sd0a, or sd2a):

/dev/sd4a on / type ffs (local, noatime, softdep)
/dev/sd4k on /home type ffs (local, noatime, nodev, nosuid, softdep)
/dev/sd4l on /share type ffs (local, noatime, nodev, nosuid, softdep)
/dev/sd4d on /tmp type ffs (local, noatime, nodev, nosuid, softdep)
/dev/sd4e on /usr type ffs (local, noatime, nodev, softdep)
/dev/sd4f on /usr/X11R6 type ffs (local, noatime, nodev, softdep)
/dev/sd4g on /usr/local type ffs (local, noatime, nodev, wxallowed, softdep)
/dev/sd4j on /usr/ports type ffs (local, noatime, nodev, nosuid, wxallowed, 
softdep)
/dev/sd4h on /var type ffs (local, noatime, nodev, nosuid, softdep)
/dev/sd3a on /mnt/backup0 type ffs (local, noatime, nosuid, softdep)
/dev/sd0a on /mnt/backup1 type ffs (local, nodev, nosuid)
/dev/sd2a on /mnt/wd0 type ffs (local, noatime, nodev, nosuid, softdep)

Maybe someone has an idea or a diff to test.

OpenBSD 7.2-current (GENERIC.MP) #822: Fri Oct 28 21:59:48 MDT 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17023107072 (16234MB)
avail mem = 16489771008 (15725MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x8b1a8000 (89 entries)
bios0: vendor American Megatrends Inc. version "3801" date 03/14/2018
bios0: ASUSTeK COMPUTER INC. Z170-PRO
efi0 at bios0: UEFI 2.5
efi0: American Megatrends rev 0x5000c
acpi0 at bios0: ACPI 6.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT DBG2 MCFG SSDT FIDT SSDT SSDT HPET SSDT SSDT 
UEFI SSDT LPIT WSMT SSDT SSDT DBGP
acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) 
SIO1(S3) PS2K(S4) PS2M(S4) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) 
PXSX(S4) RP12(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4010.75 MHz, 06-5e-03
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
4-way L2 cache, 8MB 64b/line