tor-browser Segmentation fault after upgrade to 7.2 release
>Synopsis:	tor-browser Segmentation fault after upgrade to 7.2 release
>Category:
>Environment:
	System      : OpenBSD 7.2
	Details     : OpenBSD 7.2 (GENERIC.MP) #758: Tue Sep 27 11:57:54 MDT 2022
			 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

	Architecture: OpenBSD.amd64
	Machine     : amd64
	tor-browser: 11.5.4
>Description:
	I got a core dump file, but I don't know how to attach it.
>How-To-Repeat:
	Run tor-browser.
>Fix:
	Unknown.
Re: 7.2: tsc timecounter running too fast on ESXi 7.5
On 10/26/22 23:07, Scott Cheloha wrote:
> In summary:
>
> - OpenBSD 7.2 amd64 kernel TSC and lapic calibration is broken on
>   (at least) some ESXi 6.0 and ESXi 7.5 hosts under the VM
>   configuration "FreeBSD (32-bit)".  The ACPI PM timer seemingly
>   accelerates when we read it repeatedly during boot.
>
> - Workaround 1 is to change the configuration to "FreeBSD (64-bit)".
>
> - Workaround 2 is to not install acpitimer_delay() with delay_init()
>   during acpitimerattach().

Maybe you noticed already, but on the OpenBSD-Misc list I have suggested
making the VMware Tools driver advertise the OS as 'FreeBSD 64-bit'
rather than the 32-bit version, which would make workaround 1 the
default system setting.

https://marc.info/?l=openbsd-misc=166680569110622=2
Re: 7.2: tsc timecounter running too fast on ESXi 7.5
On Wed, Oct 26, 2022 at 03:23:51PM +0200, Kalabic S. wrote:
> On 26/10/2022 11:33, Scott Cheloha wrote:
> > There might be a second workaround.  Kalabic mentions here in the
> > other thread about this problem:
> >
> > https://marc.info/?l=openbsd-bugs=14949825616=2
> >
> > ... that changing the ESXi option "Guest OS Version" from "FreeBSD
> > (32-bit)" to "FreeBSD (64-bit)" seemed to fix the problem on his
> > version of ESXi.  Does that work for you?  I don't know what the other
> > consequences of that configuration change are, but it might be worth a
> > try if you prefer to run 7.2-RELEASE or 7.2-STABLE instead of patching
> > -current.
> >
> > Do you have VMware support?  Is there any way for you to report this
> > problem to them?  It's unlikely they explicitly support running an
> > OpenBSD guest, but it's plausible this issue could affect other
> > operating systems.  I can't imagine OpenBSD is reading the ACPI PM
> > timer differently than Linux or FreeBSD.
> >
>
> Maybe related or not, but there's official paper from VMware that describes
> several known timekeeping issues and how to correct or work around them:
> https://www.cse.psu.edu/~buu1/teaching/spring06/papers/vmware-timing.pdf

I did see this, thanks for posting it.  It's not immediately useful
here, though.

> Also pardon my ignorance about TSC counters and related stuff, but just
> looking at FreeBSD related code it seems to take into account the fact it is
> running as a hypervisor guest (ESXi or Xen).
> https://github.com/freebsd/freebsd-src/blob/main/sys/x86/x86/tsc.c
>
> Is there a detail that makes a difference when different "Guest OS Version"
> is used? Note that I have no idea what is happening there.
>
> So, just like some AMD related improvements for TSC were introduced into
> OpenBSD recently, maybe this issue can be properly solved only by doing
> something similar for guests on hypervisor?

I would like to derive the TSC and lapic frequency from the hypervisor CPUID leaves when they are available to avoid calibration. It's on my todo list.
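
A rough sketch of what "deriving the frequency from the hypervisor CPUID leaves" could look like, for readers unfamiliar with the idea. It assumes the VMware-style timing leaf 0x40000010, where EAX is understood to hold the virtual TSC frequency in kHz and EBX the virtual bus (local APIC timer) frequency in kHz. The function name, the way the register values are passed in, and the example values are all hypothetical; Python cannot execute CPUID itself, and this is not the OpenBSD implementation.

```python
# Hypothetical illustration only: decode register values that a kernel
# would obtain by executing CPUID on the hypervisor leaves.
HYPERVISOR_BASE_LEAF = 0x40000000   # EAX here reports the highest hypervisor leaf
TIMING_LEAF = 0x40000010            # assumed VMware-style timing leaf


def decode_timing_leaf(max_leaf, eax, ebx):
    """Return (tsc_hz, lapic_hz), or None if the leaf is unusable.

    max_leaf: EAX from CPUID leaf 0x40000000.
    eax, ebx: EAX/EBX from CPUID leaf 0x40000010, assumed to be in kHz.
    """
    if max_leaf < TIMING_LEAF:
        return None                 # hypervisor does not advertise the leaf
    if eax == 0 or ebx == 0:
        return None                 # fall back to timer-based calibration
    return eax * 1000, ebx * 1000


# Made-up register values, not taken from any dmesg in this thread.
print(decode_timing_leaf(0x40000010, 1900000, 66000))
print(decode_timing_leaf(0x40000000, 0, 0))
```

If the leaf is missing or zero, the caller would keep using the existing calibration against a reference timer, which is why the sketch returns None instead of guessing.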
Re: 7.2: tsc timecounter running too fast on ESXi 7.5
On Wed, Oct 26, 2022 at 07:36:28AM -0700, James J. Lippard wrote: > On Wed, Oct 26, 2022 at 04:33:23AM -0500, Scott Cheloha wrote: > > Thank you for testing, let's take a look. > > [...] > > I don't know how to explain this. Maybe another developer will read > > this and spot something I'm missing. Or maybe this is a known issue > > and I'm just not finding a reference to it online. > > > > The simplest workaround is to skip installing acpitimer_delay() with > > delay_init() during acpitimerattach(). The attached patch does this. > > Can confirm that this works. Good. > > I don't know if this problem persists after boot. If it does, using > > the acpitimer0 timecounter may yield strange results in the VM. I > > recommend not using the acpitimer0 timecounter until the problem is > > better understood. A calibrated TSC is going to be a better > > timecounter anyway. > > > > There might be a second workaround. Kalabic mentions here in the > > other thread about this problem: > > > > https://marc.info/?l=openbsd-bugs=14949825616=2 > > > > ... that changing the ESXi option "Guest OS Version" from "FreeBSD > > (32-bit)" to "FreeBSD (64-bit)" seemed to fix the problem on his > > version of ESXi. Does that work for you? I don't know what the other > > consequences of that configuration change are, but it might be worth a > > try if you prefer to run 7.2-RELEASE or 7.2-STABLE instead of patching > > -current. > > I can also confirm that this works as a workaround on the stock 7.2 kernel. > I also booted with the last kernel with debugging info with this workaround; > dmesg for that is below. Even better, and thank you for double-checking with the patched kernel. > > Do you have VMware support? Is there any way for you to report this > > problem to them? It's unlikely they explicitly support running an > > OpenBSD guest, but it's plausible this issue could affect other > > operating systems. 
I can't imagine OpenBSD is reading the ACPI PM > > timer differently than Linux or FreeBSD. > > Unfortunately not, I only use the free vSphere ESXi. Drat. > OpenBSD 7.2-current (GENERIC.MP) #1: Tue Oct 25 20:09:51 MST 2022 > lipp...@chaos.int.discord.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > [snip] > measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: > count 14211444 14569397 tsc 38873326169 39063325035 usecs 9: 197660 Hz > measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: > cycles 357958 tsc 190001742: 188842 Hz > measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: > count 14939119 15297049 tsc 39259571275 39449557759 usecs 3: 187839 Hz > measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: > cycles 357955 tsc 18897: 186316 Hz > measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: > count 15666102 16024022 tsc 39645448713 39835430133 usecs 0: 194200 Hz > measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: > cycles 357954 tsc 18157: 184223 Hz > [snip] > acpihpet0 at acpi0: 14318179 Hz > measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: > count 8315 1439858 tsc 42184173245 42374137028 usecs 99980: 1900017833 Hz > measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: > cycles 1431819 tsc 18907: 187610 Hz > measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: > count 2894563 4326110 tsc 42567173659 42757137699 usecs 99981: 191400 Hz > measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: > cycles 1431826 tsc 19836: 187611 Hz > measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: > count 5781139 7212684 tsc 42950217351 43140181114 usecs 99980: 1900017633 Hz > measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: > cycles 1431826 tsc 19909: 188341 Hz This looks right. 

In summary:

- OpenBSD 7.2 amd64 kernel TSC and lapic calibration is broken on
  (at least) some ESXi 6.0 and ESXi 7.5 hosts under the VM
  configuration "FreeBSD (32-bit)".  The ACPI PM timer seemingly
  accelerates when we read it repeatedly during boot.

- Workaround 1 is to change the configuration to "FreeBSD (64-bit)".

- Workaround 2 is to not install acpitimer_delay() with delay_init()
  during acpitimerattach().
Re: 7.2: tsc timecounter running too fast on ESXi 7.5
On Wed, Oct 26, 2022 at 04:33:23AM -0500, Scott Cheloha wrote: > Thank you for testing, let's take a look. > [...] > I don't know how to explain this. Maybe another developer will read > this and spot something I'm missing. Or maybe this is a known issue > and I'm just not finding a reference to it online. > > The simplest workaround is to skip installing acpitimer_delay() with > delay_init() during acpitimerattach(). The attached patch does this. Can confirm that this works. > I don't know if this problem persists after boot. If it does, using > the acpitimer0 timecounter may yield strange results in the VM. I > recommend not using the acpitimer0 timecounter until the problem is > better understood. A calibrated TSC is going to be a better > timecounter anyway. > > There might be a second workaround. Kalabic mentions here in the > other thread about this problem: > > https://marc.info/?l=openbsd-bugs=14949825616=2 > > ... that changing the ESXi option "Guest OS Version" from "FreeBSD > (32-bit)" to "FreeBSD (64-bit)" seemed to fix the problem on his > version of ESXi. Does that work for you? I don't know what the other > consequences of that configuration change are, but it might be worth a > try if you prefer to run 7.2-RELEASE or 7.2-STABLE instead of patching > -current. I can also confirm that this works as a workaround on the stock 7.2 kernel. I also booted with the last kernel with debugging info with this workaround; dmesg for that is below. > Do you have VMware support? Is there any way for you to report this > problem to them? It's unlikely they explicitly support running an > OpenBSD guest, but it's plausible this issue could affect other > operating systems. I can't imagine OpenBSD is reading the ACPI PM > timer differently than Linux or FreeBSD. Unfortunately not, I only use the free vSphere ESXi. 
OpenBSD 7.2-current (GENERIC.MP) #1: Tue Oct 25 20:09:51 MST 2022 lipp...@chaos.int.discord.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 6424494080 (6126MB) avail mem = 6212374528 (5924MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xe0010 (242 entries) bios0: vendor Phoenix Technologies LTD version "6.00" date 11/12/2020 bios0: VMware, Inc. VMware Virtual Platform acpi0 at bios0: ACPI 4.0 acpi0: sleep states S0 S1 S4 S5 acpi0: tables DSDT FACP BOOT APIC MCFG SRAT HPET WAET acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3) S8F0(S3) S16F(S3) S18F(S3) S22F(S3) S23F(S3) S24F(S3) S25F(S3) PE40(S3) S1F0(S3) PE50(S3) S1F0(S3) [...] acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.76 MHz, 06-56-03 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 8-way L2 cache, 9MB 64b/line 12-way L3 cache measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: count 14211444 14569397 tsc 38873326169 39063325035 usecs 9: 197660 Hz measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: cycles 357958 tsc 190001742: 188842 Hz measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: count 14939119 15297049 tsc 39259571275 39449557759 usecs 3: 187839 Hz measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: cycles 357955 tsc 18897: 186316 Hz measure_tsc_freq: indirect 
calibration with acpitimer0(1000), 3579545 Hz: count 15666102 16024022 tsc 39645448713 39835430133 usecs 0: 194200 Hz measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: cycles 357954 tsc 18157: 184223 Hz cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 65MHz delay_init: changing delay implementation: 0 -> 3000 cpu1 at mainbus0: apid 2 (application processor) cpu1: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.69 MHz, 06-56-03 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 8-way L2 cache, 9MB 64b/line 12-way L3 cache cpu1: smt 0, core 0, package 2
menulibre crashes after upgrading to 7.2
>Synopsis:	menulibre crashes after upgrading to 7.2
>Category:	desktop, gui
>Environment:
	System      : OpenBSD 7.2
	Details     : OpenBSD 7.2 (GENERIC.MP) #758: Tue Sep 27 11:57:54 MDT 2022
			 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

	Architecture: OpenBSD.amd64
	Machine     : amd64
	Information for inst:menulibre-2.3.0p0
>Description:
	I upgraded from 7.1 to 7.2 and ran the menulibre program.

	Traceback (most recent call last):
	  File "/usr/local/bin/menulibre", line 44, in <module>
	    import menulibre
	  File "/usr/local/lib/python3.9/site-packages/menulibre/__init__.py", line 23, in <module>
	    from menulibre import MenulibreApplication
	  File "/usr/local/lib/python3.9/site-packages/menulibre/MenulibreApplication.py", line 38, in <module>
	    from .util import escapeText, getCurrentDesktop, find_program, getProcessList
	ImportError: cannot import name 'getProcessList' from 'menulibre.util' (/usr/local/lib/python3.9/site-packages/menulibre/util.py)
>How-To-Repeat:
	Run menulibre.
>Fix:
	Unknown.
Re: 7.2: tsc timecounter running too fast on ESXi 7.5
On 26/10/2022 11:33, Scott Cheloha wrote:
> There might be a second workaround.  Kalabic mentions here in the
> other thread about this problem:
>
> https://marc.info/?l=openbsd-bugs=14949825616=2
>
> ... that changing the ESXi option "Guest OS Version" from "FreeBSD
> (32-bit)" to "FreeBSD (64-bit)" seemed to fix the problem on his
> version of ESXi.  Does that work for you?  I don't know what the other
> consequences of that configuration change are, but it might be worth a
> try if you prefer to run 7.2-RELEASE or 7.2-STABLE instead of patching
> -current.
>
> Do you have VMware support?  Is there any way for you to report this
> problem to them?  It's unlikely they explicitly support running an
> OpenBSD guest, but it's plausible this issue could affect other
> operating systems.  I can't imagine OpenBSD is reading the ACPI PM
> timer differently than Linux or FreeBSD.

Maybe related or not, but there is an official paper from VMware that
describes several known timekeeping issues and how to correct or work
around them:

https://www.cse.psu.edu/~buu1/teaching/spring06/papers/vmware-timing.pdf

Also, pardon my ignorance about TSC counters and related stuff, but
just looking at the FreeBSD code, it seems to take into account the
fact that it is running as a hypervisor guest (ESXi or Xen):

https://github.com/freebsd/freebsd-src/blob/main/sys/x86/x86/tsc.c

Is there a detail that makes a difference when a different "Guest OS
Version" is used?  Note that I have no idea what is happening there.

So, just as some AMD-related improvements for the TSC were introduced
into OpenBSD recently, maybe this issue can be properly solved only by
doing something similar for guests running on a hypervisor?

I have found that similar issues were previously reported for FreeBSD
and other virtual machines:

- "Time drift/system clock too fast on a PFSense VM":
  https://forum.netgate.com/topic/108653/time-drift-system-clock-too-fast-on-a-pfsense-vm

- "Clock on ADC VPX hosted on VMware is running very fast causing
  exchange issues":
  https://support.citrix.com/article/CTX335923/clock-on-adc-vpx-hosted-on-vmware-is-running-very-fast-causing-exchange-issues

- ... and more can easily be googled.
Re: 7.2: tsc timecounter running too fast on ESXi 7.5
On Tue, Oct 25, 2022 at 09:00:33PM -0700, James J. Lippard wrote: > On Tue, Oct 25, 2022 at 09:20:05PM -0500, Scott Cheloha wrote: > > On Tue, Oct 25, 2022 at 02:24:24PM -0700, James J. Lippard wrote: > > > I'm one of several people experiencing this issue with OpenBSD 7.2 on > > > VMware ESXi 7.5. Scott C. has given me help in trying to track the issue > > > down; a patched -current kernel to remove the acpi_delay code added in > > > 7.2 makes the issue go away. > > > > Thanks for your report. > > > > I have one more patch for you to try. Attached at the end. Hopefully > > it will confirm the root problem. Send the resulting dmesg and we'll > > see whether the problem is actually the acpitimer(4). > > >[...] > > Okay, here is the third patch. Revert the earlier one and boot this. > > Here's the dmesg output running with this new patch: Thank you for testing, let's take a look. > OpenBSD 7.2-current (GENERIC.MP) #1: Tue Oct 25 20:09:51 MST 2022 > lipp...@chaos.int.discord.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > [snip] > acpitimer0 at acpi0: 3579545 Hz, 24 bits > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat > cpu0 at mainbus0: apid 0 (boot processor) > cpu0: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.77 MHz, 06-56-03 > cpu0: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN > cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB > 64b/line 8-way L2 cache, 9MB 64b/line 12-way L3 cache > measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: > count 12840801 13198720 tsc 8350048970 8540029885 usecs 0: 189149 Hz > measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: > cycles 
357969 tsc 62919804: 629172553 Hz
> measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz:
> count 13562994 13686416 tsc 8608912502 8798895525 usecs 34479: (failed)
> measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz:
> count 13692684 14050605 tsc 880961 8992204988 usecs 0: 1900010271 Hz
> measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz:
> cycles 357969 tsc 64754894: 647522710 Hz

When we do "indirect calibration," we're using delay(9) to spin for
~100,000 microseconds in between reads of the reference timer and the
TSC.  In this case, the underlying delay(9) implementation is
i8254_delay().

This method calibrates the TSC to 1900 MHz.  For example, in the first
indirect calibration round we get:

	(tsc2 - tsc1) * acpitimer-frequency / (acpitimer2 - acpitimer1)
	= (8540029885 - 8350048970) * 3579545 / (13198720 - 12840801)
	= 1899997581

or roughly 1900 MHz.  The result printed in the dmesg (189149) is a
little different because the math in the kernel is a little different.
The third indirect calibration round yields basically the same result
(1900010271).

When we do "direct calibration," we're reading the reference timer
itself repeatedly to spin for ~100,000 microseconds and accumulating a
count of reference timer cycles and TSC cycles as we spin.

This method calibrates the TSC to ~630 MHz.  For example, in the first
direct calibration round we get:

	tsc-cycles * acpitimer-frequency / acpitimer-cycles
	= 62919804 * 3579545 / 357969
	= 629172553

or roughly 630 MHz.  The second direct calibration round yields a
similar result (647522710).

Based on these numbers, I think the virtual ACPI PM Timer on this ESXi
VM accelerates beyond 3579545 Hz when it is read repeatedly and then
decelerates back down to the nominal frequency when it is read less
frequently.

I don't think the TSC itself has a non-constant frequency.
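
The two calculations above can be reproduced with a short script. This is only a sketch: the function names are made up for illustration, it uses plain integer division rather than whatever arithmetic the kernel actually performs, and it ignores wraparound of the 24-bit ACPI PM timer (none of the samples used here wrap).

```python
ACPI_PM_HZ = 3579545  # nominal acpitimer0 frequency from the dmesg


def indirect_hz(tsc1, tsc2, ref1, ref2, ref_hz=ACPI_PM_HZ):
    """TSC frequency from two (TSC, reference timer) samples taken
    before and after spinning in delay(9)."""
    return (tsc2 - tsc1) * ref_hz // (ref2 - ref1)


def direct_hz(tsc_cycles, ref_cycles, ref_hz=ACPI_PM_HZ):
    """TSC frequency from cycle counts accumulated while polling the
    reference timer itself."""
    return tsc_cycles * ref_hz // ref_cycles


# First indirect and first direct calibration rounds from the dmesg.
indirect = indirect_hz(8350048970, 8540029885, 12840801, 13198720)
direct = direct_hz(62919804, 357969)

print(indirect)                      # 1899997581
print(direct)                        # 629172553
print(round(indirect / direct, 2))   # 3.02
```

If the TSC really runs at about 1900 MHz, the ratio suggests the virtual ACPI PM timer ticked at roughly three times its nominal rate while it was being polled. That reading is an inference from these two samples only.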
When we calibrate it later with the HPET, both indirect calibration using the local apic timer to spin and direct calibration using only the HPET yield a TSC frequency of ~1900 MHz: > [snip] > cpu0: apic clock running at 65MHz > delay_init: changing delay implementation: 0 -> 3000 (Here we switch from i8254_delay() to lapic_delay().) > [snip] > acpihpet0 at acpi0: 14318179 Hz > measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: > count 7984 1439544 tsc 11218877272 11408843078 usecs 99981: 1900019063 Hz > measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: > cycles 1431817 tsc 18744: 188634 Hz > measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: > count 2894172 4325743 tsc 11601869571 11791837035 usecs 99982: 1900016642 Hz > measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: > cycles 1431826 tsc 19912: 188371 Hz > measure_tsc_freq: indirect calibration with