On Wed, Aug 17, 2022 at 12:37:52AM -0500, Scott Cheloha wrote: > On Wed, Aug 17, 2022 at 02:28:14PM +1000, Jonathan Gray wrote: > > On Tue, Aug 16, 2022 at 11:53:51AM -0500, Scott Cheloha wrote: > > > On Sun, Aug 14, 2022 at 11:24:37PM -0500, Scott Cheloha wrote: > > > > > > > > In the future when the LAPIC timer is run in oneshot mode there will > > > > be no lapic_delay(). > > > > > > > > [...] > > > > > > > > This is *very* bad for older amd64 machines, because you are left with > > > > i8254_delay(). > > > > > > > > I would like to offer a less awful delay(9) implementation for this > > > > class of hardware. Otherwise we may trip over bizarre phantom bugs on > > > > MP kernels because only one CPU can read the i8254 at a time. > > > > > > > > [...] > > > > > > > > Real i386 hardware should be fine. Later models with an ACPI PM timer > > > > will be fine using acpitimer_delay() instead of i8254_delay(). > > > > > > > > [...] > > > > > > > > Here are the sample measurements from my 2017 laptop (kaby lake > > > > refresh) running the attached patch. It takes longer than a > > > > microsecond to read either of the ACPI timers. The PM timer is better > > > > than the HPET. The HPET is a bit better than the i8254. I hope the > > > > numbers are a little better on older hardware. > > > > > > > > acpitimer_test_delay: expected 0.000001000 actual 0.000010638 > > > > error 0.000009638 > > > > acpitimer_test_delay: expected 0.000010000 actual 0.000015464 > > > > error 0.000005464 > > > > acpitimer_test_delay: expected 0.000100000 actual 0.000107619 > > > > error 0.000007619 > > > > acpitimer_test_delay: expected 0.001000000 actual 0.001007275 > > > > error 0.000007275 > > > > acpitimer_test_delay: expected 0.010000000 actual 0.010007891 > > > > error 0.000007891 > > > > > > > > acpihpet_test_delay: expected 0.000001000 actual 0.000022208 > > > > error 0.000021208 > > > > acpihpet_test_delay: expected 0.000010000 actual 0.000031690 > > > > error 0.000021690 > > > > acpihpet_test_delay: expected 0.000100000 actual 0.000112647 > > > > error 0.000012647 > > > > acpihpet_test_delay: expected 0.001000000 actual 0.001021480 > > > > error 0.000021480 > > > > acpihpet_test_delay: expected 0.010000000 actual 0.010013736 > > > > error 0.000013736 > > > > > > > > i8254_test_delay: expected 0.000001000 actual 0.000040110 > > > > error 0.000039110 > > > > i8254_test_delay: expected 0.000010000 actual 0.000039471 > > > > error 0.000029471 > > > > i8254_test_delay: expected 0.000100000 actual 0.000128031 > > > > error 0.000028031 > > > > i8254_test_delay: expected 0.001000000 actual 0.001024586 > > > > error 0.000024586 > > > > i8254_test_delay: expected 0.010000000 actual 0.010021859 > > > > error 0.000021859 > > > > > > Attched is an updated patch. I left the test measurement code in > > > place because I would like to see a test on a real i386 machine, just > > > to make sure it works as expected. I can't imagine why it wouldn't > > > work, but we should never assume anything. > > > > > > Changes from v1: > > > > > > - Actually set delay_func from acpitimerattach() and > > > acpihpet_attach(). > > > > > > I think it's safe to assume, on real hardware, that the ACPI PMT is > > > preferable to the i8254 and the HPET is preferable to both of them. > > > > > > This is not *always* true, but it is true on the older machines that > > > can't use tsc_delay(), so the assumption works in practice. > > > > > > Outside of those three timers, the hierarchy gets murky. There are > > > other timers that are better than the HPET, but they aren't always > > > available. If those timers are already providing delay_func this > > > code does not usurp them. > > > > As I understand it, you want lapic to be in one-shot mode for something > > along the lines of tickless. > > Yes. > > Although "tickless" is a misnomer. > > > So you are trying to find MP machines > > where TSC is not useable for delay? > > Right. Those are the only machines where it's relevant to consider > the accuracy of acpitimer_delay() or acpihpet_delay()... unless I've > forgotten something. > > > TSC is only considered for delay if the invariant and constant flags > > are set. > > invariant: > > "In the Core i7 and future processor generations, the TSC will continue > > to run in the deepest C-states. Therefore, the TSC will run at a > > constant rate in all ACPI P-, C-. and T-states. Support for this feature > > is indicated by CPUID.0x8000_0007.EDX[8]. On processors with invariant > > TSC support, the OS may use the TSC for wall clock timer services > > (instead of ACPI or HPET timers). TSC reads are much more efficient and > > do not incur the overhead associated with a ring transition or access to > > a platform resource." > > > > constant: > > runs at a constant rate across frequency/P state changes > > > > Intel constant > > (family == 0x0f && model >= 0x03) || (family == 0x06 && model >= 0x0e) > > > > family 0x06 model 0x0e is yonah, core solo/duo > > Intel CPUID doc has > > "Intel Core Duo processor, Intel Core Solo processor, model 0Eh. > > All processors are manufactured using the 65 nm process." > > > > family 0x0f model 0x03 > > "Pentium 4 processor, Intel Xeon processor, Intel Celeron D processor. > > All processors are model 03h and manufactured using the 90 nm process." > > > > VIA constant > > model >= 0x0f > > model 0x0f is Nano > > > > AMD constant > > CPUIDEDX_ITSC set > > > > invariant > > CPUIDEDX_ITSC set > > This matches my understanding of the situation. > > > > - Duplicate test measurement code from amd64/lapic.c into i386/lapic.c. > > > Will be removed in the committed version. > > > > > > - Use bus_space_read_8() in acpihpet.c if it's available. The HPET is > > > a 64-bit counter and the spec permits 32-bit or 64-bit aligned access. > > > > > > As one might predict, this cuts the overhead in half because we're > > > doing half as many reads. > > > > > > This part can go into a separate commit, but I thought it was neat > > > so I'm including it here. > > > > This should probably use __LP64__ as if_xge.c does > > Ack, I will do it that way in a subsequent patch. It seems a bit > indirect to do it that way though. > > > > One remaining question I have: > > > > > > Is there a nice way to test whether ACPI PMT support is compiled into > > > the kernel? We can assume the existence of i8254_delay() because > > > clock.c is required on amd64 and i386. However, acpitimer.c is a > > > optional, so acpitimer_delay() isn't necessarily there. > > > > > > I would rather not introduce a hard requirement on acpitimer.c into > > > acpihpet.c if there's an easy way to check for the latter. > > > > > > Any ideas? > > > > the normal way would be to add "needs-flag" > > then config(8) will create a acpitimer.h with NACPITIMER > > see files.conf(5) > > Ah! Perfect, thank you. > > > As it stands RAMDISK does not have acpitimer so with your diff those > > kernels do not link. > > > > Index: files.acpi > > =================================================================== > > RCS file: /cvs/src/sys/dev/acpi/files.acpi,v > > retrieving revision 1.64 > > diff -u -p -r1.64 files.acpi > > --- files.acpi 29 Dec 2021 18:40:19 -0000 1.64 > > +++ files.acpi 17 Aug 2022 02:30:32 -0000 > > @@ -13,7 +13,7 @@ file dev/acpi/acpidebug.c acpi & ddb > > # ACPI timer > > device acpitimer > > attach acpitimer at acpi > > -file dev/acpi/acpitimer.c acpitimer > > +file dev/acpi/acpitimer.c acpitimer needs-flag > > > > # AC device > > device acpiac > > Updated patch is attached below. The RAMDISK kernel now builds. > > > > Anyone have i386 hardware results? If I'm reading the timeline right, > > > most P6 machines and beyond (NetBurst, etc) will have an ACPI PMT. I > > > don't know if any real x86 motherboards shipped with an HPET, but it's > > > possible. > > > > by "real x86" you mean machines that can't run amd64 I gather > > Yes, that was my meaning. > > > I'm sure i386 releases are built on a machine with HPET > > > > an i386 only machine without HPET is the x40 > > > > cpu0: Intel(R) Pentium(R) M processor 1200MHz ("GenuineIntel" -class) 1.20 > > GHz, 06-09-05 > > cpu0: > > FPU,V86,DE,PSE,TSC,MSR,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,TM,PBE,EST,TM2,PERF,MELTDOWN > > mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges > > cpu0: apic clock running at 99MHz > > acpitimer0 at acpi0: 3579545 Hz, 24 bits > > acpitimer_test_delay: expected 0.000001000 actual 0.000001000 error > > 0.000000000 > > acpitimer_test_delay: expected 0.000010000 actual 0.000001000 error > > -1.999991000 > > acpitimer_test_delay: expected 0.000100000 actual 0.000001000 error > > -1.999901000 > > acpitimer_test_delay: expected 0.001000000 actual 0.000001000 error > > -1.999001000 > > acpitimer_test_delay: expected 0.010000000 actual 0.000001000 error > > -1.990001000 > > acpihpet_test_delay: (no hpet attached) > > acpihpet_test_delay: (no hpet attached) > > acpihpet_test_delay: (no hpet attached) > > acpihpet_test_delay: (no hpet attached) > > acpihpet_test_delay: (no hpet attached) > > i8254_test_delay: expected 0.000001000 actual 0.000001000 error 0.000000000 > > i8254_test_delay: expected 0.000010000 actual 0.000001000 error -1.999991000 > > i8254_test_delay: expected 0.000100000 actual 0.000001000 error -1.999901000 > > i8254_test_delay: expected 0.001000000 actual 0.000001000 error -1.999001000 > > i8254_test_delay: expected 0.010000000 actual 0.000001000 error -1.990001000 > > kern.timecounter.hardware=acpitimer0 > > kern.timecounter.choice=i8254(0) acpitimer0(1000) > > Thank you for testing. You have quite a collection. > > But those results mean I've misunderstood something. The negative > error values mean a real timecounter is ticking at this point on amd64 > but *not* on i386. Or at least, not on that machine. > > Spinning for 20,000 microseconds should be enough time for one > hardclock to run if interrupts are enabled and the lapic timer is > running. > > Hmmm.
should be NACPITIMER > 0 not NACPITIMER > 1 It seems to me it would be cleaner if the decision of what to use for delay could be moved into an md file. Or abstract it by having a numeric weight like timecounters or driver match return numbers. > > Index: acpitimer.c > =================================================================== > RCS file: /cvs/src/sys/dev/acpi/acpitimer.c,v > retrieving revision 1.15 > diff -u -p -r1.15 acpitimer.c > --- acpitimer.c 6 Apr 2022 18:59:27 -0000 1.15 > +++ acpitimer.c 17 Aug 2022 05:37:14 -0000 > @@ -18,6 +18,7 @@ > #include <sys/param.h> > #include <sys/systm.h> > #include <sys/device.h> > +#include <sys/stdint.h> > #include <sys/timetc.h> > > #include <machine/bus.h> > @@ -25,10 +26,13 @@ > #include <dev/acpi/acpireg.h> > #include <dev/acpi/acpivar.h> > > +struct acpitimer_softc; > + > int acpitimermatch(struct device *, void *, void *); > void acpitimerattach(struct device *, struct device *, void *); > - > +void acpitimer_delay(int); > u_int acpi_get_timecount(struct timecounter *tc); > +uint32_t acpitimer_read(struct acpitimer_softc *); > > static struct timecounter acpi_timecounter = { > .tc_get_timecount = acpi_get_timecount, > @@ -98,18 +102,45 @@ acpitimerattach(struct device *parent, s > acpi_timecounter.tc_priv = sc; > acpi_timecounter.tc_name = sc->sc_dev.dv_xname; > tc_init(&acpi_timecounter); > + > +#if defined(__amd64__) || defined(__i386__) > + if (delay_func == i8254_delay) > + delay_func = acpitimer_delay; > +#endif > #if defined(__amd64__) > extern void cpu_recalibrate_tsc(struct timecounter *); > cpu_recalibrate_tsc(&acpi_timecounter); > #endif > } > > +void > +acpitimer_delay(int usecs) > +{ > + uint64_t count = 0, cycles; > + struct acpitimer_softc *sc = acpi_timecounter.tc_priv; > + uint32_t mask = acpi_timecounter.tc_counter_mask; > + uint32_t val1, val2; > + > + val2 = acpitimer_read(sc); > + cycles = usecs * acpi_timecounter.tc_frequency / 1000000; > + while (count < cycles) { > + CPU_BUSY_CYCLE(); > + val1 = val2; > + val2 = acpitimer_read(sc); > + count += (val2 - val1) & mask; > + } > +} > > u_int > acpi_get_timecount(struct timecounter *tc) > { > - struct acpitimer_softc *sc = tc->tc_priv; > - u_int u1, u2, u3; > + return acpitimer_read(tc->tc_priv); > +} > + > +uint32_t > +acpitimer_read(struct acpitimer_softc *sc) > +{ > + uint32_t u1, u2, u3; > > u2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, 0); > u3 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, 0); > Index: acpihpet.c > =================================================================== > RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v > retrieving revision 1.26 > diff -u -p -r1.26 acpihpet.c > --- acpihpet.c 6 Apr 2022 18:59:27 -0000 1.26 > +++ acpihpet.c 17 Aug 2022 05:37:14 -0000 > @@ -18,6 +18,7 @@ > #include <sys/param.h> > #include <sys/systm.h> > #include <sys/device.h> > +#include <sys/stdint.h> > #include <sys/timetc.h> > > #include <machine/bus.h> > @@ -26,12 +27,14 @@ > #include <dev/acpi/acpivar.h> > #include <dev/acpi/acpidev.h> > > +#include "acpitimer.h" > + > int acpihpet_attached; > > int acpihpet_match(struct device *, void *, void *); > void acpihpet_attach(struct device *, struct device *, void *); > int acpihpet_activate(struct device *, int); > - > +void acpihpet_delay(int); > u_int acpihpet_gettime(struct timecounter *tc); > > uint64_t acpihpet_r(bus_space_tag_t _iot, bus_space_handle_t _ioh, > @@ -262,15 +265,38 @@ acpihpet_attach(struct device *parent, s > freq = 1000000000000000ull / period; > printf(": %lld Hz\n", freq); > > - hpet_timecounter.tc_frequency = (uint32_t)freq; > + hpet_timecounter.tc_frequency = freq; > hpet_timecounter.tc_priv = sc; > hpet_timecounter.tc_name = sc->sc_dev.dv_xname; > tc_init(&hpet_timecounter); > + > +#if defined(__amd64__) || defined(__i386__) > + if (delay_func == i8254_delay) > + delay_func = acpihpet_delay; > +#if NACPITIMER > 1 > + extern void acpitimer_delay(int); > + if (delay_func == acpitimer_delay) > + delay_func = acpihpet_delay; > +#endif > +#endif /* defined(__amd64__) || defined(__i386__) */ > + > #if defined(__amd64__) > extern void cpu_recalibrate_tsc(struct timecounter *); > cpu_recalibrate_tsc(&hpet_timecounter); > #endif > acpihpet_attached++; > +} > + > +void > +acpihpet_delay(int usecs) > +{ > + uint64_t c, s; > + struct acpihpet_softc *sc = hpet_timecounter.tc_priv; > + > + s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); > + c = usecs * hpet_timecounter.tc_frequency / 1000000; > + while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c) > + CPU_BUSY_CYCLE(); > } > > u_int > Index: files.acpi > =================================================================== > RCS file: /cvs/src/sys/dev/acpi/files.acpi,v > retrieving revision 1.64 > diff -u -p -r1.64 files.acpi > --- files.acpi 29 Dec 2021 18:40:19 -0000 1.64 > +++ files.acpi 17 Aug 2022 05:37:14 -0000 > @@ -13,7 +13,7 @@ file dev/acpi/acpidebug.c acpi & ddb > # ACPI timer > device acpitimer > attach acpitimer at acpi > -file dev/acpi/acpitimer.c acpitimer > +file dev/acpi/acpitimer.c acpitimer needs-flag > > # AC device > device acpiac >