On Tue, Aug 16, 2022 at 11:53:51AM -0500, Scott Cheloha wrote:
> On Sun, Aug 14, 2022 at 11:24:37PM -0500, Scott Cheloha wrote:
> >
> > In the future when the LAPIC timer is run in oneshot mode there will
> > be no lapic_delay().
> >
> > [...]
> >
> > This is *very* bad for older amd64 machines, because you are left with
> > i8254_delay().
> >
> > I would like to offer a less awful delay(9) implementation for this
> > class of hardware.  Otherwise we may trip over bizarre phantom bugs on
> > MP kernels because only one CPU can read the i8254 at a time.
> >
> > [...]
> >
> > Real i386 hardware should be fine.  Later models with an ACPI PM timer
> > will be fine using acpitimer_delay() instead of i8254_delay().
> >
> > [...]
> >
> > Here are the sample measurements from my 2017 laptop (kaby lake
> > refresh) running the attached patch.  It takes longer than a
> > microsecond to read either of the ACPI timers.  The PM timer is better
> > than the HPET.  The HPET is a bit better than the i8254.  I hope the
> > numbers are a little better on older hardware.
> >
> > acpitimer_test_delay: expected 0.000001000 actual 0.000010638 error 0.000009638
> > acpitimer_test_delay: expected 0.000010000 actual 0.000015464 error 0.000005464
> > acpitimer_test_delay: expected 0.000100000 actual 0.000107619 error 0.000007619
> > acpitimer_test_delay: expected 0.001000000 actual 0.001007275 error 0.000007275
> > acpitimer_test_delay: expected 0.010000000 actual 0.010007891 error 0.000007891
> >
> > acpihpet_test_delay: expected 0.000001000 actual 0.000022208 error 0.000021208
> > acpihpet_test_delay: expected 0.000010000 actual 0.000031690 error 0.000021690
> > acpihpet_test_delay: expected 0.000100000 actual 0.000112647 error 0.000012647
> > acpihpet_test_delay: expected 0.001000000 actual 0.001021480 error 0.000021480
> > acpihpet_test_delay: expected 0.010000000 actual 0.010013736 error 0.000013736
> >
> > i8254_test_delay: expected 0.000001000 actual 0.000040110 error 0.000039110
> > i8254_test_delay: expected 0.000010000 actual 0.000039471 error 0.000029471
> > i8254_test_delay: expected 0.000100000 actual 0.000128031 error 0.000028031
> > i8254_test_delay: expected 0.001000000 actual 0.001024586 error 0.000024586
> > i8254_test_delay: expected 0.010000000 actual 0.010021859 error 0.000021859
>
> Attached is an updated patch.  I left the test measurement code in
> place because I would like to see a test on a real i386 machine, just
> to make sure it works as expected.  I can't imagine why it wouldn't
> work, but we should never assume anything.
>
> Changes from v1:
>
> - Actually set delay_func from acpitimerattach() and
>   acpihpet_attach().
>
>   I think it's safe to assume, on real hardware, that the ACPI PMT is
>   preferable to the i8254 and the HPET is preferable to both of them.
>
>   This is not *always* true, but it is true on the older machines that
>   can't use tsc_delay(), so the assumption works in practice.
>
>   Outside of those three timers, the hierarchy gets murky.  There are
>   other timers that are better than the HPET, but they aren't always
>   available.  If those timers are already providing delay_func this
>   code does not usurp them.
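
For reference, the promotion order described in the quoted text can be
sketched in userland as below.  This is only an illustration under
stated assumptions: the *_stub functions and the pmt_attach() and
hpet_attach() helpers are hypothetical stand-ins, not the kernel code.

```c
#include <stddef.h>

typedef void (*delay_fn)(int);

/* Hypothetical stand-ins for the kernel's delay(9) implementations. */
static void i8254_delay_stub(int usecs) { (void)usecs; }
static void acpitimer_delay_stub(int usecs) { (void)usecs; }
static void acpihpet_delay_stub(int usecs) { (void)usecs; }
static void tsc_delay_stub(int usecs) { (void)usecs; }

/* Stand-in for the kernel's delay_func pointer; the i8254 is the default. */
static delay_fn delay_func = i8254_delay_stub;

/* PM timer attach: only take over from the i8254. */
static void
pmt_attach(void)
{
	if (delay_func == i8254_delay_stub)
		delay_func = acpitimer_delay_stub;
}

/* HPET attach: take over from the i8254 or the PM timer, nothing else. */
static void
hpet_attach(void)
{
	if (delay_func == i8254_delay_stub ||
	    delay_func == acpitimer_delay_stub)
		delay_func = acpihpet_delay_stub;
}
```

With these rules the attach order does not matter, and a better
implementation such as the TSC is left alone once it is installed.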
As I understand it, you want lapic to be in one-shot mode for something
along the lines of tickless.  So you are trying to find MP machines
where TSC is not useable for delay?

TSC is only considered for delay if the invariant and constant flags
are set.

invariant:
"In the Core i7 and future processor generations, the TSC will continue
to run in the deepest C-states. Therefore, the TSC will run at a
constant rate in all ACPI P-, C-. and T-states. Support for this feature
is indicated by CPUID.0x8000_0007.EDX[8]. On processors with invariant
TSC support, the OS may use the TSC for wall clock timer services
(instead of ACPI or HPET timers). TSC reads are much more efficient and
do not incur the overhead associated with a ring transition or access to
a platform resource."

constant:
runs at a constant rate across frequency/P state changes

Intel
constant (family == 0x0f && model >= 0x03) ||
    (family == 0x06 && model >= 0x0e)

family 0x06 model 0x0e is yonah, core solo/duo
Intel CPUID doc has
"Intel Core Duo processor, Intel Core Solo processor, model 0Eh.
All processors are manufactured using the 65 nm process."

family 0x0f model 0x03
"Pentium 4 processor, Intel Xeon processor, Intel Celeron D processor.
All processors are model 03h and manufactured using the 90 nm process."

VIA
constant model >= 0x0f
model 0x0f is Nano

AMD
constant CPUIDEDX_ITSC set
invariant CPUIDEDX_ITSC set

>
> - Duplicate test measurement code from amd64/lapic.c into i386/lapic.c.
>   Will be removed in the committed version.
>
> - Use bus_space_read_8() in acpihpet.c if it's available.  The HPET is
>   a 64-bit counter and the spec permits 32-bit or 64-bit aligned access.
>
>   As one might predict, this cuts the overhead in half because we're
>   doing half as many reads.
>
>   This part can go into a separate commit, but I thought it was neat
>   so I'm including it here.
This should probably use __LP64__ as if_xge.c does

>
> One remaining question I have:
>
> Is there a nice way to test whether ACPI PMT support is compiled into
> the kernel?  We can assume the existence of i8254_delay() because
> clock.c is required on amd64 and i386.  However, acpitimer.c is
> optional, so acpitimer_delay() isn't necessarily there.
>
> I would rather not introduce a hard requirement on acpitimer.c into
> acpihpet.c if there's an easy way to check for the latter.
>
> Any ideas?

the normal way would be to add "needs-flag"
then config(8) will create an acpitimer.h with NACPITIMER
see files.conf(5)

As it stands RAMDISK does not have acpitimer, so with your diff those
kernels do not link.

Index: files.acpi
===================================================================
RCS file: /cvs/src/sys/dev/acpi/files.acpi,v
retrieving revision 1.64
diff -u -p -r1.64 files.acpi
--- files.acpi	29 Dec 2021 18:40:19 -0000	1.64
+++ files.acpi	17 Aug 2022 02:30:32 -0000
@@ -13,7 +13,7 @@ file	dev/acpi/acpidebug.c		acpi & ddb
 # ACPI timer
 device	acpitimer
 attach	acpitimer at acpi
-file	dev/acpi/acpitimer.c		acpitimer
+file	dev/acpi/acpitimer.c		acpitimer needs-flag
 
 # AC device
 device	acpiac

>
> Anyone have i386 hardware results?  If I'm reading the timeline right,
> most P6 machines and beyond (NetBurst, etc) will have an ACPI PMT.  I
> don't know if any real x86 motherboards shipped with an HPET, but it's
> possible.
by "real x86" you mean machines that can't run amd64 I gather

I'm sure i386 releases are built on a machine with HPET

an i386 only machine without HPET is the x40

cpu0: Intel(R) Pentium(R) M processor 1200MHz ("GenuineIntel" -class) 1.20 GHz, 06-09-05
cpu0: FPU,V86,DE,PSE,TSC,MSR,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,TM,PBE,EST,TM2,PERF,MELTDOWN
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpitimer_test_delay: expected 0.000001000 actual 0.000001000 error 0.000000000
acpitimer_test_delay: expected 0.000010000 actual 0.000001000 error -1.999991000
acpitimer_test_delay: expected 0.000100000 actual 0.000001000 error -1.999901000
acpitimer_test_delay: expected 0.001000000 actual 0.000001000 error -1.999001000
acpitimer_test_delay: expected 0.010000000 actual 0.000001000 error -1.990001000
acpihpet_test_delay: (no hpet attached)
acpihpet_test_delay: (no hpet attached)
acpihpet_test_delay: (no hpet attached)
acpihpet_test_delay: (no hpet attached)
acpihpet_test_delay: (no hpet attached)
i8254_test_delay: expected 0.000001000 actual 0.000001000 error 0.000000000
i8254_test_delay: expected 0.000010000 actual 0.000001000 error -1.999991000
i8254_test_delay: expected 0.000100000 actual 0.000001000 error -1.999901000
i8254_test_delay: expected 0.001000000 actual 0.000001000 error -1.999001000
i8254_test_delay: expected 0.010000000 actual 0.000001000 error -1.990001000

kern.timecounter.hardware=acpitimer0
kern.timecounter.choice=i8254(0) acpitimer0(1000)

>
> Here are my updated results with the bus_space_read_8 change:
>
> acpitimer_test_delay: expected 0.000001000 actual 0.000010607 error 0.000009607
> acpitimer_test_delay: expected 0.000010000 actual 0.000015491 error 0.000005491
> acpitimer_test_delay: expected 0.000100000 actual 0.000107734 error 0.000007734
> acpitimer_test_delay: expected 0.001000000 actual 0.001008006 error 0.000008006
> acpitimer_test_delay: expected 0.010000000 actual 0.010007042 error 0.000007042
>
> acpihpet_test_delay: expected 0.000001000 actual 0.000013282 error 0.000012282
> acpihpet_test_delay: expected 0.000010000 actual 0.000022743 error 0.000012743
> acpihpet_test_delay: expected 0.000100000 actual 0.000109826 error 0.000009826
> acpihpet_test_delay: expected 0.001000000 actual 0.001012149 error 0.000012149
> acpihpet_test_delay: expected 0.010000000 actual 0.010010841 error 0.000010841
>
> i8254_test_delay: expected 0.000001000 actual 0.000039767 error 0.000038767
> i8254_test_delay: expected 0.000010000 actual 0.000039490 error 0.000029490
> i8254_test_delay: expected 0.000100000 actual 0.000127800 error 0.000027800
> i8254_test_delay: expected 0.001000000 actual 0.001023940 error 0.000023940
> i8254_test_delay: expected 0.010000000 actual 0.010032127 error 0.000032127
>
> And the patch:
>
> Index: arch/amd64/amd64/lapic.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
> retrieving revision 1.60
> diff -u -p -r1.60 lapic.c
> --- arch/amd64/amd64/lapic.c	15 Aug 2022 04:17:50 -0000	1.60
> +++ arch/amd64/amd64/lapic.c	16 Aug 2022 16:09:56 -0000
> @@ -466,6 +466,19 @@ lapic_initclocks(void)
>  	lapic_startclock();
>
>  	i8254_inittimecounter_simple();
> +
> +	extern void acpitimer_test_delay(int);
> +	extern void acpihpet_test_delay(int);
> +	extern void i8254_test_delay(int);
> +	int usec[] = { 1, 10, 100, 1000, 10000 };
> +	size_t i;
> +	delay(20000);	/* wait for real timecounter to activate */
> +	for (i = 0; i < nitems(usec); i++)
> +		acpitimer_test_delay(usec[i]);
> +	for (i = 0; i < nitems(usec); i++)
> +		acpihpet_test_delay(usec[i]);
> +	for (i = 0; i < nitems(usec); i++)
> +		i8254_test_delay(usec[i]);
>  }
>
>
> Index: arch/amd64/isa/clock.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/isa/clock.c,v
> retrieving revision 1.36
> diff -u -p -r1.36 clock.c
> --- arch/amd64/isa/clock.c	13 Feb 2022 19:15:09 -0000	1.36
> +++ arch/amd64/isa/clock.c	16 Aug 2022 16:09:56 -0000
> @@ -266,6 +266,22 @@ i8254_delay(int n)
>  }
>
>  void
> +i8254_test_delay(int usecs)
> +{
> +	struct timespec ac, er, ex, t0, t1;
> +
> +	nanouptime(&t0);
> +	i8254_delay(usecs);
> +	nanouptime(&t1);
> +	timespecsub(&t1, &t0, &ac);
> +	NSEC_TO_TIMESPEC(usecs * 1000ULL, &ex);
> +	timespecsub(&ac, &ex, &er);
> +	printf("%s: expected %lld.%09ld actual %lld.%09ld error %lld.%09ld\n",
> +	    __func__, ex.tv_sec, ex.tv_nsec, ac.tv_sec, ac.tv_nsec,
> +	    er.tv_sec, er.tv_nsec);
> +}
> +
> +void
>  rtcdrain(void *v)
>  {
>  	struct timeout *to = (struct timeout *)v;
> Index: arch/i386/i386/lapic.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/i386/i386/lapic.c,v
> retrieving revision 1.49
> diff -u -p -r1.49 lapic.c
> --- arch/i386/i386/lapic.c	15 Aug 2022 04:17:50 -0000	1.49
> +++ arch/i386/i386/lapic.c	16 Aug 2022 16:09:56 -0000
> @@ -281,6 +281,19 @@ lapic_initclocks(void)
>  	lapic_startclock();
>
>  	i8254_inittimecounter_simple();
> +
> +	extern void acpitimer_test_delay(int);
> +	extern void acpihpet_test_delay(int);
> +	extern void i8254_test_delay(int);
> +	int usec[] = { 1, 10, 100, 1000, 10000 };
> +	size_t i;
> +	delay(20000);	/* wait for real timecounter to activate */
> +	for (i = 0; i < nitems(usec); i++)
> +		acpitimer_test_delay(usec[i]);
> +	for (i = 0; i < nitems(usec); i++)
> +		acpihpet_test_delay(usec[i]);
> +	for (i = 0; i < nitems(usec); i++)
> +		i8254_test_delay(usec[i]);
>  }
>
>  extern int gettick(void);	/* XXX put in header file */
> Index: arch/i386/isa/clock.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/i386/isa/clock.c,v
> retrieving revision 1.60
> diff -u -p -r1.60 clock.c
> --- arch/i386/isa/clock.c	23 Feb 2021 04:44:30 -0000	1.60
> +++ arch/i386/isa/clock.c	16 Aug 2022 16:09:56 -0000
> @@ -375,6 +375,22 @@ i8254_delay(int n)
>  	}
>  }
>
> +void
> +i8254_test_delay(int usecs)
> +{
> +	struct timespec ac, er, ex, t0, t1;
> +
> +	nanouptime(&t0);
> +	i8254_delay(usecs);
> +	nanouptime(&t1);
> +	timespecsub(&t1, &t0, &ac);
> +	NSEC_TO_TIMESPEC(usecs * 1000ULL, &ex);
> +	timespecsub(&ac, &ex, &er);
> +	printf("%s: expected %lld.%09ld actual %lld.%09ld error %lld.%09ld\n",
> +	    __func__, ex.tv_sec, ex.tv_nsec, ac.tv_sec, ac.tv_nsec,
> +	    er.tv_sec, er.tv_nsec);
> +}
> +
>  int
>  calibrate_cyclecounter_ctr(void)
>  {
> Index: dev/acpi/acpitimer.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/acpi/acpitimer.c,v
> retrieving revision 1.15
> diff -u -p -r1.15 acpitimer.c
> --- dev/acpi/acpitimer.c	6 Apr 2022 18:59:27 -0000	1.15
> +++ dev/acpi/acpitimer.c	16 Aug 2022 16:09:56 -0000
> @@ -18,6 +18,7 @@
>  #include <sys/param.h>
>  #include <sys/systm.h>
>  #include <sys/device.h>
> +#include <sys/stdint.h>
>  #include <sys/timetc.h>
>
>  #include <machine/bus.h>
> @@ -25,10 +26,13 @@
>  #include <dev/acpi/acpireg.h>
>  #include <dev/acpi/acpivar.h>
>
> +struct acpitimer_softc;
> +
>  int acpitimermatch(struct device *, void *, void *);
>  void acpitimerattach(struct device *, struct device *, void *);
> -
> +void acpitimer_delay(int);
>  u_int acpi_get_timecount(struct timecounter *tc);
> +uint32_t acpitimer_read(struct acpitimer_softc *);
>
>  static struct timecounter acpi_timecounter = {
>  	.tc_get_timecount = acpi_get_timecount,
> @@ -56,6 +60,8 @@ struct cfdriver acpitimer_cd = {
>  	NULL, "acpitimer", DV_DULL
>  };
>
> +int acpitimer_attached;
> +
>  int
>  acpitimermatch(struct device *parent, void *match, void *aux)
>  {
> @@ -98,18 +104,46 @@ acpitimerattach(struct device *parent, s
>  	acpi_timecounter.tc_priv = sc;
>  	acpi_timecounter.tc_name = sc->sc_dev.dv_xname;
>  	tc_init(&acpi_timecounter);
> +
> +#if defined(__amd64__) || defined(__i386__)
> +	if (delay_func == i8254_delay)
> +		delay_func = acpitimer_delay;
> +#endif
>  #if defined(__amd64__)
>  	extern void cpu_recalibrate_tsc(struct timecounter *);
>  	cpu_recalibrate_tsc(&acpi_timecounter);
>  #endif
> +	acpitimer_attached = 1;
>  }
>
> +void
> +acpitimer_delay(int usecs)
> +{
> +	uint64_t count = 0, cycles;
> +	struct acpitimer_softc *sc = acpi_timecounter.tc_priv;
> +	uint32_t mask = acpi_timecounter.tc_counter_mask;
> +	uint32_t val1, val2;
> +
> +	val2 = acpitimer_read(sc);
> +	cycles = usecs * acpi_timecounter.tc_frequency / 1000000;
> +	while (count < cycles) {
> +		CPU_BUSY_CYCLE();
> +		val1 = val2;
> +		val2 = acpitimer_read(sc);
> +		count += (val2 - val1) & mask;
> +	}
> +}
>
>  u_int
>  acpi_get_timecount(struct timecounter *tc)
>  {
> -	struct acpitimer_softc *sc = tc->tc_priv;
> -	u_int u1, u2, u3;
> +	return acpitimer_read(tc->tc_priv);
> +}
> +
> +uint32_t
> +acpitimer_read(struct acpitimer_softc *sc)
> +{
> +	uint32_t u1, u2, u3;
>
>  	u2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, 0);
>  	u3 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, 0);
> @@ -120,4 +154,25 @@ acpi_get_timecount(struct timecounter *t
>  	} while (u1 > u2 || u2 > u3);
>
>  	return (u2);
> +}
> +
> +void
> +acpitimer_test_delay(int usecs)
> +{
> +	struct timespec ac, er, ex, t0, t1;
> +
> +	if (!acpitimer_attached) {
> +		printf("%s: (no pmt attached)\n", __func__);
> +		return;
> +	}
> +
> +	nanouptime(&t0);
> +	acpitimer_delay(usecs);
> +	nanouptime(&t1);
> +	timespecsub(&t1, &t0, &ac);
> +	NSEC_TO_TIMESPEC(usecs * 1000ULL, &ex);
> +	timespecsub(&ac, &ex, &er);
> +	printf("%s: expected %lld.%09ld actual %lld.%09ld error %lld.%09ld\n",
> +	    __func__, ex.tv_sec, ex.tv_nsec, ac.tv_sec, ac.tv_nsec,
> +	    er.tv_sec, er.tv_nsec);
>  }
> Index: dev/acpi/acpihpet.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 acpihpet.c
> --- dev/acpi/acpihpet.c	6 Apr 2022 18:59:27 -0000	1.26
> +++ dev/acpi/acpihpet.c	16 Aug 2022 16:09:56 -0000
> @@ -18,6 +18,7 @@
>  #include <sys/param.h>
>  #include <sys/systm.h>
>  #include <sys/device.h>
> +#include <sys/stdint.h>
>  #include <sys/timetc.h>
>
>  #include <machine/bus.h>
> @@ -31,7 +32,7 @@ int acpihpet_attached;
>  int acpihpet_match(struct device *, void *, void *);
>  void acpihpet_attach(struct device *, struct device *, void *);
>  int acpihpet_activate(struct device *, int);
> -
> +void acpihpet_delay(int);
>  u_int acpihpet_gettime(struct timecounter *tc);
>
>  uint64_t acpihpet_r(bus_space_tag_t _iot, bus_space_handle_t _ioh,
> @@ -84,20 +85,28 @@ struct cfdriver acpihpet_cd = {
>  uint64_t
>  acpihpet_r(bus_space_tag_t iot, bus_space_handle_t ioh, bus_size_t ioa)
>  {
> +#ifdef bus_space_read_8
> +	return bus_space_read_8(iot, ioh, ioa);
> +#else
>  	uint64_t val;
>
>  	val = bus_space_read_4(iot, ioh, ioa + 4);
>  	val = val << 32;
>  	val |= bus_space_read_4(iot, ioh, ioa);
>  	return (val);
> +#endif
>  }
>
>  void
>  acpihpet_w(bus_space_tag_t iot, bus_space_handle_t ioh, bus_size_t ioa,
>      uint64_t val)
>  {
> +#ifdef bus_space_write_8
> +	bus_space_write_8(iot, ioh, ioa, val);
> +#else
>  	bus_space_write_4(iot, ioh, ioa + 4, val >> 32);
>  	bus_space_write_4(iot, ioh, ioa, val & 0xffffffff);
> +#endif
>  }
>
>  int
> @@ -262,10 +271,20 @@ acpihpet_attach(struct device *parent, s
>  	freq = 1000000000000000ull / period;
>  	printf(": %lld Hz\n", freq);
>
> -	hpet_timecounter.tc_frequency = (uint32_t)freq;
> +	hpet_timecounter.tc_frequency = freq;
>  	hpet_timecounter.tc_priv = sc;
>  	hpet_timecounter.tc_name = sc->sc_dev.dv_xname;
>  	tc_init(&hpet_timecounter);
> +
> +#if defined(__amd64__) || defined(__i386__)
> +	if (delay_func == i8254_delay)
> +		delay_func = acpihpet_delay;
> +	/* XXX what if the kernel has no acpitimer support? */
> +	extern void acpitimer_delay(int);
> +	if (delay_func == acpitimer_delay)
> +		delay_func = acpihpet_delay;
> +#endif
> +
>  #if defined(__amd64__)
>  	extern void cpu_recalibrate_tsc(struct timecounter *);
>  	cpu_recalibrate_tsc(&hpet_timecounter);
> @@ -273,10 +292,43 @@ acpihpet_attach(struct device *parent, s
>  	acpihpet_attached++;
>  }
>
> +void
> +acpihpet_delay(int usecs)
> +{
> +	uint64_t c, s;
> +	struct acpihpet_softc *sc = hpet_timecounter.tc_priv;
> +
> +	s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
> +	c = usecs * hpet_timecounter.tc_frequency / 1000000;
> +	while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c)
> +		CPU_BUSY_CYCLE();
> +}
> +
>  u_int
>  acpihpet_gettime(struct timecounter *tc)
>  {
>  	struct acpihpet_softc *sc = tc->tc_priv;
>
>  	return (bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER));
> +}
> +
> +void
> +acpihpet_test_delay(int usecs)
> +{
> +	struct timespec ac, er, ex, t0, t1;
> +
> +	if (!acpihpet_attached) {
> +		printf("%s: (no hpet attached)\n", __func__);
> +		return;
> +	}
> +
> +	nanouptime(&t0);
> +	acpihpet_delay(usecs);
> +	nanouptime(&t1);
> +	timespecsub(&t1, &t0, &ac);
> +	NSEC_TO_TIMESPEC(usecs * 1000ULL, &ex);
> +	timespecsub(&ac, &ex, &er);
> +	printf("%s: expected %lld.%09ld actual %lld.%09ld error %lld.%09ld\n",
> +	    __func__, ex.tv_sec, ex.tv_nsec, ac.tv_sec, ac.tv_nsec,
> +	    er.tv_sec, er.tv_nsec);
>  }
>
>
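
For what it's worth, the arithmetic at the core of the quoted
acpitimer_delay() can be tried in userland.  A minimal sketch, assuming
only the 3579545 Hz, 24-bit PM timer shown in the x40 dmesg;
usecs_to_ticks() and ticks_elapsed() are names made up for this example
and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a delay in microseconds to counter ticks at freq Hz. */
static uint64_t
usecs_to_ticks(int usecs, uint64_t freq)
{
	assert(usecs >= 0);
	return (uint64_t)usecs * freq / 1000000;
}

/*
 * Ticks elapsed between two reads of a free-running counter that wraps
 * at mask + 1 (mask is 0x00ffffff for a 24-bit counter).  The unsigned
 * subtraction followed by the mask gives the right answer across one
 * wrap of the counter.
 */
static uint32_t
ticks_elapsed(uint32_t prev, uint32_t cur, uint32_t mask)
{
	return (cur - prev) & mask;
}
```

At 3579545 Hz a 1 usec delay works out to only 3 ticks, and the masked
difference stays correct across the 24-bit wrap as long as fewer than
2^24 ticks pass between two reads.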