Whoops, forgot about the split read problem. My mistake. Because 32-bit platforms cannot do bus_space_read_8 atomically, and i386 can use acpihpet(4), we can only safely use the lower 32 bits of the counter in acpihpet_delay() (unless we want two versions of acpihpet_delay()... which I don't).
Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles as we do in acpihpet_delay(). Unlike acpitimer(4), the HPET is a 64-bit counter so we don't need to mask the difference between val1 and val2. ok? Index: acpihpet.c =================================================================== RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v retrieving revision 1.28 diff -u -p -r1.28 acpihpet.c --- acpihpet.c 25 Aug 2022 18:01:54 -0000 1.28 +++ acpihpet.c 28 Aug 2022 02:26:07 -0000 @@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s void acpihpet_delay(int usecs) { - uint64_t c, s; + uint64_t count = 0, cycles; struct acpihpet_softc *sc = hpet_timecounter.tc_priv; + uint32_t val1, val2; - s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); - c = usecs * hpet_timecounter.tc_frequency / 1000000; - while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c) + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); + cycles = usecs * hpet_timecounter.tc_frequency / 1000000; + while (count < cycles) { CPU_BUSY_CYCLE(); + val1 = val2; + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, + HPET_MAIN_COUNTER); + count += val2 - val1; + } } u_int