On Thu, Aug 17, 2017 at 12:19:28PM +0800, Adam Steen wrote:
> On Tue, Aug 8, 2017 at 10:12 PM, Mike Belopuhov <[email protected]> wrote:
> > On Tue, Aug 08, 2017 at 08:18 +0800, Adam Steen wrote:
> >> On Mon, Jul 31, 2017 at 3:58 PM, Mike Belopuhov <[email protected]> wrote:
> >> > On Mon, Jul 31, 2017 at 09:48 +0800, Adam Steen wrote:
> >> >> Ted Unangst wrote:
> >> >> > we don't currently export this info, but we could add some sysctls.
> >> >> > there's
> >> >> > some cpufeatures stuff there, but generally stuff isn't exported until
> >> >> > somebody finds a use for it... it shouldn't be too hard to add
> >> >> > something to
> >> >> > amd64/machdep.c sysctl if you're interested.
> >> >>
> >> >> I am interested, as i need the info, i will look into it and hopefully
> >> >> come back with a patch.
> >> >
> >> > This is a bad idea because TSC as the time source is only usable
> >> > by OpenBSD on Skylake and Kaby Lake CPUs since they encode the TSC
> >> > frequency in the CPUID. All older CPUs have their TSCs measured
> >> > against the PIT. Currently the measurement done by the kernel isn't
> >> > very precise and if TSC is selected as a timecounter, the machine
> >> > would be gaining time on a pace that cannot be corrected by our NTP
> >> > daemon. (IIRC, about an hour a day on my Haswell running with NTP).
> >> >
> >> > To be able to use TSC as a timecounter source on OpenBSD or Solo5
> >> > you'd have to improve the in-kernel measurement of the TSC frequency
> >> > first. I've tried to perform 10 measurements and take an average and
> >> > it does improve accuracy, however I believe we need to poach another
> >> > bit from Linux and re-calibrate TSC via HPET:
> >> >
> >> >
> >> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
> >> >
> >> > I think this is the most sane thing we can do. Here's a complete
> >> > procedure that Linux kernel undertakes:
> >> >
> >> >
> >> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
> >> >
> >> > Regards,
> >> > Mike
> >>
> >> Hi Mike/All
> >>
> >> I would like to improve the accuracy of TSC frequency calibration as
> >> Mike B. describes above.
> >>
> >> I initially thought the calibration would take place at line 470 of
> >> amd64/identcpu.c
> >> (https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/amd64/amd64/identcpu.c?annotate=1.87)
> >>
> >
> > Indeed, it cannot happen there simply because you don't know at
> > that point whether or not HPET actually exists.
> >
> >> But I looked into using the acpihpet directly but it is never exposed
> >> outside of acpihpet.c.
> >>
> >
> > And it shouldn't be.
> >
> >> Could someone point me to were if would be appropriate to complete
> >> this calibration and how to use the acpihpet?
> >
> > The way I envision this is a multi-step approach:
> >
> > 1) TSC frequency is approximated with the PIT (possibly performing
> > multiple measurements and averaging them out; also keep in mind that
> > doing it 8 times means you can shift the sum right by 3 instead of
> > using actual integer division). This is what should happen around
> > the line 470 of identcpu.c
> >
> > 2) A function can be provided by identcpu.c to further adjust the
> > TSC frequency once acpitimer(4) (this is a PM timer) and acpihpet(4)
> > (or any other timer for that matter) are attached.
> >
> > 3) Once acpitimer(4) or acpihpet(4) or any other timecounter source
> > are attached and are verified to be operating correctly, they can
> > perform TSC re-calibration and update the TSC frequency with their
> > measurements. The idea here is that the function (or functions) that
> > facilitate this must abstract enough logic so that you don't have to
> > duplicate it in the acpitimer or acpihpet themselves.
> >
> >> (Will it need to be
> >> exposed like i8254_delay/delay_func/delay in machdep.c and cpu.h)
> >>
> >
> > No it won't.
> >
> >> Lastly should the calibration be done using both delay(i8254 pit) and
> >> hpet timers similar to Linux described above or just using the hpet?
> >>
> >
> > Well, that's what I was arguing for. As I said in my initial mail
> > on misc (not quoted here), the TSC must be calibrated using separate
> > known clocks sources.
>
> Hi Mike
>
> Please see the below diff to improve the accuracy of the TSC
> frequency. It is model after the linux calibration you linked to
> earlier. https://marc.info/?l=openbsd-misc&m=150148792804747&w=2
>
> I feel like i don't know enough about the kernel internals, the
> consistency of the results across reboots are not as close as i would
> have liked, i feel the call to do the actual calibration should be
> later in the boot cycle, when things have calmed down a little, but
> couldn't figure out the best way of doing this.
>
> please bear with me i haven't been programming c for long, but the
> only way to get things done is to do it your self.
>
> Cheers
> Adam
<SNIP>
Thank you Mike on the feedback on the last patch, please see the diff
below, update with your input and style(9)
I have continued to use tsc as my timecounter and /var/db/ntpd.driff
has stayed under 10.
cat /var/db/ntpd.drift
6.665
ntpctl -s all
4/4 peers valid, constraint offset -1s, clock synced, stratum 3
peer
wt tl st next poll offset delay jitter
144.48.166.166 from pool pool.ntp.org
1 10 2 4s 32s -3.159ms 87.723ms 10.389ms
13.55.50.68 from pool pool.ntp.org
1 10 3 11s 32s -3.433ms 86.053ms 18.095ms
14.202.204.182 from pool pool.ntp.org
1 10 1 14s 32s 1.486ms 86.545ms 16.483ms
27.124.125.250 from pool pool.ntp.org
* 1 10 2 12s 30s -10.275ms 54.156ms 70.389ms
Cheers
Adam
Index: sys/arch/amd64/amd64/acpi_machdep.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/acpi_machdep.c,v
retrieving revision 1.78
diff -u -p -u -r1.78 acpi_machdep.c
--- sys/arch/amd64/amd64/acpi_machdep.c 27 Mar 2017 18:32:53 -0000 1.78
+++ sys/arch/amd64/amd64/acpi_machdep.c 23 Aug 2017 11:43:25 -0000
@@ -69,6 +69,17 @@ extern int acpi_savecpu(void) __returns_
u_int8_t *acpi_scan(struct acpi_mem_map *, paddr_t, size_t);
+#define RECALIBRATE_MAX_RETRIES 5
+#define RECALIBRATE_SMI_THRESHOLD 50000
+#define RECALIBRATE_DELAY_THRESHOLD 20
+struct timecounter *acpitimer_recalibrate_timecounter;
+struct timecounter *acpihpet_recalibrate_timecounter;
+uint64_t acpi_calculate_timecounter_delay(struct timecounter *, uint64_t,
+ uint64_t);
+uint64_t acpi_calculate_tsc_frequency(uint64_t, uint64_t, int);
+uint64_t acpi_calibrate_tsc_frequency();
+int acpi_get_tsc_and_timecount(struct timecounter *, uint64_t *, uint64_t *);
+
int
acpi_map(paddr_t pa, size_t len, struct acpi_mem_map *handle)
{
@@ -438,6 +449,123 @@ acpi_resume_cpu(struct acpi_softc *sc)
/* Re-initialise memory range handling on BSP */
if (mem_range_softc.mr_op != NULL)
mem_range_softc.mr_op->initAP(&mem_range_softc);
+}
+
+int
+acpi_get_tsc_and_timecount(struct timecounter *tc, uint64_t *tsc,
+ uint64_t *count)
+{
+ uint64_t n, tsc1, tsc2;
+ int i;
+
+ for (i = 0; i < RECALIBRATE_MAX_RETRIES; i++) {
+ tsc1 = rdtsc();
+ n = (tc->tc_get_timecount(tc) & tc->tc_counter_mask);
+ tsc2 = rdtsc();
+
+ if ((tsc2 - tsc1) < RECALIBRATE_SMI_THRESHOLD) {
+ *count = n;
+ *tsc = tsc2;
+ return (0);
+ }
+ }
+ return 1;
+}
+
+uint64_t
+acpi_calculate_timecounter_delay(struct timecounter *tc, uint64_t count1,
+ uint64_t count2)
+{
+ uint64_t delta;
+
+ if (count2 < count1)
+ count2 += tc->tc_counter_mask;
+
+ delta = (count2 - count1);
+ return delta * 1000000 / tc->tc_frequency;
+}
+
+uint64_t
+acpi_calculate_tsc_frequency(uint64_t tsc1, uint64_t tsc2, int usec)
+{
+ uint64_t delta;
+
+ delta = (tsc2 - tsc1);
+ return delta * 1000000 / usec;
+}
+
+void
+acpitimer_recalibrate_timecounter_init(struct timecounter *tc)
+{
+ acpitimer_recalibrate_timecounter = tc;
+ cpu_adjust_tsc_frequency(&acpi_calibrate_tsc_frequency);
+}
+
+void
+acpihpet_recalibrate_timecounter_init(struct timecounter *tc)
+{
+ acpihpet_recalibrate_timecounter = tc;
+ cpu_adjust_tsc_frequency(&acpi_calibrate_tsc_frequency);
+}
+
+uint64_t
+acpi_calibrate_tsc_frequency()
+{
+ struct timecounter *tc;
+ uint64_t count1, count2, frequency, min_freq, tsc1, tsc2;
+ u_long ef;
+ int delay_usec, i, err1, err2, usec;
+
+ if (acpihpet_recalibrate_timecounter != NULL) {
+ tc = acpihpet_recalibrate_timecounter;
+ } else if (acpitimer_recalibrate_timecounter != NULL) {
+ tc = acpitimer_recalibrate_timecounter;
+ } else {
+ printf("Failed to recalibrate timecounter \"tsc\"");
+ printf(" No timcounter set");
+ return 0;
+ }
+
+ /* warmup the timers */
+ for (i =0; i < 3; i++) {
+ (void)tc->tc_get_timecount(tc);
+ (void)rdtsc();
+ }
+
+ min_freq = ULLONG_MAX;
+
+ delay_usec = 100000;
+ for (i = 0; i < 3; i++) {
+ ef = read_rflags();
+ disable_intr();
+
+ err1 = acpi_get_tsc_and_timecount(tc, &tsc1, &count1);
+ delay(delay_usec);
+ err2 = acpi_get_tsc_and_timecount(tc, &tsc2, &count2);
+
+ write_rflags(ef);
+
+ if (err1 || err2)
+ continue;
+
+ usec = acpi_calculate_timecounter_delay(tc, count1, count2);
+
+ if (usec < (delay_usec - RECALIBRATE_DELAY_THRESHOLD))
+ continue;
+
+ if (usec > (delay_usec + RECALIBRATE_DELAY_THRESHOLD))
+ continue;
+
+ frequency = acpi_calculate_tsc_frequency(tsc1, tsc2, usec);
+
+ min_freq = ulmin(min_freq, frequency);
+ }
+
+ if (min_freq == 0) {
+ printf("Failed to recalibrate timecounter \"tsc\"");
+ printf(" with timecounter \"%s\"\n", tc->tc_name);
+ }
+ return min_freq;
}
#ifdef MULTIPROCESSOR
Index: sys/arch/amd64/amd64/identcpu.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/identcpu.c,v
retrieving revision 1.87
diff -u -p -u -r1.87 identcpu.c
--- sys/arch/amd64/amd64/identcpu.c 20 Jun 2017 05:34:41 -0000 1.87
+++ sys/arch/amd64/amd64/identcpu.c 23 Aug 2017 11:43:26 -0000
@@ -63,6 +63,10 @@ struct timecounter tsc_timecounter = {
tsc_get_timecount, NULL, ~0u, 0, "tsc", -1000, NULL
};
+u_int64_t (*amd64_tsc_get_frequency)();
+u_int64_t amd64_tsc_freq;
+int amd64_tsc_requires_calibration;
+int amd64_has_invariant_tsc;
int amd64_has_xcrypt;
#ifdef CRYPTO
int amd64_has_pclmul;
@@ -428,8 +432,9 @@ cpu_tsc_freq_ctr(struct cpu_info *ci)
u_int64_t
cpu_tsc_freq(struct cpu_info *ci)
{
- u_int64_t last_count, count;
+ u_int64_t last_count, count, sum;
uint32_t eax, ebx, khz, dummy;
+ int i;
if (!strcmp(cpu_vendor, "GenuineIntel") &&
cpuid_level >= 0x15) {
@@ -468,11 +473,21 @@ cpu_tsc_freq(struct cpu_info *ci)
if (count != 0)
return (count);
- last_count = rdtsc();
- delay(100000);
- count = rdtsc();
+ if (amd64_tsc_get_frequency != NULL)
+ return (*amd64_tsc_get_frequency)();
- return ((count - last_count) * 10);
+ if (ci->ci_flags & CPUF_PRIMARY)
+ amd64_tsc_requires_calibration = 1;
+
+ sum = 0;
+ for (i=0; i < 8; i++) {
+ last_count = rdtsc();
+ delay(100000);
+ count = rdtsc();
+ sum += (count - last_count);
+ }
+
+ return ((sum >> 3) * 10);
}
u_int
@@ -482,6 +497,22 @@ tsc_get_timecount(struct timecounter *tc
}
void
+cpu_adjust_tsc_frequency(u_int64_t (*get_frequency)())
+{
+ if (amd64_tsc_requires_calibration) {
+ amd64_tsc_freq = (*get_frequency)();
+ tsc_timecounter.tc_frequency = amd64_tsc_freq;
+
+ printf("Timecounter \"%s\" recalibrated",
+ tsc_timecounter.tc_name);
+ printf("with frequency %llu Hz\n",
+ tsc_timecounter.tc_frequency);
+ } else {
+ amd64_tsc_get_frequency = get_frequency;
+ }
+}
+
+void
identifycpu(struct cpu_info *ci)
{
u_int32_t dummy, val;
@@ -566,9 +597,13 @@ identifycpu(struct cpu_info *ci)
/* Check if it's an invariant TSC */
if (cpu_apmi_edx & CPUIDEDX_ITSC)
ci->ci_flags |= CPUF_INVAR_TSC;
+
+ amd64_has_invariant_tsc = (ci->ci_flags & CPUF_INVAR_TSC) != 0;
}
ci->ci_tsc_freq = cpu_tsc_freq(ci);
+ if (ci->ci_flags & CPUF_PRIMARY)
+ amd64_tsc_freq = ci->ci_tsc_freq;
amd_cpu_cacheinfo(ci);
Index: sys/arch/amd64/amd64/machdep.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/machdep.c,v
retrieving revision 1.231
diff -u -p -u -r1.231 machdep.c
--- sys/arch/amd64/amd64/machdep.c 12 Jul 2017 06:26:32 -0000 1.231
+++ sys/arch/amd64/amd64/machdep.c 23 Aug 2017 11:43:26 -0000
@@ -425,6 +425,8 @@ int
cpu_sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp,
size_t newlen, struct proc *p)
{
+ extern u_int64_t amd64_tsc_freq;
+ extern int amd64_has_invariant_tsc;
extern int amd64_has_xcrypt;
dev_t consdev;
dev_t dev;
@@ -496,6 +498,11 @@ cpu_sysctl(int *name, u_int namelen, voi
pckbc_release_console();
return (error);
#endif
+ case CPU_TSCFREQ:
+ return (sysctl_rdquad(oldp, oldlenp, newp, amd64_tsc_freq));
+ case CPU_INVARIANTTSC:
+ return (sysctl_rdint(oldp, oldlenp, newp,
+ amd64_has_invariant_tsc));
default:
return (EOPNOTSUPP);
}
Index: sys/arch/amd64/include/cpu.h
===================================================================
RCS file: /cvs/src/sys/arch/amd64/include/cpu.h,v
retrieving revision 1.114
diff -u -p -u -r1.114 cpu.h
--- sys/arch/amd64/include/cpu.h 11 Aug 2017 20:19:14 -0000 1.114
+++ sys/arch/amd64/include/cpu.h 23 Aug 2017 11:43:28 -0000
@@ -427,7 +427,9 @@ void mp_setperf_init(void);
#define CPU_XCRYPT 12 /* supports VIA xcrypt in userland */
#define CPU_LIDACTION 14 /* action caused by lid close */
#define CPU_FORCEUKBD 15 /* Force ukbd(4) as console keyboard */
-#define CPU_MAXID 16 /* number of valid machdep ids */
+#define CPU_TSCFREQ 16 /* tsc frequency */
+#define CPU_INVARIANTTSC 17 /* has invariant tsc */
+#define CPU_MAXID 18 /* number of valid machdep ids */
#define CTL_MACHDEP_NAMES { \
{ 0, 0 }, \
@@ -446,6 +448,8 @@ void mp_setperf_init(void);
{ 0, 0 }, \
{ "lidaction", CTLTYPE_INT }, \
{ "forceukbd", CTLTYPE_INT }, \
+ { "tscfreq", CTLTYPE_QUAD }, \
+ { "invarianttsc", CTLTYPE_INT }, \
}
/*
Index: sys/arch/amd64/include/cpuvar.h
===================================================================
RCS file: /cvs/src/sys/arch/amd64/include/cpuvar.h,v
retrieving revision 1.8
diff -u -p -u -r1.8 cpuvar.h
--- sys/arch/amd64/include/cpuvar.h 28 Jul 2016 21:57:57 -0000 1.8
+++ sys/arch/amd64/include/cpuvar.h 23 Aug 2017 11:43:28 -0000
@@ -96,5 +96,6 @@ void x86_ipi_init(int);
void identifycpu(struct cpu_info *);
void cpu_init(struct cpu_info *);
void cpu_init_first(void);
+void cpu_adjust_tsc_frequency(uint64_t (*)());
#endif
Index: sys/dev/acpi/acpihpet.c
===================================================================
RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
retrieving revision 1.21
diff -u -p -u -r1.21 acpihpet.c
--- sys/dev/acpi/acpihpet.c 6 Oct 2015 20:49:32 -0000 1.21
+++ sys/dev/acpi/acpihpet.c 23 Aug 2017 11:43:30 -0000
@@ -265,6 +265,7 @@ acpihpet_attach(struct device *parent, s
hpet_timecounter.tc_name = sc->sc_dev.dv_xname;
tc_init(&hpet_timecounter);
acpihpet_attached++;
+ acpihpet_recalibrate_timecounter_init(&hpet_timecounter);
}
u_int
Index: sys/dev/acpi/acpitimer.c
===================================================================
RCS file: /cvs/src/sys/dev/acpi/acpitimer.c,v
retrieving revision 1.11
diff -u -p -u -r1.11 acpitimer.c
--- sys/dev/acpi/acpitimer.c 14 Mar 2015 03:38:46 -0000 1.11
+++ sys/dev/acpi/acpitimer.c 23 Aug 2017 11:43:30 -0000
@@ -96,6 +96,7 @@ acpitimerattach(struct device *parent, s
acpi_timecounter.tc_priv = sc;
acpi_timecounter.tc_name = sc->sc_dev.dv_xname;
tc_init(&acpi_timecounter);
+ acpitimer_recalibrate_timecounter_init(&acpi_timecounter);
}
Index: sys/dev/acpi/acpivar.h
===================================================================
RCS file: /cvs/src/sys/dev/acpi/acpivar.h,v
retrieving revision 1.88
diff -u -p -u -r1.88 acpivar.h
--- sys/dev/acpi/acpivar.h 17 Aug 2017 05:16:27 -0000 1.88
+++ sys/dev/acpi/acpivar.h 23 Aug 2017 11:43:30 -0000
@@ -25,6 +25,7 @@
#include <sys/timeout.h>
#include <sys/rwlock.h>
+#include <sys/timetc.h>
#include <machine/biosvar.h>
#include "acpipwrres.h"
@@ -324,6 +325,9 @@ void acpi_resume_cpu(struct acpi_softc
void acpi_resume_mp(void);
void acpi_sleep_walk(struct acpi_softc *, int);
+void acpihpet_recalibrate_timecounter_init(struct timecounter *);
+void acpitimer_recalibrate_timecounter_init(struct timecounter *);
+uint64_t acpi_calibrate_tsc_frequency();
#define ACPI_IOREAD 0
#define ACPI_IOWRITE 1