> From: Scott Cheloha <scottchel...@gmail.com> > Date: Thu, 25 Mar 2021 13:18:04 -0500 > > > On Mar 24, 2021, at 8:29 AM, Josh Rickmar <joshrick...@outlook.com> wrote: > > > > On Wed, Mar 24, 2021 at 05:40:21PM +0900, YASUOKA Masahiko wrote: > >> Hi, > >> > >> I hit a problem which is caused by going back of monotonic time. It > >> happens on hosts on VMware ESXi. > >> > >> I wrote the program which repeats the problem. > >> > >> % cc -o monotime monotime.c -lpthread > >> % ./monotime > >> 194964 Starting > >> 562210 Starting > >> 483046 Starting > >> 148865 Starting > >> 148865 Back 991.808048665 => 991.007447931 > >> 562210 Back 991.808048885 => 991.007448224 > >> 483046 Back 991.808049115 => 991.007449172 > >> 148865 Stopped > >> 562210 Stopped > >> 483046 Stopped > >> 194964 Stopped > >> % uname -a > >> OpenBSD yasuoka-ob-c.tokyo.iiji.jp 6.8 GENERIC.MP#5 amd64 > >> % sysctl kern.version > >> kern.version=OpenBSD 6.8 (GENERIC.MP) #5: Mon Feb 22 04:36:10 MST 2021 > >> > >> r...@syspatch-68-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > >> % > >> > >> monotime.c > >> ---- > >> #include <sys/types.h> > >> #include <sys/time.h> > >> #include <stdio.h> > >> #include <time.h> > >> #include <unistd.h> > >> #include <pthread.h> > >> #include <stdlib.h> > >> > >> #define NTHREAD 4 > >> #define NTRY 50000 > >> > >> void * > >> start(void *dummy) > >> { > >> int i; > >> struct timespec ts0, ts1; > >> > >> printf("%d Starting\n", (int)getthrid()); > >> clock_gettime(CLOCK_MONOTONIC, &ts0); > >> > >> for (i = 0; i < NTRY; i++) { > >> clock_gettime(CLOCK_MONOTONIC, &ts1); > >> if (timespeccmp(&ts0, &ts1, <=)) { > >> ts0 = ts1; > >> continue; > >> } > >> printf("%d Back %lld.%09lu => %lld.%09lu\n", > >> (int)getthrid(), ts0.tv_sec, ts0.tv_nsec, ts1.tv_sec, > >> ts1.tv_nsec); > >> break; > >> } > >> printf("%d Stopped\n", (int)getthrid()); > >> > >> return (NULL); > >> } > >> > >> int > >> main(int argc, char *argv[]) > >> { > >> int i, n = NTHREAD; > >> 
pthread_t *threads; > >> > >> threads = calloc(n, sizeof(pthread_t)); > >> > >> for (i = 0; i < n; i++) > >> pthread_create(&threads[i], NULL, start, NULL); > >> for (i = 0; i < n; i++) > >> pthread_join(threads[i], NULL); > >> > >> } > >> ---- > >> > >> The machine has 4 vCPUs and showing the following message on boot. > >> > >> cpu1: disabling user TSC (skew=-5310) > >> cpu2: disabling user TSC (skew=-5335) > >> cpu3: disabling user TSC (skew=-7386) > >> > >> This means "user TSC" is disabled because of TSC of cpu{1,2,3} is much > >> delayed against cpu0. > >> > >> Simply ignoring the skews by the following diff seems to workaround > >> this problem. > >> > >> diff --git a/sys/arch/amd64/amd64/tsc.c b/sys/arch/amd64/amd64/tsc.c > >> index 238a5a068e1..3b951a8b5a3 100644 > >> --- a/sys/arch/amd64/amd64/tsc.c > >> +++ b/sys/arch/amd64/amd64/tsc.c > >> @@ -212,7 +212,8 @@ cpu_recalibrate_tsc(struct timecounter *tc) > >> u_int > >> tsc_get_timecount(struct timecounter *tc) > >> { > >> - return rdtsc_lfence() + curcpu()->ci_tsc_skew; > >> + //return rdtsc_lfence() + curcpu()->ci_tsc_skew; > >> + return rdtsc_lfence(); > >> } > >> > >> void > >> > >> So I supposed the skews are not calculated properly. Also I found > >> NetBSD changed the skew calculating so that it checks 1000 times and > >> take the minimum value. > >> > >> > >> https://github.com/NetBSD/src/commit/1dec05c1ae197b4acfc7038e49dfddabcbed0dff > >> > >> https://github.com/NetBSD/src/commit/66d76b89792bac1c71cd5507ba62b08ad02129ef > >> > >> > >> I checked skews on the machine by the following debug code. 
> >> > >> diff --git a/sys/arch/amd64/amd64/tsc.c b/sys/arch/amd64/amd64/tsc.c > >> index 238a5a068e1..83e835e4f82 100644 > >> --- a/sys/arch/amd64/amd64/tsc.c > >> +++ b/sys/arch/amd64/amd64/tsc.c > >> @@ -302,16 +302,42 @@ tsc_read_bp(struct cpu_info *ci, uint64_t *bptscp, > >> uint64_t *aptscp) > >> *aptscp = tsc_sync_val; > >> } > >> > >> +#define TSC_SYNC_NTIMES 1000 > >> + > >> +static int tsc_difs[MAXCPUS][TSC_SYNC_NTIMES]; > >> + > >> +void > >> +tsc_debug(void) > >> +{ > >> + int i, cpuid = curcpu()->ci_cpuid; > >> + > >> + for (i = 0; i < TSC_SYNC_NTIMES; i++) { > >> + if (i % 10 == 0) > >> + printf("%5d", tsc_difs[cpuid][i]); > >> + else > >> + printf(" %5d", tsc_difs[cpuid][i]); > >> + if (i % 10 == 9) > >> + printf("\n"); > >> + } > >> + printf("\n"); > >> +} > >> + > >> void > >> tsc_sync_bp(struct cpu_info *ci) > >> { > >> + int i, mindif = INT_MAX, dif; > >> uint64_t bptsc, aptsc; > >> > >> - tsc_read_bp(ci, &bptsc, &aptsc); /* discarded - cache effects */ > >> - tsc_read_bp(ci, &bptsc, &aptsc); > >> + for (i = 0; i < TSC_SYNC_NTIMES; i++) { > >> + tsc_read_bp(ci, &bptsc, &aptsc); > >> + dif = bptsc - aptsc; > >> + if (abs(dif) < abs(mindif)) > >> + mindif = dif; > >> + tsc_difs[ci->ci_cpuid][i] = dif; > >> + } > >> > >> /* Compute final value to adjust for skew. 
*/ > >> - ci->ci_tsc_skew = bptsc - aptsc; > >> + ci->ci_tsc_skew = mindif; > >> } > >> > >> /* > >> @@ -342,8 +368,10 @@ tsc_post_ap(struct cpu_info *ci) > >> void > >> tsc_sync_ap(struct cpu_info *ci) > >> { > >> - tsc_post_ap(ci); > >> - tsc_post_ap(ci); > >> + int i; > >> + > >> + for (i = 0; i < TSC_SYNC_NTIMES; i++) > >> + tsc_post_ap(ci); > >> } > >> > >> void > >> > >> ---- > >> Stopped at db_enter+0x10: popq %rbp > >> ddb{0}> machine ddbcpu 1 > >> Stopped at x86_ipi_db+0x12: leave > >> ddb{1}> call tsc_debug > >> -8445 -6643 -52183 0 -3 -4 -7 -11 -5 0 > >> -11 -9 -5 -3 -4 -3 -7 8 -5 -6 > >> -5 -9 -3 -9 -7 -1 -5 -5 -9 -2 > >> -6 -4 -6 -4 -11 -8 -3 -4 -8 -1 > >> -9 -1 -8 1 -8 6 -5 -4 2 -2 > >> -8 -3 -1 -5 -2 -2 1 2 -2 -9 > >> -12 0 -9 -2 -2 -5 -2 1 2 0 > >> -1 2 -2 6 -5 -1 -2 -4 2 -2 > >> 0 -9 -9 -6 -2 2 3 -6 -1 3 > >> 8 4 -2 2 -8 7 1 2 -2 1 > >> -2 -6 -2 5 0 0 -4 -9 6 -2 > >> -3 -6 -2 -12 1 -9 -2 -3 -10 10 > >> 2 -1 -3 -2 3 1 1 -5 3 -3 > >> -5 1 -6 -2 -3 0 0 9 1 6 > >> 8 -6 5 4 -12 -1 4 2 -1 -1 > >> -1 2 2 0 -5 1 2 -8 3 9 > >> 0 6 -3 4 6 0 8 6 -14 -1 > >> -1 0 -6 -7 6 -10 7 -6 -5 -4 > >> 6 -12 4 3 -5 5 1 -6 3 0 > >> -2 0 6 -9 -2 -1 1 -1 4 0 > >> 4 10 -13 1 -8 -2 -8 -3 -5 -3 > >> -5 -5 1 -9 -9 0 -3 -1 2 6 > >> -2 2 -3 -9 -9 -11 -7 -6 -4 -9 > >> -4 -9 -3 -4 0 -5 0 -9 -12 -7 > >> -6 -9 1 -5 -4 -12 7 -3 -12 -4 > >> -5 -5 -6 -9 -7 -1 0 0 -1 -2 > >> -6 -8 0 1 -8 -5 -2 -4 0 -1 > >> -3 -10 -15 -3 -8 -11 -9 -9 2 0 > >> -2 -4 -2 -3 -13 -9 -9 -1 -10 -6 > >> 0 0 2 -2 -4 1 -6 0 0 -5 > >> -2 -7 -5 -2 -2 1 -2 -6 -1 -7 > >> -6 -1 -9 -3 -2 -1 -4 -6 -3 -4 > >> -4 -3 -4 -11 1 -9 0 -3 2 -9 > >> -8 2 -1 -7 -5 -5 -9 2 -3 -5 > >> 0 5 -12 0 -5 -3 -6 -1 -13 -10 > >> -9 0 0 -5 -7 -4 -3 -3 -3 -2 > >> -2 3 -5 -3 -5 -1 -7 -4 -10 0 > >> -3 0 2 1 -4 -1 -5 -3 -5 -6 > >> -4 -8 -3 0 -1 -2 -13 -10 -9 -5 > >> -11 -11 -4 -3 0 5 -2 -3 -6 0 > >> -6 -9 -1 -4 -1 2 -2 -7 -9 0 > >> -8 -4 -6 -5 -12 -9 -5 -11 -5 -8 > >> -8 -6 -2 -3 -9 -5 -9 -11 -10 1 > >> -3 -6 -1 -1 -6 0 0 -8 -4 0 > >> -3 -10 -4 -2 
-3 -2 -1 -9 -11 -12 > >> -4 2 -2 -5012 5 2 -17 0 7 -5 > >> 0 -4 -3 6 -7 -1 -1 4 -6 3 > >> 0 -4 -9 -7 -11 -11 8 -7 -15 -10 > >> 3 -4 1 -17 -4 3 -17 0 4 3 > >> -2 0 -3 -10 -2 1 3 -5 -12 -19 > >> 1 2 5 1 -9 4 -2 -3 -4 0 > >> -1 -11 -3 -1 -9 -5 0 -8 7 -2 > >> -6 -7 4 -5 -2 -1 -5 0 -5 -5 > >> -14 -2 -8 0 -11 9 -10 2 -6 -17 > >> -3 -5 -6 2 0 9 -14 0 -4 -7 > >> 6 2 2 -9 -9 5 5 0 -6 3 > >> -12 5 -2 -13 -10 -5 -7 2 -11 -3 > >> -6 -2 -13 1 8 -5 -14 2 4 -3 > >> -13 -5 -11 -9 -10 4 -3 -1 9 -17 > >> -11 2 -13 -2 -1 -9 -10 0 -5 -4 > >> 0 10 -8 -5 -8 -3 -14 -6 3 -15 > >> 1 -5 1 176 -8 -7 -7 -4 -1 -1 > >> -8 -7 -4 1 -6 -9 1 -2 -9 -4 > >> -4 -1 -7 0 -8 -3 -4 -3 -1 -2 > >> -5 -6 -9 2 -6 0 -8 -5 0 -9 > >> -10 0 -4 4 -6 -11 -3 2 -12 1 > >> -2 -6 -6 -3 -7 -7 0 -9 -1 -9 > >> -1 -8 -4 -3 -11 1 0 1 -2 -4 > >> -11 -1 -9 -9 -10 -1 -1 -9 -8 -6 > >> -3 -4 -2 -4 1 0 -5 -2 -1 4 > >> -9 -1 -4 1 -8 -11 0 -10 -4 -9 > >> -5 -2 -2 4 0 -7 -4 1 -2 1 > >> -4 -1 -5 -9 -9 -5 -10 -4 -12 -8 > >> -4 -9 -7 -5 -3 3 -5 -12 -3 0 > >> -8 -4 -9 -5 -6 0 0 -1 -2 -6 > >> -8 -12 -3 1 2 -6 -1 -7 -10 -9 > >> -6 -8 0 -2 -3 -7 -3 -2 6 -3 > >> -12 0 0 -7 -9 -6 -1 -5 -2 -9 > >> -6 1 0 -3 -1 -1 -2 2 -2 -3 > >> -7 -9 -1 -8 -4 -2 5 -5 -3 -10 > >> 2 6 -3 0 -6 -8 -9 -1 -1 -7 > >> -8 -1 -1 -4 -4 7 -2 -10 -11 -6 > >> 2 2 -4 3 -2 -1 -3 0 0 -7 > >> -1 -3 -4 -9 -5 -2 -5 -7 -5 -3 > >> 0 1 -3 5 -3 -4 -1 -6 -9 -4 > >> -6 0 -9 -6 0 -2 4 -2 -4 -10 > >> -9 -4 -3 -9 -3 -6 -9 -8 -4 1 > >> -5 -6 -5 1 0 -2 -3 -6 -5 -9 > >> -4 1 -5 -4 -2 -4 -8 -3 -4 0 > >> 2 -5 -3 -7 -1 -2 -1 -9 -6 -15 > >> -10 -6 -2 -7 -1 -3 3 -6 -6 -9 > >> -10 -8 -9 -2 -3 0 -6 3 -4 4 > >> 3 3 8 -2 -2 -4 0 -3 -9 -3 > >> -6 -4 3 2 1 1 -3 -7 -15 -1 > >> -4 -6 -1 -2 -1 -12 2 -1 -4 1 > >> 2 3 -5 3 -3 -7 -6 -5 0 1 > >> 5 -13 -8 0 -5 2 0 -5 -3 6 > >> -4 -9 -2 -8 -1 -9 -10 -1 -10 -6 > >> -10 -4 -10 -9 -2 1 0 -4 -3 0 > >> 1 -3 -1 -4 -7 -10 -13 -8 -1 -1 > >> > >> 0x1 > >> ddb{1}> machine ddbcpu 2 > >> Stopped at x86_ipi_db+0x12: leave > >> ddb{2}> call tsc_debug > >> -8242 -6496 
-50265 -1 -2 -2 1 109 -2 -3 > >> 3 3 -8 -3 -4 4 0 -8 -7 -5 > >> -5 3 -7 -5 -4 -9 -7 -3 0 2 > >> -5 2 1 3 -2 3 8 -6 -11 8 > >> 8 -5 1 5 0 -8 2 0 6 3 > >> -14 7 -2 -1 -3 1 -5 -6 0 5 > >> 1 -1 0 -2 -5 2 -3 0 -3 -1 > >> -5 -12 -4 -4 -9 4 0 -2 -2 -8 > >> 2 5 7 -2 0 -6 110 -8 -8 -4 > >> 0 5 -7 -3 -5 -4 9 -2 -2 3 > >> -8 -2 -5 4 -3 217 0 -7 -7 -6 > >> 7 -10 -9 -3 3 -14 3 -5 5 -12 > >> 5 -8 -17 -5 286 -6 0 -3 -4 -2 > >> 1 -5 -5 -9 -6 -7 -3 -5 6 0 > >> 1 -1 -4 -2 2 -2 -2 -2 -5 2 > >> 12 3 -18 -8 6 -4 -3 6 2 -3 > >> -7 -3 4 -5 -23 9 6 6 -6 -11 > >> 9 -1 -10 50505 -1 2 6 -11 2 -2 > >> -4 -6 201 1 3 4 -9 6 0 1 > >> -4 0 -1 3 4 1 6 -7 -5 4 > >> -14 -3 -1 -8 5 6 -5 3 -7 -9 > >> -7 1 -2 5 -2 0 -2 -9 4 -3 > >> 98 -5 7 -7 3 0 -5 0 9 2 > >> -7 -5 -3 -12 -11 -11 6 -5 -7 -6 > >> 210 5 -3 -5 -4 -11 -6 0 -5 -9 > >> 3 0 -9 5 1 0 0 -7 -5 210 > >> 1 -6 -17 -8 0 1 -2 -8 1 -7 > >> 10 -8 -8 -9 4 -2 -4 -3 204 5 > >> -9 -15 3 -1 -5 0 -12 -1 0 1 > >> -1 -6 -5 -9 -1 4 -1 -1 0 1 > >> 4 -8 -13 1 0 -5 -6 0 -4 0 > >> 6 -2 -4 -8 -7 -12 -2 -2 -6 -8 > >> -5 0 -8 -7 -11 0 6 -1 -8 -3 > >> 2 6 -6 0 -10 285 -1 -2 -8 -6 > >> -6 -1 -5 -6 0 -5 -8 -5 1 -8 > >> -1 1 4 -3 -4 188 -3 -3 -10 5 > >> 6 0 -7 4 1 2 0 -2 -2 2 > >> -3 -1 -9 -12 201 -1 -7 -1 8 0 > >> 0 0 -5 0 7 -18 5 1 -2 15 > >> 5 -6 4 -10 272 0 -4 -3 2 -10 > >> -7 7 3 4 1 6 -9 -8 -12 -2 > >> -2 -2 -6 8 3 97 -1 -7 5 1 > >> -7 4 -6 -9 -7 -2 0 -6 8 -13 > >> -1 3 -9 -4 233 -2 0 0 -5 -5 > >> -2 0 0 5 -7 6 -14 4 -6 5 > >> 4 3 3 -3 103 -2 -6 -11 2 -2 > >> -3 -8 0 -1 1 0 -6 1 1 -10 > >> -1 0 -1 0 0 -1 -6 -4 -4 -3 > >> 3 0 -5 0 2 -2 6 0 -13 3 > >> 1 -16 -2 -12 206 -4 -6 -5 -3 2 > >> -8 -2 -9 -2 9 7 -1 1 3 -5 > >> 6 6 10 0 99 -2 -1 -2 0 -6 > >> -8 -9 0 3 0 -11 -6 1 -4 9 > >> -11 -11 2 -1 276 -7 2 4 -4 -3 > >> 4 2 -12 8 1 -4 0 -1 -1 -1 > >> 5 -8 0 6 232 -4 2 -2 -11 2 > >> -6 0 4 6 -7 8 -6 3 -3 -9 > >> 0 1 -7 2 -3 -5 -3 -2 -8 -4 > >> -14 -2 0 4 -4 -3 -4 9 3 -1 > >> -5 -8 1 0 210 0 -6 -4 2 2 > >> 0 -3 -4 0 4 -12 10 -2 -3 -1 > >> 3 7 -3 6 -9 190 4 6 -2 11 > >> -10 -2 
-9 3 2 4 -4 -7 -9 -2 > >> -10 12 1 -7 3 -1 0 -1 -11 6 > >> 6 0 -8 1 4 2 -2 -5 0 0 > >> 5 -5 -3 -5 205 -1 -3 -1 0 -4 > >> -3 -7 3 0 -7 -5 -4 -3 7 1 > >> -14 -2 0 -14 2 0 7 11 -2 -7 > >> 6 -6 -3 -4 0 4 -3 -4 -3 -7 > >> -6 3 -4 0 4 -3 -6 3 4 -4 > >> -6 7 -11 1 -10 -10 -12 -4 -3 6 > >> -5 -6 5 212 -8 1 -4 3 -3 -8 > >> 0 -10 0 2 6 -3 -1 -13 1 6 > >> -6 5 -4 0 -7 -4 -5 -6 1 -2 > >> 1 -10 -12 -3 -8 9 -11 0 1 -3 > >> -1 1 186 2 -12 3 -4 0 0 -1 > >> -5 -4 1 -1 4 5 -10 0 -4 1 > >> -1 -6 270 0 -17 5 0 0 -4 -1 > >> 4 6 1 0 -6 6 5 -3 3 -2 > >> -6 -7 18 -16 0 9 4 3 -1 6 > >> 4 1 -5 7 -11 -1 -6 -2 6 -2 > >> -8 256 -5 -2 1 -1 -5 -3 -9 0 > >> -1 3 -2 -11 4 3 -7 -3 3 -1 > >> -5 -5 0 -4 -1 -3 -3 6 -3 3 > >> 3 -2 -11 -2 -3 11 -8 1 5 -7 > >> 5 -12 -3 8 -9 -5 7 5 -3 2 > >> -7 -3 -6 2 -2 -1 -2 -9 -8 99 > >> 6 -2 9 -1 -4 -2 0 -2 -7 -5 > >> 0 1 -4 -8 -1 -2 2 -8 2 205 > >> -3 -10 -1 -1 -2 -5 2 2 5 -2 > >> -5 -6 -4 6 6 -6 4 1 -5 0 > >> -7 3 1 0 -11 -7 -3 5 -5 5 > >> 1 -1 3 -5 -8 0 -1 0 183 -5 > >> 0 4 -1 -6 -11 -10 1 -18 3 -1 > >> -5 -9 -2 2 2 -2 0 99 -7 -8 > >> -1 -3 -5 -1 12 -3 -1 2 1 4 > >> 7 3 -14 2 -4 8 -9 -3 -8 -5 > >> 6 -6 5 -12 6 -1 -9 -4 -4 1 > >> -6 0 0 -2 -3 -5 -9 -2 -9 -3 > >> 3 -16 -2 -1 0 9 -4 5 -6 5 > >> > >> 0x1 > >> ddb{2}> machine ddbcpu 3 > >> Stopped at x86_ipi_db+0x12: leave > >> ddb{3}> call tsc_debug > >> -8336 -6457 -45527 0 -1 0 -2 4 0 5 > >> -9 -6 -4 -4 6 4 0 -3 -4 5 > >> 3 6 -12 -1 1 1 -3 6 -4 -2 > >> -2 2 4 -3 0 -1 3 0 1 3 > >> -3 3 -1 4 -5 -2 -2 -9 6 -9 > >> 0 1 -2 -6 8 -4 -2 -2 6 1 > >> -1 1 -6 -6 4 -5 -1 6 -1 -1 > >> 3 5 0 -6 5 -4 -2 -6 -3 -4 > >> -5 2 -3 -3 -5 -3 -5 -5720 0 5 > >> -1 -3 3 -2 4 -6 8 -16 -6 -3 > >> -2 4 8 -3 3 -1 2 -8 -3 -19 > >> -8 3 -7 -9 -6 -3 1 -3 -6 7 > >> 4 1 3 2 -4 7 -3 2 2 -4 > >> -10 -8 -14 -2 2 -3 3 -2 3 -3 > >> 5 5 -6 9 -3 -12 -6 1 -10 7 > >> -6 5 -4 1 6 3 2 2 -6 -1 > >> 6 -8 -5 7 3 -3 -4 -6 1 -5 > >> 4 -6 2 6 -6 3 -8 -5 -6 0 > >> -5 -2 -13 -8 3 0 -17 -7 -9 1 > >> 6 -12 3 6 3 -3 4 -1 -7 0 > >> 2 0 -7 -10 -6 3 1 2 -19 4 > >> 1 -18 1 -3 
-6 -14 4 6 3 -4 > >> -7 -11 -1 -1 3 0 6 6 -8 -14 > >> 3 -2 6 -5 0 1 0 -7 0 4 > >> -3 -16 -2 -4 -12 -4 6 0 -8 -4 > >> -4 -4 -3 0 -6 -13 -10 -15 -6 2 > >> 0 -3 0 -8 4 -1 -1 5 -4 -1 > >> -7 2 1 1 3 -3 -1 -18 6 8 > >> -2 3 -6 0 -6 -2 2 -2 -7 1 > >> -6 -13 -4 -2 -1 -6 6 -5 -9 -14 > >> 4 5 -4 -2 -9 -2 -2 -13 1 -18 > >> -1 2 -5 6 -6 -7 -9 -6 1 6 > >> -13 -4 3 0 -8 -6 -6 -10 -2 -9 > >> 5 2 4 1 6 -5 -8 -6 2 -4 > >> -3 -5 -2 -6 -10 0 5 -2 8 -3 > >> 8 -11 -11 -7 -5 -13 -19 -5 -14 -3 > >> 3 2 1 3 6 -1 -9 -16 3 -7 > >> 1 -3 -5 0 -7 6 4 -2 -2 4 > >> 4 -6 -6 8 -6 -7 -5 -2 0 4 > >> -1 3 1 3 -6 -2 -4 -9 1 0 > >> -2 3 2 -16 4 -15 -11 -3 8 0 > >> -6 -3 -18 -7 -8 8 -8 6 -7 -4 > >> -8 -7 -9 0 -2 3 7 -2 1 -6 > >> 2 -6 8 -1 -12 -8 -4 3 -13 -2 > >> -11 1 -2 -7 -3 0 -16 -8 4 -9 > >> -15 -8 -9 8 5 7 -9 5 -10 10 > >> 1 6 -6 3 -6 -4 5 0 -3 -7 > >> -1 -4 -10 -2 0 -11 8 -8 -3 -11 > >> 4 -6 1 -8 -1 -6 3 -1 -12 -7 > >> 11 -1 0 -4 -13 0 -10 7 -2 0 > >> -6 4 0 -12 -9 6 -2 -5 -5 -7 > >> -1 -9 -5 3 3 -7 3 -16 -2 -10 > >> 0 -2 6 4 4 -6 -9 -3 -6 4 > >> -5 -1 7 -11 -21 -9 -3 -1 4 -13 > >> -7 0 -3 0 -10 -7 8 -9 7 6 > >> -9 -14 5 -9 -7 -8 11 -6 1 -3 > >> -9 7 5 0 -12 -3 -4 -18 0 4 > >> -1 8 -8 9 4 0 5 -10 -10 -8 > >> 1 8 -1 4 1 -3 -6 -6 -5 -1 > >> -12 -6 -8 -14 -10 2 -1 9 0 -9 > >> 0 -8 -6 -5 1 -14 -6 5 1 -7 > >> 3 -9 6 -4 3 0 -21 8 7 5 > >> 0 0 -2 2 1 -7 -11 1 -7 2 > >> -3 -2 9 1 3 -1 -8 0 -18 -7 > >> -3 6 1 1 -8 -5 -1 2 -4 -6 > >> -10 -5 -20 -18 5 -7 2 5 -3 -3 > >> -6 -3 -3 4 -7 -2 -4 2 -6 -11 > >> -4 -15 -6 -5 2 -1 -10 -4 -6 -3 > >> -1 0 -6 -10 5 -5 2 2 -8 6 > >> -1 5 -5 -9 12 0 -6 -17 2 -4 > >> 4 -11 -1 -8 0 -3 -11 -8 0 -6 > >> 2 -1 -4 -9 -12 3 6 -10 -2 -6 > >> -13 1 -10 1 -13 5 -4 3 -11 4 > >> -8 4 4 -17 6 -12 3 -13 0 -3 > >> -7 4 -11 -1 1 3 5 -8 -3 0 > >> 4 -8 -1 4 -13 2 -1 2 5 -6 > >> -6 4 -4 -5 1 -4 4 5 -2 -10 > >> -9 0 -14 -2 -7 1 -1 -14 -15 8 > >> -3 9 -3 -4 -8 0 2 5 -11 -13 > >> -5 -13 -3 -7 9 -3 -7 -6 -9 -7 > >> -8 -10 -1 6 2 2 3 -13 -6 -5 > >> 3 -13 2 -6 4 1 0 -9 4 -11 > >> -15 0 1 -12 -1 2 
-11 2 -1 -13 > >> -13 -5 3 9 -11 -8 2 -12 -2 -5 > >> 0 -3 -2 1 -6 -2 -1 0 -3 2 > >> 0 -9 -4 1 -2 3 -6 0 -4 -6 > >> 2 1 -14 -4 -7 -23 -1 -4 -8 -8 > >> 2 -4 1 6 -10 5 -9 -6 6 -1 > >> 4 -2 -9 0 -5 -1 8 0 3 4 > >> -4 -7 -4 2 -15 -6 2 1 -6 -4 > >> -3 -3 3 -7 -2 -9 9 -2 -9 -4 > >> 3 -8 -10 -3 -4 1 -4 -10 -1 3 > >> -1 -15 5 0 2 -6 4 6 -2 0 > >> 2 5 3 -12 -5 -5 -2 -4 -4 5 > >> -8 -4 -6 -16 -1 4 -4 -1 -8 0 > >> 6 -10 5 0 1 -3 -9 8 0 -11 > >> -3 -13 -1 3 3 -6 -6 -2 3 5 > >> 4 -6 12 -17 1 -5 -5 -2 1 -3 > >> -7 5 -4 8 -5 -1 -5 8 -12 4 > >> -6 2 -5 -9 2 -2 5 3 -5 -8 > >> > >> 0x1 > >> ddb{3}> > >> ---- > >> > >> I seems that we can fix the problem by comparing TSC more times and > >> choosing the minimum value. > >> > >> The diff also includes a fix for the problem (use the minimum value of > >> 1000 samplings). With the diff, I tried "monotime" test program over > >> 100 times, it didn't find any clock going back. > >> > >> > >> dmesg > >> ---- > >> OpenBSD 6.8 (GENERIC.MP) #5: Mon Feb 22 04:36:10 MST 2021 > >> > >> r...@syspatch-68-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > >> real mem = 6425083904 (6127MB) > >> avail mem = 6215311360 (5927MB) > >> random: good seed from bootblocks > >> mpath0 at root > >> scsibus0 at mpath0: 256 targets > >> mainbus0 at root > >> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xbfbb401f (98 entries) > >> bios0: vendor VMware, Inc. version "VMW71.00V.0.B64.1506250318" date > >> 06/25/2015 > >> bios0: VMware, Inc. VMware7,1 > >> acpi0 at bios0: ACPI 4.0 > >> acpi0: sleep states S0 S1 S4 S5 > >> acpi0: tables DSDT SRAT FACP APIC MCFG HPET WAET > >> acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3) > >> S3F0(S3) S4F0(S3) S5F0(S3) S6F0(S3) S7F0(S3) S8F0(S3) S9F0(S3) S10F(S3) > >> S11F(S3) S12F(S3) S13F(S3) [...] 
> >> acpitimer0 at acpi0: 3579545 Hz, 24 bits > >> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat > >> cpu0 at mainbus0: apid 0 (boot processor) > >> cpu0: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2397.42 MHz, 06-3f-02 > >> cpu0: > >> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,HV,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,SENSOR,ARAT,MELTDOWN > >> cpu0: 256KB 64b/line 8-way L2 cache > >> cpu0: smt 0, core 0, package 0 > >> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges > >> cpu0: apic clock running at 65MHz > >> cpu1 at mainbus0: apid 1 (application processor) > >> cpu1: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2397.29 MHz, 06-3f-02 > >> cpu1: > >> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,HV,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,SENSOR,ARAT,MELTDOWN > >> cpu1: 256KB 64b/line 8-way L2 cache > >> cpu1: disabling user TSC (skew=-5310) > >> cpu1: smt 0, core 1, package 0 > >> cpu2 at mainbus0: apid 2 (application processor) > >> cpu2: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2397.28 MHz, 06-3f-02 > >> cpu2: > >> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,HV,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,SENSOR,ARAT,MELTDOWN > >> cpu2: 256KB 64b/line 8-way L2 cache > >> cpu2: disabling user TSC (skew=-5335) > >> cpu2: smt 0, core 2, package 0 > >> cpu3 at mainbus0: apid 3 (application processor) > >> cpu3: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 2397.29 MHz, 06-3f-02 > >> cpu3: > >> 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,HV,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,SENSOR,ARAT,MELTDOWN > >> cpu3: 256KB 64b/line 8-way L2 cache > >> cpu3: disabling user TSC (skew=-7386) > >> cpu3: smt 0, core 3, package 0 > >> ioapic0 at mainbus0: apid 4 pa 0xfec00000, version 11, 24 pins, remapped > >> acpimcfg0 at acpi0 > >> acpimcfg0: addr 0xe0000000, bus 0-127 > >> acpihpet0 at acpi0: 14318179 Hz > >> acpiprt0 at acpi0: bus 0 (PCI0) > >> acpipci0 at acpi0 PCI0: 0x00000000 0x00000011 0x00000001 > >> acpicmos0 at acpi0 > >> "PNP0A05" at acpi0 not configured > >> acpiac0 at acpi0: AC unit online > >> acpicpu0 at acpi0: C1(@1 halt!) > >> acpicpu1 at acpi0: C1(@1 halt!) > >> acpicpu2 at acpi0: C1(@1 halt!) > >> acpicpu3 at acpi0: C1(@1 halt!) > >> cpu0: using Broadwell MDS workaround > >> pvbus0 at mainbus0: VMware > >> vmt0 at pvbus0 > >> pci0 at mainbus0 bus 0 > >> 0:16:0: rom address conflict 0xffffc000/0x4000 > >> pchb0 at pci0 dev 0 function 0 "Intel 82443BX AGP" rev 0x01 > >> ppb0 at pci0 dev 1 function 0 "Intel 82443BX AGP" rev 0x01 > >> pci1 at ppb0 bus 1 > >> pcib0 at pci0 dev 7 function 0 "Intel 82371AB PIIX4 ISA" rev 0x08 > >> pciide0 at pci0 dev 7 function 1 "Intel 82371AB IDE" rev 0x01: DMA, > >> channel 0 configured to compatibility, channel 1 configured to > >> compatibility > >> pciide0: channel 0 disabled (no drives) > >> atapiscsi0 at pciide0 channel 1 drive 0 > >> scsibus1 at atapiscsi0: 2 targets > >> cd0 at scsibus1 targ 0 lun 0: <NECVMWar, VMware IDE CDR10, 1.00> removable > >> cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2 > >> piixpm0 at pci0 dev 7 function 3 "Intel 82371AB Power" rev 0x08: SMBus > >> disabled > >> "VMware VMCI" rev 0x10 at pci0 dev 7 function 7 not configured > >> "VMware SVGA II" rev 0x00 at pci0 dev 15 function 0 not configured > >> mpi0 at pci0 dev 16 function 0 "Symbios Logic 53c1030" rev 
0x01: apic 4 > >> int 17 > >> mpi0: 0, firmware 1.3.41.32 > >> scsibus2 at mpi0: 16 targets, initiator 7 > >> sd0 at scsibus2 targ 0 lun 0: <VMware, Virtual disk, 1.0> > >> sd0: 16384MB, 512 bytes/sector, 33554432 sectors > >> sd1 at scsibus2 targ 1 lun 0: <VMware, Virtual disk, 1.0> > >> sd1: 81920MB, 512 bytes/sector, 167772160 sectors > >> mpi0: target 0 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1 > >> mpi0: target 1 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1 > >> ppb1 at pci0 dev 17 function 0 "VMware PCI" rev 0x02 > >> pci2 at ppb1 bus 2 > >> ppb2 at pci0 dev 21 function 0 "VMware PCIE" rev 0x01: msi > >> pci3 at ppb2 bus 3 > >> vmx0 at pci3 dev 0 function 0 "VMware VMXNET3" rev 0x01: msix, 4 queues, > >> address 00:0c:29:7d:49:9c > >> ppb3 at pci0 dev 21 function 1 "VMware PCIE" rev 0x01: msi > >> pci4 at ppb3 bus 4 > >> ppb4 at pci0 dev 21 function 2 "VMware PCIE" rev 0x01: msi > >> pci5 at ppb4 bus 5 > >> ppb5 at pci0 dev 21 function 3 "VMware PCIE" rev 0x01: msi > >> pci6 at ppb5 bus 6 > >> ppb6 at pci0 dev 21 function 4 "VMware PCIE" rev 0x01: msi > >> pci7 at ppb6 bus 7 > >> ppb7 at pci0 dev 21 function 5 "VMware PCIE" rev 0x01: msi > >> pci8 at ppb7 bus 8 > >> ppb8 at pci0 dev 21 function 6 "VMware PCIE" rev 0x01: msi > >> pci9 at ppb8 bus 9 > >> ppb9 at pci0 dev 21 function 7 "VMware PCIE" rev 0x01: msi > >> pci10 at ppb9 bus 10 > >> ppb10 at pci0 dev 22 function 0 "VMware PCIE" rev 0x01: msi > >> pci11 at ppb10 bus 11 > >> ppb11 at pci0 dev 22 function 1 "VMware PCIE" rev 0x01: msi > >> pci12 at ppb11 bus 12 > >> ppb12 at pci0 dev 22 function 2 "VMware PCIE" rev 0x01: msi > >> pci13 at ppb12 bus 13 > >> ppb13 at pci0 dev 22 function 3 "VMware PCIE" rev 0x01: msi > >> pci14 at ppb13 bus 14 > >> ppb14 at pci0 dev 22 function 4 "VMware PCIE" rev 0x01: msi > >> pci15 at ppb14 bus 15 > >> ppb15 at pci0 dev 22 function 5 "VMware PCIE" rev 0x01: msi > >> pci16 at ppb15 bus 16 > >> ppb16 at pci0 dev 22 function 6 "VMware PCIE" rev 0x01: 
msi > >> pci17 at ppb16 bus 17 > >> ppb17 at pci0 dev 22 function 7 "VMware PCIE" rev 0x01: msi > >> pci18 at ppb17 bus 18 > >> ppb18 at pci0 dev 23 function 0 "VMware PCIE" rev 0x01: msi > >> pci19 at ppb18 bus 19 > >> ppb19 at pci0 dev 23 function 1 "VMware PCIE" rev 0x01: msi > >> pci20 at ppb19 bus 20 > >> ppb20 at pci0 dev 23 function 2 "VMware PCIE" rev 0x01: msi > >> pci21 at ppb20 bus 21 > >> ppb21 at pci0 dev 23 function 3 "VMware PCIE" rev 0x01: msi > >> pci22 at ppb21 bus 22 > >> ppb22 at pci0 dev 23 function 4 "VMware PCIE" rev 0x01: msi > >> pci23 at ppb22 bus 23 > >> ppb23 at pci0 dev 23 function 5 "VMware PCIE" rev 0x01: msi > >> pci24 at ppb23 bus 24 > >> ppb24 at pci0 dev 23 function 6 "VMware PCIE" rev 0x01: msi > >> pci25 at ppb24 bus 25 > >> ppb25 at pci0 dev 23 function 7 "VMware PCIE" rev 0x01: msi > >> pci26 at ppb25 bus 26 > >> ppb26 at pci0 dev 24 function 0 "VMware PCIE" rev 0x01: msi > >> pci27 at ppb26 bus 27 > >> ppb27 at pci0 dev 24 function 1 "VMware PCIE" rev 0x01: msi > >> pci28 at ppb27 bus 28 > >> ppb28 at pci0 dev 24 function 2 "VMware PCIE" rev 0x01: msi > >> pci29 at ppb28 bus 29 > >> ppb29 at pci0 dev 24 function 3 "VMware PCIE" rev 0x01: msi > >> pci30 at ppb29 bus 30 > >> ppb30 at pci0 dev 24 function 4 "VMware PCIE" rev 0x01: msi > >> pci31 at ppb30 bus 31 > >> ppb31 at pci0 dev 24 function 5 "VMware PCIE" rev 0x01: msi > >> pci32 at ppb31 bus 32 > >> ppb32 at pci0 dev 24 function 6 "VMware PCIE" rev 0x01: msi > >> pci33 at ppb32 bus 33 > >> ppb33 at pci0 dev 24 function 7 "VMware PCIE" rev 0x01: msi > >> pci34 at ppb33 bus 34 > >> isa0 at pcib0 > >> isadma0 at isa0 > >> fdc0 at isa0 port 0x3f0/6 irq 6 drq 2 > >> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo > >> com0: console > >> com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo > >> pckbc0 at isa0 port 0x60/5 irq 1 irq 12 > >> pckbd0 at pckbc0 (kbd slot) > >> wskbd0 at pckbd0 mux 1 > >> pms0 at pckbc0 (aux slot) > >> wsmouse0 at pms0 mux 0 > >> pcppi0 at 
isa0 port 0x61 > >> spkr0 at pcppi0 > >> efifb0 at mainbus0: 1152x864, 32bpp > >> wsdisplay0 at efifb0 mux 1 > >> wskbd0: connecting to wsdisplay0 > >> wsdisplay0: screen 0-5 added (std, vt100 emulation) > >> vscsi0 at root > >> scsibus3 at vscsi0: 256 targets > >> softraid0 at root > >> scsibus4 at softraid0: 256 targets > >> root on sd0a (a5bd0d220920df21.a) swap on sd0b dump on sd0b > >> ---- > >> > >> This happens on 2 ESXi hosts at least > >> > >> - 1 > >> - ESXi 6.0.0, 3029758 > >> - Xeon E5-2620 x 2 > >> - 2 > >> - ESXi 6.7.0 Update 3 (Build 14320388) > >> - Xeon Silver 4208 x 2 > >> > > > > This broke the monotonic clock on my Ryzen 5 2500U. > > > > Note: userland TSC has never worked on this device. > > > > $ ./monotime > > 320678 Starting > > 351995 Starting > > 387215 Starting > > 505501 Starting > > 387215 Stopped > > 505501 Back 213.019672364 => 212.599096325 > > 505501 Stopped > > 351995 Stopped > > 320678 Stopped > > Which diff did you apply? Yasuoka provided two diffs. > > In any case, ignore this diff: > > > diff --git a/sys/arch/amd64/amd64/tsc.c b/sys/arch/amd64/amd64/tsc.c > > index 238a5a068e1..3b951a8b5a3 100644 > > --- a/sys/arch/amd64/amd64/tsc.c > > +++ b/sys/arch/amd64/amd64/tsc.c > > @@ -212,7 +212,8 @@ cpu_recalibrate_tsc(struct timecounter *tc) > > u_int > > tsc_get_timecount(struct timecounter *tc) > > { > > - return rdtsc_lfence() + curcpu()->ci_tsc_skew; > > + //return rdtsc_lfence() + curcpu()->ci_tsc_skew; > > + return rdtsc_lfence(); > > } > > > > void > > > We don't want to discard the skews, that's wrong. > > The reason it "fixes" Yasuoka's problem is because the real skews > on the ESXi VMs in question are probably close to zero but our > synchronization algorithm is picking huge (wrong) skews due to > some other variable interfering with our measurement.
Right. If a VM exit happens while we're doing our measurement, you'll see a significant delay, and a guest OS can't prevent VM exits from happening. But even on real hardware, SMM (System Management Mode) may interfere with our measurement.
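To make the effect concrete, here is a minimal userland sketch of the selection rule from Yasuoka's second diff: take many samples and keep the one closest to zero, so that a sample inflated by a VM exit or an SMM excursion does not end up as the skew. This is not the kernel code. The names magnitude() and min_magnitude_sample() are made up for illustration, and the sketch assumes the samples have already been collected as signed 64-bit differences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of tsc.c: magnitude of a signed sample. */
static uint64_t
magnitude(int64_t v)
{
	/* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
	return v < 0 ? 0 - (uint64_t)v : (uint64_t)v;
}

/*
 * Hypothetical helper: return the sample closest to zero.
 * On a tie the earliest sample wins.  Returns 0 if n is 0.
 */
int64_t
min_magnitude_sample(const int64_t *samples, size_t n)
{
	int64_t best;
	size_t i;

	if (n == 0)
		return 0;
	assert(samples != NULL);

	best = samples[0];
	for (i = 1; i < n; i++) {
		/* A delayed measurement has a large magnitude and loses here. */
		if (magnitude(samples[i]) < magnitude(best))
			best = samples[i];
	}
	return best;
}
```

Fed the first row of the cpu1 dump above (-8445 -6643 -52183 0 -3 ...), this returns 0 rather than one of the large values. Whether "closest to zero" is the right estimator on hardware with a genuine nonzero skew is a separate question that this sketch does not try to answer.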