Hi, On Mon, 5 Apr 2021 10:43:00 +0300 Paul Irofti <p...@irofti.net> wrote: > On 05.04.2021 06:13, Scott Cheloha wrote: >> On Mon, Mar 29, 2021 at 02:00:01PM +0900, YASUOKA Masahiko wrote: >>> On Thu, 25 Mar 2021 19:41:35 +0100 (CET) >>> Mark Kettenis <mark.kette...@xs4all.nl> wrote: >>>>> From: Scott Cheloha <scottchel...@gmail.com> >>>>> Date: Thu, 25 Mar 2021 13:18:04 -0500 >>>>>> On Wed, Mar 24, 2021 at 05:40:21PM +0900, YASUOKA Masahiko wrote: >>>>> Which diff did you apply? Yasuoka provided two diffs. >>>>> >>>>> In any case, ignore this diff: >>>>> >>>>>> diff --git a/sys/arch/amd64/amd64/tsc.c b/sys/arch/amd64/amd64/tsc.c >>>>>> index 238a5a068e1..3b951a8b5a3 100644 >>>>>> --- a/sys/arch/amd64/amd64/tsc.c >>>>>> +++ b/sys/arch/amd64/amd64/tsc.c >>>>>> @@ -212,7 +212,8 @@ cpu_recalibrate_tsc(struct timecounter *tc) >>>>>> u_int >>>>>> tsc_get_timecount(struct timecounter *tc) >>>>>> { >>>>>> - return rdtsc_lfence() + curcpu()->ci_tsc_skew; >>>>>> + //return rdtsc_lfence() + curcpu()->ci_tsc_skew; >>>>>> + return rdtsc_lfence(); >>>>>> } >>>>>> >>>>>> void >>>>> >>>>> >>>>> We don't want to discard the skews, that's wrong. >>> >>> I'm sorry for the confusion. >> No problem. >> >>>>> The reason it "fixes" Yasuoka's problem is because the real skews >>>>> on the ESXi VMs in question are probably close to zero but our >>>>> synchronization algorithm is picking huge (wrong) skews due to >>>>> some other variable interfering with our measurement. >>>> >>>> Right. If a VM exit happens while we're doing our measurement, you'll >>>> see a significant delay. And a guest OS can't prevent those from >>>> happening. But even on real hardware SMM mode may interfere with our >>>> measurement. >>> >>> For machines like the ESXi VMs, the measurement seems to have to >>> exclude such delayed values as outliers. I think taking a lot of >>> samples and choice the minimum is a good enough way for the purpose. >>> >>> I updated the diff. >>> >>> - delete lines for debug >>> - make tsc quality lower if skew is not good enough >>> - reduce difference from NetBSD >>> >>> comment? ok? >> If more iterations fixes your problem, great. It isn't going to make >> things worse for machines with sync'd TSCs, makes the TSC usable on >> another class of machine, and is relatively cheap. >> This is ok cheloha@. >> You need another ok, though. > > > The diff is obviously fine. But it is still a heuristic with no real > motivation except for this particular ESXi VM case. So my question > about why we choose the minimum instead of the median or the mean has > not been answered.
Because median or mean is affected by outliers. We actually see some outliers in samples from the VMware. I suppose there is a better mesure, but I am currently no idia and had not used that kind of measure in kernel. On the other hand, finding the minimum is very simple. > Another issue that I see is that people have not reported, at least > publicly, that this runs fine on their normal OpenBSD machines. Some dmesgs posted on public lists seems to have the same problem. https://marc.info/?l=openbsd-bugs&w=2&r=1&s=disabling+user+TSC&q=b https://marc.info/?l=openbsd-tech&w=2&r=1&s=disabling+user+TSC&q=b https://marc.info/?l=openbsd-ports&w=2&r=1&s=disabling+user+TSC&q=b For example, https://marc.info/?l=openbsd-bugs&m=161618496905444&w=2 |Subject: wg(4) crash |From: Stuart Henderson <stu () spacehopper ! org> |bios0: vendor Dell Inc. version "2.9.0" date 12/06/2019 |bios0: Dell Inc. PowerEdge R620 |cpu1: disabling user TSC (skew=135) |cpu1: smt 0, core 0, package 1 https://marc.info/?l=openbsd-ports&m=161306073708427&w=2 |Subject: Re: sysutils/nut README APC over USB device chgrp/chmod |From: Marcus MERIGHI <mcmer-openbsd () tor ! at> |bios0: vendor American Megatrends Inc. version "3.1" date 06/07/2018 |cpu11: disabling user TSC (skew=240) |cpu11: smt 0, core 3, package 1 these 2 are real machine and using 2 CPU sockets. https://marc.info/?l=openbsd-ports&m=161562278114172&w=2 |Subject: ruby27 vs Puppet |From: Giovanni Bechis <giovanni () paclan ! it> |bios0: vendor Phoenix Technologies LTD version "6.00" date 12/12/2018 |bios0: VMware, Inc. VMware Virtual Platform |cpu1: disabling user TSC (skew=-12705) VMware. seems the same problem of mine. I'll ask people to do the same test which cheloha@ write in previous mail. > If you think you have enough tests and you want to go ahead with the > current heuristic, then the diff is obviously fine. OK pirofti@ > >> >>> Index: sys/arch/amd64//amd64/tsc.c >>> =================================================================== >>> RCS file: /disk/cvs/openbsd/src/sys/arch/amd64/amd64/tsc.c,v >>> retrieving revision 1.23 >>> diff -u -p -r1.23 tsc.c >>> --- sys/arch/amd64//amd64/tsc.c 23 Feb 2021 04:44:30 -0000 1.23 >>> +++ sys/arch/amd64//amd64/tsc.c 29 Mar 2021 04:18:31 -0000 >>> @@ -38,6 +38,7 @@ int tsc_is_invariant; >>> #define TSC_DRIFT_MAX 250 >>> #define TSC_SKEW_MAX 100 >>> +#define TSC_SYNC_ROUNDS 1000 >>> int64_t tsc_drift_observed; >>> volatile int64_t tsc_sync_val; >>> @@ -235,6 +236,7 @@ tsc_timecounter_init(struct cpu_info *ci >>> printf("%s: disabling user TSC (skew=%lld)\n", >>> ci->ci_dev->dv_xname, (long long)ci->ci_tsc_skew); >>> tsc_timecounter.tc_user = 0; >>> + tsc_timecounter.tc_quality = -1000; >>> } >>> if (!(ci->ci_flags & CPUF_PRIMARY) || >>> @@ -314,13 +316,19 @@ tsc_read_bp(struct cpu_info *ci, uint64_ >>> void >>> tsc_sync_bp(struct cpu_info *ci) >>> { >>> + int i, val, diff; >>> uint64_t bptsc, aptsc; >>> - tsc_read_bp(ci, &bptsc, &aptsc); /* discarded - cache effects */ >>> - tsc_read_bp(ci, &bptsc, &aptsc); >>> + val = INT_MAX; >>> + for (i = 0; i < TSC_SYNC_ROUNDS; i++) { >>> + tsc_read_bp(ci, &bptsc, &aptsc); >>> + diff = bptsc - aptsc; >>> + if (abs(diff) < abs(val)) >>> + val = diff; >>> + } >>> /* Compute final value to adjust for skew. */ >>> - ci->ci_tsc_skew = bptsc - aptsc; >>> + ci->ci_tsc_skew = val; >>> } >>> /* >>> @@ -351,8 +359,10 @@ tsc_post_ap(struct cpu_info *ci) >>> void >>> tsc_sync_ap(struct cpu_info *ci) >>> { >>> - tsc_post_ap(ci); >>> - tsc_post_ap(ci); >>> + int i; >>> + >>> + for (i = 0; i < TSC_SYNC_ROUNDS; i++) >>> + tsc_post_ap(ci); >>> } >>> void