The diff is obviously fine. But it is still a heuristic with no real
motivation except for this particular ESXi VM case. So my question
about why we choose the minimum instead of the median or the mean has
not been answered.

Because the median or the mean is affected by outliers.  We actually
see some outliers in the samples from VMware.

I suppose there is a better measure, but I currently have no idea
what it would be and have not used that kind of measure in the kernel.
On the other hand, finding the minimum is very simple.
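
A minimal sketch of that minimum-of-the-absolute-value approach,
assuming the samples are already collected into a signed array (the
function and variable names here are hypothetical, not taken from the
diff):

    #include <stdint.h>
    #include <stddef.h>

    /* Return the sample whose absolute value is smallest. */
    int64_t
    min_abs_skew(const int64_t *samples, size_t n)
    {
            int64_t best = samples[0];

            for (size_t i = 1; i < n; i++) {
                    int64_t a = samples[i] < 0 ? -samples[i] : samples[i];
                    int64_t b = best < 0 ? -best : best;

                    if (a < b)
                            best = samples[i];
            }
            return best;
    }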

Using the median should take care of the outliers, though.
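
For comparison, a median is only a sort away in a userland-style
sketch (again with hypothetical names; a kernel version would need its
own sort rather than qsort(3)):

    #include <stdint.h>
    #include <stdlib.h>

    static int
    cmp_int64(const void *a, const void *b)
    {
            int64_t x = *(const int64_t *)a;
            int64_t y = *(const int64_t *)b;

            return (x > y) - (x < y);
    }

    /* Median of n signed skew samples; sorts the array in place. */
    int64_t
    median_skew(int64_t *samples, size_t n)
    {
            qsort(samples, n, sizeof(samples[0]), cmp_int64);
            return samples[n / 2];
    }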

I'm not at all convinced that taking the absolute value of the
difference makes sense.  It probably works in this case since the
actual skew on your VM is zero.  So measurements close to zero are
"good".  But what if the skew isn't zero?  Take for example an AP that
is running ahead of the BP by 5000 ticks.  In that case, the right
value for the skew is -5000.  But now imagine that the BP gets
"interrupted" while doing a measurement, resulting in a delay of 10000
ticks between the two rdtsc_lfence() calls.  That would result in a
measured skew of around zero.  And by taking the minimum of the
absolute value, you end up using that value.
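
A toy program with the numbers from that example makes the failure
concrete (purely a hypothetical illustration, not the measurement code
itself):

    #include <stdint.h>
    #include <stdio.h>

    static int64_t
    absval(int64_t v)
    {
            return v < 0 ? -v : v;
    }

    int
    main(void)
    {
            /*
             * Three clean measurements near the true skew of -5000,
             * plus one where a 10000-tick interruption on the BP
             * cancels the skew out.
             */
            int64_t samples[] = { -5003, -4998, -5001, 2 };
            size_t n = sizeof(samples) / sizeof(samples[0]);
            int64_t best = samples[0];

            for (size_t i = 1; i < n; i++)
                    if (absval(samples[i]) < absval(best))
                            best = samples[i];

            /* Prints 2: the corrupted sample wins over the real skew. */
            printf("chosen skew: %lld\n", (long long)best);
            return 0;
    }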

Exactly!
