On Mon, Mar 14, 2022 at 05:19:37PM +0100, Jan Beulich wrote:
> On 11.03.2022 14:45, Roger Pau Monné wrote:
> > On Mon, Feb 14, 2022 at 10:25:11AM +0100, Jan Beulich wrote:
> >> Use the original calibration against PIT only when the platform timer
> >> is PIT. This implicitly excludes the "xen_guest" case from using the PIT
> >> logic (init_pit() fails there, and as of 5e73b2594c54 ["x86/time: minor
> >> adjustments to init_pit()"] using_pit also isn't being set too early
> >> anymore), so the respective hack there can be dropped at the same time.
> >> This also reduces calibration time from 100ms to 50ms, albeit this step
> >> is being skipped as of 0731a56c7c72 ("x86/APIC: no need for timer
> >> calibration when using TDT") anyway.
> >>
> >> While re-indenting the PIT logic in calibrate_APIC_clock(), besides
> >> adjusting style also switch around the 2nd TSC/TMCCT read pair, to match
> >> the order of the 1st one, yielding more consistent deltas.
> >>
> >> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> >> ---
> >> Open-coding apic_read() in read_tmcct() isn't overly nice, but I wanted
> >> to avoid x2apic_enabled being evaluated twice in close succession. (The
> >> barrier is there just in case only anyway: While this RDMSR isn't
> >> serializing, I'm unaware of any statement whether it can also be
> >> executed speculatively, like RDTSC can.) An option might be to move the
> >> function to apic.c such that it would also be used by
> >> calibrate_APIC_clock().
> > 
> > I think that would make sense. Or else it's kind of orthogonal that we
> > use a barrier in calibrate_apic_timer but not in calibrate_APIC_clock.
> 
> But there is a barrier there, via rdtsc_ordered(). Thinking about
> this again, I'm not not even sure I'd like to use the helper in
> calibrate_APIC_clock(), as there's no need to have two barriers
> there.
> 
> But I guess I'll move the function in any event, so it at least
> feels less like a layering violation. But I still would want to
> avoid calling apic_read(), i.e. the function would remain as is
> (albeit perhaps renamed as becoming non-static).
> 
> > But maybe we can get rid of the open-coded PIT calibration in
> > calibrate_APIC_clock? (see below)
> > 
> >> --- a/xen/arch/x86/time.c
> >> +++ b/xen/arch/x86/time.c
> >> @@ -26,6 +26,7 @@
> >>  #include <xen/symbols.h>
> >>  #include <xen/keyhandler.h>
> >>  #include <xen/guest_access.h>
> >> +#include <asm/apic.h>
> >>  #include <asm/io.h>
> >>  #include <asm/iocap.h>
> >>  #include <asm/msr.h>
> >> @@ -1004,6 +1005,78 @@ static u64 __init init_platform_timer(vo
> >>      return rc;
> >>  }
> >>  
> >> +static uint32_t __init read_tmcct(void)
> >> +{
> >> +    if ( x2apic_enabled )
> >> +    {
> >> +        alternative("lfence", "mfence", X86_FEATURE_MFENCE_RDTSC);
> >> +        return apic_rdmsr(APIC_TMCCT);
> >> +    }
> >> +
> >> +    return apic_mem_read(APIC_TMCCT);
> >> +}
> >> +
> >> +static uint64_t __init read_pt_and_tmcct(uint32_t *tmcct)
> >> +{
> >> +    uint32_t tmcct_prev = *tmcct = read_tmcct(), tmcct_min = ~0;
> >> +    uint64_t best = best;
> >> +    unsigned int i;
> >> +
> >> +    for ( i = 0; ; ++i )
> >> +    {
> >> +        uint64_t pt = plt_src.read_counter();
> >> +        uint32_t tmcct_cur = read_tmcct();
> >> +        uint32_t tmcct_delta = tmcct_prev - tmcct_cur;
> >> +
> >> +        if ( tmcct_delta < tmcct_min )
> >> +        {
> >> +            tmcct_min = tmcct_delta;
> >> +            *tmcct = tmcct_cur;
> >> +            best = pt;
> >> +        }
> >> +        else if ( i > 2 )
> >> +            break;
> >> +
> >> +        tmcct_prev = tmcct_cur;
> >> +    }
> >> +
> >> +    return best;
> >> +}
> >> +
> >> +uint64_t __init calibrate_apic_timer(void)
> >> +{
> >> +    uint32_t start, end;
> >> +    uint64_t count = read_pt_and_tmcct(&start), elapsed;
> >> +    uint64_t target = CALIBRATE_VALUE(plt_src.frequency), actual;
> >> +    uint64_t mask = (uint64_t)~0 >> (64 - plt_src.counter_bits);
> >> +
> >> +    /*
> >> +     * PIT cannot be used here as it requires the timer interrupt to 
> >> maintain
> >> +     * its 32-bit software counter, yet here we run with IRQs disabled.
> >> +     */
> > 
> > The reasoning in calibrate_APIC_clock to have interrupts disabled
> > doesn't apply anymore I would think (interrupts are already enabled
> > when we get there),
> 
> setup_boot_APIC_clock() disables IRQs before calling
> calibrate_APIC_clock(). Whether the reasoning still applies is hard
> to tell - I at least cannot claim I fully understand the concern.

Me neither, I'm not sure what will explicitly need the first
interrupt, and why further interrupts won't be fine.

Also interrupts are already enabled before calling
calibrate_APIC_clock() (as it's setup_boot_APIC_clock() that disables
them), so this whole thing about getting the first interrupt seems
very bogus and plain wrong.

> > and hence it seems to me that calibrate_APIC_clock
> > could be called with interrupts enabled and we could remove the
> > open-coded usage of the PIT in calibrate_APIC_clock.
> 
> I won't exclude this might be possible, but it would mean changing
> a path which is hardly ever used nowadays. While on one hand this
> means hardly anyone might notice, otoh it also means possible
> breakage might not be noticed until far in the future. It anyway
> feels too much for a single change to also alter calibration against
> PIT right here.

You are already changing this path by using a clocksource different
than PIT to perform the calibration.

> One thing seems quite clear though: Doing any of this with interrupts
> enabled increases the chances for the read pairs to not properly
> correlate, due to an interrupt happening in the middle. This alone is
> a reason for me to want to keep IRQs off here.

Right, TSC calibration is also done with interrupts disabled, so it
does seem correct to do the same here for APIC.

Maybe it would be cleaner to hide the specific PIT logic in
calibrate_apic_timer() so that we could remove get_8254_timer_count()
and wait_8254_wraparound() from apic.c and apic.c doesn't have any PIT
specific code anymore?

I think using channel 2 like it's used for the TSC calibration won't
be possible at this point, since it will skew read_pit_count() users?
In any case if we disable interrupts those will already be skewed
because the timer won't be rearmed until interrupts are enabled.

Thanks, Roger.

Reply via email to