2017-10-05 18:54-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng...@hotmail.com>
> 
> The description in the Intel SDM of how the divide configuration
> register is used: "The APIC timer frequency will be the processor's bus
> clock or core crystal clock frequency divided by the value specified in
> the divide configuration register."
> 
> Observation on bare metal shows that when the TDCR is changed, the TMCCT
> does not change or make a big jump in value, but the rate at which it
> counts down changes.
> 
> This patch updates the APIC timer emulation so that a change to the
> divide configuration is reflected in the value of the counter and in
> when the next interrupt is triggered.
> 
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Radim Krčmář <rkrc...@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng...@hotmail.com>
> ---
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> @@ -1458,6 +1458,36 @@ static void start_sw_period(struct kvm_lapic *apic)
>               HRTIMER_MODE_ABS_PINNED);
>  }
>  
> +static bool update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
> +{
> +     ktime_t now, remaining;
> +     u64 tscl = rdtsc(), delta;
> +
> +     now = ktime_get();
> +     remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
> +     if (ktime_to_ns(remaining) < 0)
> +             remaining = 0;
> +     delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);

Hm, can this happen?

> +     if (!delta)
> +             return false;
> +
> +     apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
> +             * APIC_BUS_CYCLE_NS * apic->divide_count;

I think that it would be safer to always modify the period.
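
To illustrate the ordering I have in mind, here is a minimal userspace
sketch, not kernel code.  struct toy_timer, toy_rescale() and
TOY_BUS_CYCLE_NS are made-up names, and the one-nanosecond bus cycle is an
assumption.  The period is recomputed for the new divisor before any early
return, and only then is the remaining time rescaled:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BUS_CYCLE_NS 1ULL	/* assumption: one bus cycle per nanosecond */

/* Hypothetical stand-in for the timer state kept in struct kvm_lapic. */
struct toy_timer {
	uint64_t period_ns;	/* full period at the current divisor */
	uint64_t remaining_ns;	/* time left until the next expiration */
};

/*
 * Recompute the period for the new divisor first, then rescale the
 * remaining time.  Returns false when there is nothing left to rescale.
 */
static bool toy_rescale(struct toy_timer *t, uint32_t tmict,
			uint32_t old_div, uint32_t new_div)
{
	assert(old_div != 0);

	/* Updated unconditionally, before any early return. */
	t->period_ns = (uint64_t)tmict * TOY_BUS_CYCLE_NS * new_div;

	if (t->remaining_ns == 0 || t->period_ns == 0)
		return false;

	/* The counter ticks new_div / old_div times slower (or faster). */
	t->remaining_ns = t->remaining_ns * new_div / old_div;
	return true;
}
```

With an initial count of 1000 and a divisor change from 2 to 8, the period
goes from 2000 ns to 8000 ns and 500 ns of remaining time becomes 2000 ns.
If no time remains, the function returns false but the period is still
8000 ns.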

> +     delta = delta * apic->divide_count / old_divisor;
> +
> +     if (!apic->lapic_timer.period)
> +             return false;
> +
> +     limit_periodic_timer_frequency(apic);
> +
> +     apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
> +             nsec_to_cycles(apic->vcpu, delta);

We could do that without rdtsc() for added precision and maybe
performance:

        apic->lapic_timer.tscdeadline += nsec_to_cycles(apic->vcpu, delta) -
                                         nsec_to_cycles(apic->vcpu, remaining);

        // not sure how a negative operand would behave:
        // nsec_to_cycles(apic->vcpu, delta - remaining)
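
A userspace sketch of why I would convert the two operands separately.
toy_ns_to_cycles() is a hypothetical stand-in for nsec_to_cycles() that
assumes a fixed rate in kHz and an unsigned 64-bit argument, as I believe
the real helper takes:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nsec_to_cycles(): ns to cycles at a fixed kHz rate. */
static uint64_t toy_ns_to_cycles(uint64_t ns, uint64_t khz)
{
	return ns * khz / 1000000ULL;
}

/*
 * Move a deadline from "old_ns from now" to "new_ns from now".
 * Each operand is converted on its own, so both conversions see
 * small non-negative values.  The subtraction may wrap, but unsigned
 * wrap-around still yields the right deadline after the addition.
 */
static uint64_t toy_adjust_deadline(uint64_t deadline, uint64_t old_ns,
				    uint64_t new_ns, uint64_t khz)
{
	assert(khz != 0);

	return deadline + toy_ns_to_cycles(new_ns, khz)
			- toy_ns_to_cycles(old_ns, khz);
}
```

At 2 GHz, shrinking the remaining time from 4000 ns to 2000 ns moves a
deadline of 10000000 cycles to 9996000.  Passing the wrapped difference
(uint64_t)2000 - 4000 to the conversion instead gives a huge cycle count,
because the multiply and divide operate on the wrapped value.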

> +     apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
> +
> +     return true;
> +}
> +
>  static bool set_target_expiration(struct kvm_lapic *apic)
>  {
>       ktime_t now;
> @@ -1750,13 +1780,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>               start_apic_timer(apic);
>               break;
>  
> -     case APIC_TDCR:
> +     case APIC_TDCR: {
> +             uint32_t old_divisor = apic->divide_count;
> +
>               if (val & 4)
>                       apic_debug("KVM_WRITE:TDCR %x\n", val);
>               kvm_lapic_set_reg(apic, APIC_TDCR, val);
>               update_divide_count(apic);
> +             if (apic->divide_count != old_divisor) {
> +                     hrtimer_cancel(&apic->lapic_timer.timer);
> +                     if (update_target_expiration(apic, old_divisor))
> +                             restart_apic_timer(apic);

I think we can lose a timer here when we cancel a hrtimer whose
expiration time passes before update_target_expiration(), so it never
gets restarted.

Doing restart_apic_timer() unconditionally seems better.  It behaves
well if we try to restart a timer that has already fired.

Thanks.

> +             }
>               break;
> -
> +     }
>       case APIC_ESR:
>               if (apic_x2apic_mode(apic) && val != 0) {
>                       apic_debug("KVM_WRITE:ESR not zero %x\n", val);
> -- 
> 2.7.4
> 
