Glauber de Oliveira Costa wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Folks,
>
> Here's an updated version of the kvm paravirt clock infrastructure. Some
> of the issues you all raised were addressed, some were not. This work
> basically focus on general issues / clocksource, so don't expect to see
> much new things here in the clock events area.
>
>
Me like.
Please separate the guest and host parts; this allows backporting the
guest part to older kernels.
> +
> +/* This is our clockevents structure. We only support one shot operation */
> +static struct clock_event_device kvm_clockevent = {
> + .name = "kvm-timer",
> + .features = CLOCK_EVT_FEAT_ONESHOT,
> + .shift = 0,
> + .mult = 1,
> + .max_delta_ns = 0xffffffff,
> + .min_delta_ns = 1000000,
>
Why this particular number? Doesn't it depend on the host's clocksource?
> @@ -1564,6 +1566,36 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
> }
> EXPORT_SYMBOL_GPL(kvm_emulate_halt);
>
> +static struct kvm_hv_clock hv_clock;
> +
> +static void kvm_write_guest_time(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct timespec ts;
> + void *clock_addr;
> +
> + if ((!kvm->clock_page) || !(kvm->clock_gpa))
> + return;
>
Why two checks?
> +
> + clock_addr = kmap(kvm->clock_page);
>
kmap_atomic() is much faster and more akpm-resistant.
> +
> + /* Updates version to the next odd number, indicating we're writing */
> + hv_clock.version++;
> + /* Updating the tsc count is the first thing we do */
> + kvm_get_msr(vcpu, MSR_IA32_TIME_STAMP_COUNTER, &hv_clock.last_tsc);
> + ktime_get_ts(&ts);
> + hv_clock.now_ns = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec;
> + hv_clock.wc_sec = get_seconds();
> + hv_clock.version++;
> + WARN_ON(hv_clock.version & 1);
>
smp_wmb()s are missing here.
The guest can trigger the WARN_ON() by being naughty with .version.
> +
> + memcpy(clock_addr, &hv_clock, sizeof(hv_clock));
> + mark_page_dirty(kvm, kvm->clock_gpa >> PAGE_SHIFT);
> + kunmap(kvm->clock_page);
> +
> + kvm->time_needs_update = 0;
> +}
> +
> int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> {
> unsigned long nr, a0, a1, a2, a3, ret;
> @@ -1584,7 +1616,37 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> a3 &= 0xFFFFFFFF;
> }
>
> + ret = 0;
> switch (nr) {
> + case KVM_HCALL_REGISTER_CLOCK: {
> + if (!irqchip_in_kernel(vcpu->kvm)) {
> + ret = -1;
>
Return a proper KVM error.
> + break;
> + }
> +
> + vcpu->kvm->clock_gpa = a0;
>
i386 gpas are 64-bit. Better define it as a gfn (that absolves you of
checking alignment too).
> + vcpu->kvm->clock_page = gfn_to_page(vcpu->kvm, a0 >>
> PAGE_SHIFT);
> +
>
You're touching shared state here. Can probably cause problems if the
guest turns nasty.
> static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
> diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
> index c6f3fd8..8dec29f 100644
> --- a/include/asm-x86/kvm_para.h
> +++ b/include/asm-x86/kvm_para.h
> @@ -10,9 +10,12 @@
> * paravirtualization, the appropriate feature bit should be checked.
> */
> #define KVM_CPUID_FEATURES 0x40000001
> +#define KVM_FEATURE_CLOCKEVENTS 0
> +#define KVM_FEATURE_CLOCKSOURCE 1
>
> #ifdef __KERNEL__
> #include <asm/processor.h>
> +extern void kvmclock_init(void);
>
> /* This instruction is vmcall. On non-VT architectures, it will generate a
> * trap that we will then rewrite to the appropriate instruction.
> @@ -29,6 +32,26 @@
> * noted by the particular hypercall.
> */
>
> +#define KVM_HCALL_REGISTER_CLOCK 0
>
Let's start hypercall numbers at 1 to trap some cases on uninitialized
data coming that way.
> long ret;
> diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
> index e4db25f..4c2afd0 100644
> --- a/include/linux/kvm_para.h
> +++ b/include/linux/kvm_para.h
> @@ -9,6 +9,10 @@
> * - kvm_para_available
> */
>
> +#define KVM_CPUID_FEATURES 0x40000001
> +#define KVM_FEATURE_CLOCKEVENTS 0
> +#define KVM_FEATURE_CLOCKSOURCE 1
> +
> /* Return values for hypercalls */
> #define KVM_ENOSYS 1000
>
Need to expose the capability to userspace as well via
KVM_CHECK_EXTENSION so userspace will know to expose it to the guest.
Finally, how about making this per-vcpu? That solves problems with tscs
running differently on different cpus, you don't need the wmb()s on the
host side, is faster for the guest due to less cacheline bouncing, I
could go on forever.
--
error compiling committee.c: too many arguments to function
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
kvm-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kvm-devel