Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Avi Kivity
Gerd Hoffmann wrote:
> Hollis Blanchard wrote:
>   
>> That's a good point, but does PIT/HPET emulation show up as a hot spot
>> in any profiles? I think keeping the hypercall API as small as feasible
>> is a desirable design goal.
>> 
>
> pit probably doesn't due to being rarely updated.  For hpet I'd expect
> it showing up much more.  Asking for current time is a quite frequent
> operation in the linux kernel.  I've had X86_CR4_TSD set for a while and
> saw *lots* of rdtsc traps then ...
>
>   

Good point.


-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Dong, Eddie
[EMAIL PROTECTED] wrote:
> Hollis Blanchard wrote:
>> That's a good point, but does PIT/HPET emulation show up as a hot
>> spot in any profiles? I think keeping the hypercall API as small as
>> feasible is a desirable design goal. 
>> 
> 
> At the moment I think not.  Even a dyntick kernel would only require a
> few thousands exits per second worst case.
> 
Yes, the key issue is old PIT IRQ handler which is not very
virtualization
friendly.
Eddie

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Avi Kivity
Glauber de Oliveira Costa wrote:
> I think I'll stick to two fields then, with u64 for seconds and nanoseconds.
> Or would it just be better to use a u64 nanoseconds field?
>
> A _lot_ of time should have passed before such counter overflows. So
> having two is possibly overkill

u64 nanoseconds looks good to me.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Avi Kivity
Hollis Blanchard wrote:
> That's a good point, but does PIT/HPET emulation show up as a hot spot
> in any profiles? I think keeping the hypercall API as small as feasible
> is a desirable design goal.
>   

At the moment I think not.  Even a dyntick kernel would only require a 
few thousands exits per second worst case.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Dong, Eddie

+   case KVM_HCALL_SET_ALARM: {
+   ktime_t kt;
+   kt = ktime_add_ns(ktime_get_real(), a0);

This one will cause accumulated time shift. a0 is the delta time
when the guest calculate.

+   hrtimer_start(&vcpu->oneshooter, kt, HRTIMER_MODE_ABS);
+   break;
+   }

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Gerd Hoffmann
Hollis Blanchard wrote:
> That's a good point, but does PIT/HPET emulation show up as a hot spot
> in any profiles? I think keeping the hypercall API as small as feasible
> is a desirable design goal.

pit probably doesn't due to being rarely updated.  For hpet I'd expect
it showing up much more.  Asking for current time is a quite frequent
operation in the linux kernel.  I've had X86_CR4_TSD set for a while and
saw *lots* of rdtsc traps then ...

cheers,
  Gerd

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Gerd Hoffmann
Glauber de Oliveira Costa wrote:
> Well, my previous understanding was that if the CPU marks the tsc as
> stable, it _won't_ stop even in C3, and it was done this way exactly
> to make sure there are a stable source for timing.

Other way around:  Kernel can mark the TSC unstable if it figures it
does not tick on a constant rate (or if the kernel knows for certain
hardware it will not ...).

C3 on Intel hardware seems to be a known unstable case:

zweiblum kraxel ~# grep -A2 -B2 C3
/export/git/linux-2.6/arch/x86_64/kernel/tsc.c
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
#ifdef CONFIG_ACPI
/* But TSC doesn't tick in C3 so don't use it there */
if (acpi_gbl_FADT.header.length > 0 &&
acpi_gbl_FADT.C3latency < 1000)
return 1;
#endif


> But it does not really matter. We have two options:
> * Not considering tsc stable at all if sleeps, and then using the tsc
> just for adjustments.

Using the tsc for adjustments should do fine.

Xen updates the guest paravirt time with both systime and corrosponding
tsc timestamp, and the guest then uses the tsc delta to get more precise
time without having to ask the hypervisor each time.

If we take that scheme and additionally take care to update paravirt
time before re-running the guest after it went sleep for a while (and
thus the host might have been in C3) we should be set, right?

> * Considering the tsc stable, but taking the time the cpu spend
> sleeping into account, and returning some sorf of "return tsc +
> total_time_spent_sleeping_in_so_deep_sleep_states_such_as_C3" . I
> specially liked the naming.

I suspect that scheme would quickly becomes quite complex.  For starters
consider that the TSC-stops-in-C3 behavior seems to apply to Intel
Hardware only ...

cheers,
  Gerd



-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Avi Kivity
Glauber de Oliveira Costa wrote:
> On 10/15/07, Avi Kivity <[EMAIL PROTECTED]> wrote:
>   
>> If we want to make the clocksource useful for Windows we need to go
>> through the apic as that will allow the TPR to mask the interrupt when
>> Windows isn't ready to receive it.  However I don't know whether it is
>> possible to integrate a paravirt clocksource into Windows.
>> 
>
> My politically-correct side told me to say that the solution that
> works also in Windows is the best one ;-)
>
>   

In this case, I'm not at all sure Windows can benefit from a paravirt
timesource, so it might be better to concentrate on Linux. I'll ask the
Windows people here.

>> For Linux, we might as well go further and provide a completely
>> paravirtualized irqchip instead of trying to integrate a paravirtualized
>> clocksource into a fullyvirtualized irqchip.
>>
>> 
>
> Yes, definitely. In your opinion, what's the size of the gap between
> the "we can", and the "we should" here?
>   

At present the APIC requires an exit for each level-triggered interrupt.
We could reduce the number of exits to zero given sufficient cleverness.
This would be beneficial to non-streaming I/O workloads.

So yes, I think it's worthwhile.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Avi Kivity
Glauber de Oliveira Costa wrote:
> On 10/15/07, Gerd Hoffmann <[EMAIL PROTECTED]> wrote:
>   
>> Host should know.  Well, I hope.  Dunno whenever one really can be sure
>> in all cases given all the different CPUs and tsc implementations.
>> 
>
> In the same way we can be sure about hardware for everything else:
> Well... not being _quite_ sure ;-)
>
>   

It's important to know.  But in case the host knows the tsc doesn't
satisfy the feature's requirements, the host simply disables the feature.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Avi Kivity
Glauber de Oliveira Costa wrote:

 Obviously people have figured out how to do dynticks on real x86
 hardware, so I don't accept this reason. :)

 
>>> Using more advanced timers like the HPET.
>>>
>>>   
>> I'd think there is no reason for virtual timer to be PIT like, which
>> is mostly due to historic reason.
>>
>> If we need to make it close to native timer device, HPET is much
>> better than PIT for virtual time to look like.
>>
>> 
>
> Besides of course, the fact that you'd still have to keep the PIT
> timer around, for this same historical reason: It lives in an
> emulation layer that can't make too much assumptions about what's
> running on it, and it may well use the PIT exclusively. And we end up
> with two code bases to maintain.
>
> Obviously, as everything else, it's a trade-off. If the only benefit
> of an HPET like timer is its programability, then the paravirt timer
> solves this problem with less effort. If there are others, it might
> well be worth it.
>
>   

Using the HPET means older kernels (and other operating systems)
immediately benefit.  It also means we don't have to design the
interface.  While it's usually harder to do full emulation, and you get
worse performance than paravirt, it has very real benefits.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-16 Thread Avi Kivity
Glauber de Oliveira Costa wrote:
>>>   
>> Here you're passing a virtual address across the guest/host interface,
>> which is something we've previously agreed is bad.
>> 
> And that's the reason for the "nothing". It was just the easier way,
> and I'm not aware of any discussion in which it was agreed bad. I'll
> dig into the archives, but it would be a lot easier if you give me
> some pointers ;-)
>   

On x86, translating a virtual address in the hypervisor is merely
annoying.  On architectures which don't have hardware-walked page tables
(merely software-managed tlbs), translation is impossible.  So when you
talk to the hypervisor, use physical addresses exclusively.

>   
>> Are you intending this shared page to contain other data in the future,
>> or is it just poorly named?
>> 
>
> Well, this is still a work in progress, so the accurate answer should
> be: Something in the middle, between "more data", and "poorly named".
>
> I think more data can be in, but right now, everything I intend to
> (maybe) put in is timer related, so a rename could go well anyway.
>
>   

Having granular interfaces (one cpuid bit, one shared data structure per
feature is the best way of insulating ourselves against ABI
"accidents".  If we find out later the interface is bad, we throw it
away and implement a new one, without affecting others.

>>> +struct kvm_hv_clock {
>>> +   int stable_tsc; /* use raw tsc for clock_read */
>>> +   unsigned long tsc_mult;
>>> +   struct timespec now;
>>> +   u64 last_tsc;
>>> +   /* That's the wall clock, not the water closet */
>>> +   struct timespec wc;
>>> +};
>>>   
>> This isn't even good for x86: you're using "long" in a binary interface!
>> You're also using timespec, and what sort of binary compatibility
>> guarantees would you like to make about that?
>>
>> 
>
> Binary guarantees goes in version two ;-)
> But c'mon, now that I'm starting to be able to make some merely
> decents puns in english, I put one very well placed there, and you
> come telling me about binary compatiblity?
>
> Now seriously, the long is just my bad, but I'm not aware of any
> complications regarding the timespec. You are welcome to enlighten me.
> (Meanwhile, I'll go look at it)
>   

We want this to be usable in non-Linux and cross-bitness.  So:
- all primitive data types are __u32, __u64 and friends (not u32 or u64)
- all structures and constants are prefixed with kvm_
- all fields are naturally aligned
- add padding on structures larger than 8 bytes to be a multiple of 8 bytes

Following these simple rules is the key to health and happiness.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/15/07, Hollis Blanchard <[EMAIL PROTECTED]> wrote:
> (This mail was sent off-list, so I'll do the same.)

Oh, I'm sorry Hollis, after so many mails, I've hit the reply key
instead of  the reply to all one ;-)

I'm moving it to the list again, where it belongs.

> On Mon, 2007-10-15 at 14:29 -0300, Glauber de Oliveira Costa wrote:
> > >
> > > I must have missed earlier discussion on this topic, so I'm left
> > > wondering... what's the point? What's wrong with PIT (et al) emulation?
> > >
> > Well, my reasons are:
> >
> > * Steal time (despite the fact I've not addressed in this patch. But
> > c'mon guys, one thing at a time! ;-))
>
> Yes, I agree this is important, but I'm not sure you need a new time
> source to do that.

And how would you do it ?

Only the host knows how much time you spent not running. So you either
have an alternate clocksource, or goes tell the device about it. Which
yeah, may be more general, but it is quite cumbersome, IMHO.

> > * Performance: Although I've not measured it (for hpet we'd have to
> > have an implementation , in first place), I think that handling timer
> > events in the host might be more efficient than userspace exits for
> > pit emulation.
>
> Host emulation doesn't require guest paravirtualization.
>
> However, Avi makes a good point about the number of MMIO exits, and I've
> replied on-list.

Yes, this is what I meant.

> > * This is not exactly one that hurts me directly, but I saw come
> > complaints frome some people in kvm-devel saying that they needed to
> > run ntp in all the guests, and I felt sad for them. ;-) Paravirt
> > timers make this task easier, IMHO, although maybe something can be
> > done in the emulation layer for it.
>
> It's not clear to me that PV time would by itself solve any NTP issues.

host provides the guest with the idea of timing.

if the time in the host changes, the guest will get this update for free.

> To clarify, I'm not saying PV time is a bad idea. I'm just trying to
> understand why it's a good idea...

If you try to understand why it's a good idea, and I try to understand
why it's a bad idea, we'll hopefully come to a good middle ground, and
everybody wins ;-)

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
> The basic issue is that on many architectures, the hardware MMU does
> *not* hold all mappings, and even if it does, performing MMU translation
> in software can be prohibitively complicated and slow. As a worst-case
> scenario, consider a software-controlled TLB which can hold a small
> finite number of simultaneous mappings. On the embedded PowerPC 440
> cores I'm working on, that number is 64. There is no hardware table
> walker.
>
> That means that if the hypervisor is passed a virtual address, it may
> not be present in the MMU at all. Ordinarily the hardware would  deliver
> a TLB miss interrupt to the kernel, and the kernel would insert the
> appropriate mapping (as calculated or found in software-only data
> structures). However, consider the locking nightmare if the hypervisor
> delivers an interrupt while "inside" a guest hypercall...

that makes a lot of sense, thank you for the explanation.

> Longs everywhere. :) And not just that, but you are cementing this
> particular layout of timespec into the guest ABI. There is nothing to
> prevent Linux from changing timespec in some future version...

Yeah, have figured it out.
I think I'll stick to two fields then, with u64 for seconds and nanoseconds.
Or would it just be better to use a u64 nanoseconds field?

A _lot_ of time should have passed before such counter overflows. So
having two is possibly overkill.

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Hollis Blanchard
On Mon, 2007-10-15 at 18:47 +0200, Avi Kivity wrote:
> Hollis Blanchard wrote:
> >>
> >> I'd think there is no reason for virtual timer to be PIT like, which
> >> is mostly due to historic reason.
> >>
> >> If we need to make it close to native timer device, HPET is much
> >> better than PIT for virtual time to look like.
> >> 
> >
> > Fine, so why do we need PV time when we can emulate the HPET?
> >   
> 
> A pv timebase can require zero exits while any emulated timebase will 
> require at least one.

That's a good point, but does PIT/HPET emulation show up as a hot spot
in any profiles? I think keeping the hypercall API as small as feasible
is a desirable design goal.

-- 
Hollis Blanchard
IBM Linux Technology Center


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Hollis Blanchard
On Mon, 2007-10-15 at 14:48 -0300, Glauber de Oliveira Costa wrote:
> On 10/12/07, Hollis Blanchard <[EMAIL PROTECTED]> wrote:
> > > +   if (kvm_hypercall1(KVM_HCALL_SET_SHARED_PAGE, shared_page))
> > > +   return;
> >
> > Here you're passing a virtual address across the guest/host interface,
> > which is something we've previously agreed is bad.
> And that's the reason for the "nothing". It was just the easier way,
> and I'm not aware of any discussion in which it was agreed bad. I'll
> dig into the archives, but it would be a lot easier if you give me
> some pointers ;-)

Here's the original thread:
http://www.mail-archive.com/kvm-devel@lists.sourceforge.net/msg00755.html

The basic issue is that on many architectures, the hardware MMU does
*not* hold all mappings, and even if it does, performing MMU translation
in software can be prohibitively complicated and slow. As a worst-case
scenario, consider a software-controlled TLB which can hold a small
finite number of simultaneous mappings. On the embedded PowerPC 440
cores I'm working on, that number is 64. There is no hardware table
walker.

That means that if the hypervisor is passed a virtual address, it may
not be present in the MMU at all. Ordinarily the hardware would  deliver
a TLB miss interrupt to the kernel, and the kernel would insert the
appropriate mapping (as calculated or found in software-only data
structures). However, consider the locking nightmare if the hypervisor
delivers an interrupt while "inside" a guest hypercall...

> for the overall picture, it does not really make a difference, so if
> this is all that a bad idea, I can change it, no big deal.

Great! This is very important for portability to non-x86 architectures,
and a long-standing problem on another open source hypervisor, which is
why I'm eager to avoid it from the beginning here.

> > > +struct kvm_hv_clock {
> > > +   int stable_tsc; /* use raw tsc for clock_read */
> > > +   unsigned long tsc_mult;
> > > +   struct timespec now;
> > > +   u64 last_tsc;
> > > +   /* That's the wall clock, not the water closet */
> > > +   struct timespec wc;
> > > +};
> >
> > This isn't even good for x86: you're using "long" in a binary interface!
> > You're also using timespec, and what sort of binary compatibility
> > guarantees would you like to make about that?
> 
> Now seriously, the long is just my bad, but I'm not aware of any
> complications regarding the timespec. You are welcome to enlighten me.
> (Meanwhile, I'll go look at it)

typedef long__kernel_time_t;
typedef __kernel_time_t time_t;
struct timespec {
time_t  tv_sec; /* seconds */
longtv_nsec;/* nanoseconds */
};

Longs everywhere. :) And not just that, but you are cementing this
particular layout of timespec into the guest ABI. There is nothing to
prevent Linux from changing timespec in some future version...

-- 
Hollis Blanchard
IBM Linux Technology Center


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/15/07, Avi Kivity <[EMAIL PROTECTED]> wrote:
> If we want to make the clocksource useful for Windows we need to go
> through the apic as that will allow the TPR to mask the interrupt when
> Windows isn't ready to receive it.  However I don't know whether it is
> possible to integrate a paravirt clocksource into Windows.

My politically-correct side told me to say that the solution that
works also in Windows is the best one ;-)

> For Linux, we might as well go further and provide a completely
> paravirtualized irqchip instead of trying to integrate a paravirtualized
> clocksource into a fullyvirtualized irqchip.
>

Yes, definitely. In your opinion, what's the size of the gap between
the "we can", and the "we should" here?

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/15/07, Carsten Otte <[EMAIL PROTECTED]> wrote:
> Gerd Hoffmann wrote:
> > With VT you can attempt to make that invisible to the guest using the
> > tsc offset field.  Probably svm can do that too (didn't check docs
> > though).  kvm-lite can't (what is the status btw?).  Xen "solves" that
> > by not doing power management *evil grin*.
> On s390, we've had an instruction for that ever since (store clock),
> that works with an offset field with hardware support. Works great for us.
I've heard s390 also have some special instructions to solve the
halting problem. ;-)

> If VT and SVM can do the same, I'd vote for using it if we can.

I was in the opposite side of gerd: I knew svm did it, not sure about vt ;-)

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/15/07, Gerd Hoffmann <[EMAIL PROTECTED]> wrote:
>
> Host should know.  Well, I hope.  Dunno whenever one really can be sure
> in all cases given all the different CPUs and tsc implementations.

In the same way we can be sure about hardware for everything else:
Well... not being _quite_ sure ;-)

> With VT you can attempt to make that invisible to the guest using the
> tsc offset field.  Probably svm can do that too (didn't check docs
> though).  kvm-lite can't (what is the status btw?).  Xen "solves" that
> by not doing power management *evil grin*.

Well, my previous understanding was that if the CPU marks the tsc as
stable, it _won't_ stop even in C3, and it was done this way exactly
to make sure there are a stable source for timing.

But it does not really matter. We have two options:
* Not considering tsc stable at all if sleeps, and then using the tsc
just for adjustments.
* Considering the tsc stable, but taking the time the cpu spend
sleeping into account, and returning some sorf of "return tsc +
total_time_spent_sleeping_in_so_deep_sleep_states_such_as_C3" . I
specially liked the naming.

> Nevertheless it is probably much easier to go with pv timers (or maybe
> emulate hpet timers).

At this time, I am for pv timers. (Why would I have written it in the
first place??). But indeed, a discussion about the emulation
alternative is healthy.

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/15/07, Avi Kivity <[EMAIL PROTECTED]> wrote:
> Gerd Hoffmann wrote:
> >
> >> 2) the TSC would have to be used as a clocksource.  You don't know the
> >> frequency which is the first problem with using the TSC but some systems
> >> have a TSC that changes frequencies.
> >>
> >
> > Also note the tsc may stop ticking if the CPU goes sleep in C3, which
> > IMHO makes the tsc almost useless as clocksource for guests ...
> >
>
> But the host knows that, right?  So it can update the guest's timebase?

Yes, that's the whole point.



-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/15/07, Dong, Eddie <[EMAIL PROTECTED]> wrote:
>
> >>> 1) the PIT is periodic.  a PV timer can offer a one shot
> >timer which
> >>> enables dynticks.
> >>>
> >>
> >> Obviously people have figured out how to do dynticks on real x86
> >> hardware, so I don't accept this reason. :)
> >>
> >
> >Using more advanced timers like the HPET.
> >
>
> I'd think there is no reason for virtual timer to be PIT like, which
> is mostly due to historic reason.
>
> If we need to make it close to native timer device, HPET is much
> better than PIT for virtual time to look like.
>

Besides of course, the fact that you'd still have to keep the PIT
timer around, for this same historical reason: It lives in an
emulation layer that can't make too much assumptions about what's
running on it, and it may well use the PIT exclusively. And we end up
with two code bases to maintain.

Obviously, as everything else, it's a trade-off. If the only benefit
of an HPET like timer is its programability, then the paravirt timer
solves this problem with less effort. If there are others, it might
well be worth it.

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/12/07, Hollis Blanchard <[EMAIL PROTECTED]> wrote:
>
> What makes you think this is page-aligned?

Nothing.

> > +   if (kvm_hypercall1(KVM_HCALL_SET_SHARED_PAGE, shared_page))
> > +   return;
>
> Here you're passing a virtual address across the guest/host interface,
> which is something we've previously agreed is bad.
And that's the reason for the "nothing". It was just the easier way,
and I'm not aware of any discussion in which it was agreed bad. I'll
dig into the archives, but it would be a lot easier if you give me
some pointers ;-)

for the overall picture, it does not really make a difference, so if
this is all that a bad idea, I can change it, no big deal.

> Are you intending this shared page to contain other data in the future,
> or is it just poorly named?

Well, this is still a work in progress, so the accurate answer should
be: Something in the middle, between "more data", and "poorly named".

I think more data can be in, but right now, everything I intend to
(maybe) put in is timer related, so a rename could go well anyway.

> > +struct kvm_hv_clock {
> > +   int stable_tsc; /* use raw tsc for clock_read */
> > +   unsigned long tsc_mult;
> > +   struct timespec now;
> > +   u64 last_tsc;
> > +   /* That's the wall clock, not the water closet */
> > +   struct timespec wc;
> > +};
>
> This isn't even good for x86: you're using "long" in a binary interface!
> You're also using timespec, and what sort of binary compatibility
> guarantees would you like to make about that?
>

Binary guarantees goes in version two ;-)
But c'mon, now that I'm starting to be able to make some merely
decents puns in english, I put one very well placed there, and you
come telling me about binary compatiblity?

Now seriously, the long is just my bad, but I'm not aware of any
complications regarding the timespec. You are welcome to enlighten me.
(Meanwhile, I'll go look at it)


-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/12/07, Hollis Blanchard <[EMAIL PROTECTED]> wrote:
> > 2) the TSC would have to be used as a clocksource.  You don't know the
> > frequency which is the first problem with using the TSC but some systems
> > have a TSC that changes frequencies.  A PV time source gives you more
> > stable clocksource (although as in glommer's patch, when the TSC can be
> > used, it's better to use it).
>
> As I understand it, the TSC is based on CPU frequency, which changes
> with power management. Architectural bug.
No. The TSC _used_ to be tied to CPU frequency. Current architectural
trends keep it constant, and provides other methods for cycle
counting. So in newer machines, the tsc should be constant.

> However, PV time still doesn't help here:
>   * The TSC is _user_ accessible, so PV time support in the guest
> kernel doesn't solve the problem.
I'm not sure about which (of them) problem you are talking about here.

>   * It looks like external agents can perform out-of-kernel
> frequency scaling on x86 (at least I see options for it on IBM
> blades). So there must already exist some mechanism for a kernel
> to be informed that the TSC frequency has been changed.

If the TSC frequency can be changed, I don't use tsc as a ultimate
resort of timing at all, but rather, just for micro-adjustments.

> > 3) a PV clock can support stolen time calculation which there really
> > isn't a concept of with emulation.
>
> This is true, and I know other platforms support this functionality. I
> think it's mostly useful for process time accounting. Is that actually
> supported in this patch?
>
> --
> Hollis Blanchard
> IBM Linux Technology Center
>
>


-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
On 10/12/07, Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
> > * Measure the time it takes for a hypercall, and subtract this time
> > for calculating the expiry time for the timer event.
> >
>
> I don't think there's much point in trying to do stuff like this.  The
> guest can be preempted at any time, so there's an arbitrary amount of
> time between deciding to set a timeout, and the time the timeout
> actually happens.

Yes, we can't be precise. But my point here is just being _more_
accurate whenever possible. Yes, it might have been preempted, we
don't know how much time it spent out, but at least this time has
elapsed. It's a win in some scenarios, without too much overhead but
an initial measure and a summation.

Even with tsc adjustments to wallclock, and clock read, I'm using it
regardless of the host having accurate tsc. My main point behind it is
that it will, accurate or not, be _more_ accurate than the simple
version.

Of course, we can discuss if it will really make a difference.

> > + * If the platform provides a stable tsc, we just use it, and there is no 
> > need
> > + * for the host to update anything.
>
>
> How would you deal with suspend/resume/migrate?  Also, do you assume
> that stable_tsc also means synchronized tsc on an SMP host?

I have to think further on that.

But for all cases , I think that the host knows what happened, and can
adjust the timer accordingly. About sync tsc, this code shouldn't be
making this assumption, but right now... it is. I'll think further.

> So this returns a tsc here? >
[ .. snip .. ]
> ---> But returns ns here?
Yes. But if you look below, (unless I've really forgot to include in
the patch I fired out), I change the multiplier in case we're using a
tsc value. We multiply by one if using nanosecond timer, and by a host
provided multiplier if tsc timer.

>
> > + }
> > +
> > + do {
> > + last_tsc = hv_clock.last_tsc;
> > + rmb();
> > + now = &hv_clock.now;
> >
>
> Shouldn't this be taking a copy of now, rather than a pointer to it?
> Otherwise what's the point of this loop?

Yes, thanks.

> > + rmb();
> > + } while (hv_clock.last_tsc != last_tsc);
> >
>
> This won't be an atomic compare on 32-bit; it could get confused by
> seeing a half-updated tsc value.

You are right, it might be better using a version number like xen does, then.
humm... Rusty's approach of checking against seconds, and then using
nanoseconds only if it has not yet changed may work... but maybe not
as well, due to the tsc adjustments. Right now, I'm more inclined to
the first option.

> >
>
> Not worthwhile.  It would be better to make the hypercall take an
> absolute time, and pass it now+delta.  At least then if you get
> preempted past the timeout period you can return -ETIME, and the clock
> subsystem will know what to do.

Since the host and guest times are synchronized by assumption, it
might work. I'll give it a try.

> > +#ifdef CONFIG_KVM_CLOCK
> > + kvmclock_init();
> > +#endif
> >
>
> Why is this necessary?  Can't you hook one of the existing pvops?

Well, as you all have seen, I've just arrived from weekend, just took
an overlook over all the answers, but I think anthony already answered
this one: Someone has to override the pvops.

However, I think I can do better on this, doing what I did for vsmp on
x86_64, by using ARCH_INIT to check for availability. To be honest, I
considered both approaches, but saw no clear benefit in any, against
the other, and did this way just because vmi is already there.

But yeah, it seems to have some special requirements for placement,
that we do not. If you feel more confortable with it, I can move it to
ARCH_INIT.

>
> J
>


-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Avi Kivity
Glauber de Oliveira Costa wrote:
>>
>> It might be better to just rely on the in-kernel APIC to inject an
>> interrupt for the clock (via kvm_pic_set_irq()).
>>
>> 
>
>
> After trying this option a little bit, it does not seemed worthwhile
> to me. But it might well have been due to some misunderstanding from
> my part, and I can take a look again. Basically, we can't do it in the
> timer function because it is protected by a mutex. the hrtimer
> function runs in interrupt context, so we'd have to flag and signal
> anyway.
>
> "Interrupts for the clock" needs special irq lines anyway. (Well, VMI
> masks it and does a whole lot of work to use the apic, but I prefer
> the xen way, in which irq lines are triggered directly), so if I
> remember the code correctly, we'd have to put some code on for those
> lines anyway.
>
> I may be a little paranoid here, but as it does not seem we're going
> to be able to use all code structure as is, at least this way we avoid
> the apic code path in favour of a simpler one. But maybe it's really
> not worth it
>
> I'd see an direct advantage for that if we are going to use the local
> apic for the delivery, but then it complicates the kernel side, as it
> already has an apic clockevents registered, with a interrupt handler
> on its own.
>
>   

If we want to make the clocksource useful for Windows we need to go 
through the apic as that will allow the TPR to mask the interrupt when 
Windows isn't ready to receive it.  However I don't know whether it is 
possible to integrate a paravirt clocksource into Windows.

For Linux, we might as well go further and provide a completely 
paravirtualized irqchip instead of trying to integrate a paravirtualized 
clocksource into a fullyvirtualized irqchip.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Glauber de Oliveira Costa
> > +
> > +void __init kvmclock_init(void)
> > +{
> > +
> > + unsigned long shared_page = (unsigned long)&hv_clock;
> > + /*
> > +  * If we can't use the paravirt clock, just go with
> > +  * the usual timekeeping
> > +  */
> > + if (!kvm_para_available() || no_kvmclock)
> > + return;
> >
>
> You should also check kvm_para_has_feature() and define a feature flag
> for the clock.

Ok.

> > +static int kvm_get_pvclock_interrupt(struct kvm_vcpu *v)
> > +{
> > + int ret = v->timer_vector;
> > + v->timer_vector = -1;
> > + return  ret;
> > +}
> >  /*
> >   * Read pending interrupt vector and intack.
> >   */
> > @@ -51,7 +59,9 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
> >   struct kvm_pic *s;
> >   int vector;
> >
> > - vector = kvm_get_apic_interrupt(v); /* APIC */
> > + vector = kvm_get_pvclock_interrupt(v);
> > + if (vector == -1)
> > + vector = kvm_get_apic_interrupt(v); /* APIC */
> >
>
> It might be better to just rely on the in-kernel APIC to inject an
> interrupt for the clock (via kvm_pic_set_irq()).
>


After trying this option a little bit, it does not seemed worthwhile
to me. But it might well have been due to some misunderstanding from
my part, and I can take a look again. Basically, we can't do it in the
timer function because it is protected by a mutex. the hrtimer
function runs in interrupt context, so we'd have to flag and signal
anyway.

"Interrupts for the clock" needs special irq lines anyway. (Well, VMI
masks it and does a whole lot of work to use the apic, but I prefer
the xen way, in which irq lines are triggered directly), so if I
remember the code correctly, we'd have to put some code on for those
lines anyway.

I may be a little paranoid here, but as it does not seem we're going
to be able to use all code structure as is, at least this way we avoid
the apic code path in favour of a simpler one. But maybe it's really
not worth it

I'd see an direct advantage for that if we are going to use the local
apic for the delivery, but then it complicates the kernel side, as it
already has an apic clockevents registered, with a interrupt handler
on its own.

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Avi Kivity
Hollis Blanchard wrote:
>>
>> I'd think there is no reason for virtual timer to be PIT like, which
>> is mostly due to historic reason.
>>
>> If we need to make it close to native timer device, HPET is much
>> better than PIT for virtual time to look like.
>> 
>
> Fine, so why do we need PV time when we can emulate the HPET?
>   

A pv timebase can require zero exits while any emulated timebase will 
require at least one.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Jeremy Fitzhardinge
Carsten Otte wrote:
> On s390, we've had an instruction...

Yes, quite ;)

J

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Hollis Blanchard
On Mon, 2007-10-15 at 12:04 +0800, Dong, Eddie wrote:
> >>> 1) the PIT is periodic.  a PV timer can offer a one shot 
> >timer which 
> >>> enables dynticks.
> >>> 
> >>
> >> Obviously people have figured out how to do dynticks on real x86
> >> hardware, so I don't accept this reason. :)
> >>   
> >
> >Using more advanced timers like the HPET.
> >
> 
> I'd think there is no reason for virtual timer to be PIT like, which
> is mostly due to historic reason.
> 
> If we need to make it close to native timer device, HPET is much
> better than PIT for virtual time to look like.

Fine, so why do we need PV time when we can emulate the HPET?

-- 
Hollis Blanchard
IBM Linux Technology Center


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Carsten Otte
Gerd Hoffmann wrote:
> With VT you can attempt to make that invisible to the guest using the
> tsc offset field.  Probably svm can do that too (didn't check docs
> though).  kvm-lite can't (what is the status btw?).  Xen "solves" that
> by not doing power management *evil grin*.
On s390, we've had an instruction for that ever since (store clock), 
that works with an offset field with hardware support. Works great for us.
If VT and SVM can do the same, I'd vote for using it if we can.

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Gildas
2007/10/15, Gerd Hoffmann <[EMAIL PROTECTED]>:
> Avi Kivity wrote:
> > Gerd Hoffmann wrote:
> >>
> >>> 2) the TSC would have to be used as a clocksource.  You don't know the
> >>> frequency which is the first problem with using the TSC but some systems
> >>> have a TSC that changes frequencies.
> >>>
> >> Also note the tsc may stop ticking if the CPU goes sleep in C3, which
> >> IMHO makes the tsc almost useless as clocksource for guests ...
> >
> > But the host knows that, right?  So it can update the guest's timebase?
>
> Host should know.  Well, I hope.  Dunno whenever one really can be sure
> in all cases given all the different CPUs and tsc implementations.
>
> With VT you can attempt to make that invisible to the guest using the
> tsc offset field.  Probably svm can do that too (didn't check docs
> though).  kvm-lite can't (what is the status btw?).  Xen "solves" that
> by not doing power management *evil grin*.
>
> Nevertheless it is probably much easier to go with pv timers (or maybe
> emulate hpet timers).
>
> cheers,
>   Gerd

Hell I don't know what is the best technical way to solve this
problem, but as a sysadmin, I'm really "annoyed" when the time starts
to drift madly on the servers as soon as the host is loaded (talking
about esx servers there but I guess this will probably apply to any
existing x86 virtualisation solution).

A simple, reliable solution to get a stable time source is not a must
IMHO, it is a usability requirement.

Cheers,
Gildas

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Gerd Hoffmann
Avi Kivity wrote:
> Gerd Hoffmann wrote:
>>   
>>> 2) the TSC would have to be used as a clocksource.  You don't know the 
>>> frequency which is the first problem with using the TSC but some systems 
>>> have a TSC that changes frequencies.
>>> 
>> Also note the tsc may stop ticking if the CPU goes sleep in C3, which
>> IMHO makes the tsc almost useless as clocksource for guests ...
> 
> But the host knows that, right?  So it can update the guest's timebase?

Host should know.  Well, I hope.  Dunno whenever one really can be sure
in all cases given all the different CPUs and tsc implementations.

With VT you can attempt to make that invisible to the guest using the
tsc offset field.  Probably svm can do that too (didn't check docs
though).  kvm-lite can't (what is the status btw?).  Xen "solves" that
by not doing power management *evil grin*.

Nevertheless it is probably much easier to go with pv timers (or maybe
emulate hpet timers).

cheers,
  Gerd

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Avi Kivity
Gerd Hoffmann wrote:
>   
>> 2) the TSC would have to be used as a clocksource.  You don't know the 
>> frequency which is the first problem with using the TSC but some systems 
>> have a TSC that changes frequencies.
>> 
>
> Also note the tsc may stop ticking if the CPU goes sleep in C3, which
> IMHO makes the tsc almost useless as clocksource for guests ...
>   

But the host knows that, right?  So it can update the guest's timebase?


-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-15 Thread Gerd Hoffmann
  Hi,

> 2) the TSC would have to be used as a clocksource.  You don't know the 
> frequency which is the first problem with using the TSC but some systems 
> have a TSC that changes frequencies.

Also note the tsc may stop ticking if the CPU goes sleep in C3, which
IMHO makes the tsc almost useless as clocksource for guests ...

cheers,
  Gerd


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-14 Thread Dong, Eddie

>>> 1) the PIT is periodic.  a PV timer can offer a one shot 
>timer which 
>>> enables dynticks.
>>> 
>>
>> Obviously people have figured out how to do dynticks on real x86
>> hardware, so I don't accept this reason. :)
>>   
>
>Using more advanced timers like the HPET.
>

I'd think there is no reason for virtual timer to be PIT like, which
is mostly due to historic reason.

If we need to make it close to native timer device, HPET is much
better than PIT for virtual time to look like.

thx,eddie

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Hollis Blanchard
On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:
> +
> +/* The hypervisor will put information about time periodically here */
> +struct kvm_hv_clock hv_clock;

...

> +void __init kvmclock_init(void)
> +{
> +
> +   unsigned long shared_page = (unsigned long)&hv_clock;

What makes you think this is page-aligned?

> +   /*
> +* If we can't use the paravirt clock, just go with
> +* the usual timekeeping
> +*/
> +   if (!kvm_para_available() || no_kvmclock)
> +   return;
> +
> +   if (kvm_hypercall1(KVM_HCALL_SET_SHARED_PAGE, shared_page))
> +   return;

Here you're passing a virtual address across the guest/host interface,
which is something we've previously agreed is bad.

> @@ -1406,6 +1408,7 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
>  }
>  EXPORT_SYMBOL_GPL(kvm_emulate_halt);
>  
> +static struct kvm_hv_clock hv_clock;
>  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  {
> unsigned long nr, a0, a1, a2, a3, ret;
> @@ -1426,7 +1429,32 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> a3 &= 0x;
> }
>  
> +   ret = 0;
> switch (nr) {
> +   case  KVM_HCALL_SET_SHARED_PAGE:
> +   if (!irqchip_in_kernel(vcpu->kvm)) {
> +   ret = -1;
> +   break;
> +   }
> +
> +   vcpu->kvm->clock_addr = a0;
> +   getnstimeofday(&hv_clock.wc);
> +   hv_clock.stable_tsc = check_tsc_unstable();
> +   hv_clock.tsc_mult = clocksource_khz2mult(tsc_khz, 22);
> +   rdtscll(hv_clock.last_tsc);
> +   emulator_write_emulated(vcpu->kvm->clock_addr, &hv_clock,
> +   sizeof(hv_clock), vcpu);

Are you intending this shared page to contain other data in the future,
or is it just poorly named?

> +struct kvm_hv_clock {
> +   int stable_tsc; /* use raw tsc for clock_read */
> +   unsigned long tsc_mult;
> +   struct timespec now;
> +   u64 last_tsc;
> +   /* That's the wall clock, not the water closet */
> +   struct timespec wc;
> +};

This isn't even good for x86: you're using "long" in a binary interface!
You're also using timespec, and what sort of binary compatibility
guarantees would you like to make about that?

-- 
Hollis Blanchard
IBM Linux Technology Center


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Anthony Liguori
Hollis Blanchard wrote:
> On Fri, 2007-10-12 at 15:02 -0500, Anthony Liguori wrote:
>   
>> Hollis Blanchard wrote:
>> 
>>> On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:
>>>   
>>>   
 +config KVM_CLOCK
 +   bool "KVM paravirtualized clock"
 +   depends on PARAVIRT && GENERIC_CLOCKEVENTS
 +   help
 + Turning on this option will allow you to run a paravirtualized 
 clock
 + when running over the KVM hypervisor. Instead of relying on a PIT
 + (or probably other) emulation by the underlying device model, 
 the host
 + provides the guest with timing infrastructure, as time of day, 
 and
 + timer expiration.
 
 
>>> I must have missed earlier discussion on this topic, so I'm left
>>> wondering... what's the point? What's wrong with PIT (et al) emulation?
>>>   
>>>   
>> There are three separate reasons, that I know of, to have a PV timer.
>>
>> 1) the PIT is periodic.  a PV timer can offer a one shot timer which 
>> enables dynticks.
>> 
>
> Obviously people have figured out how to do dynticks on real x86
> hardware, so I don't accept this reason. :)
>   

Using more advanced timers like the HPET.

>> 2) the TSC would have to be used as a clocksource.  You don't know the 
>> frequency which is the first problem with using the TSC but some systems 
>> have a TSC that changes frequencies.  A PV time source gives you more 
>> stable clocksource (although as in glommer's patch, when the TSC can be 
>> used, it's better to use it).
>> 
>
> As I understand it, the TSC is based on CPU frequency, which changes
> with power management. Architectural bug.
>
> However, PV time still doesn't help here:
>   * The TSC is _user_ accessible, so PV time support in the guest
> kernel doesn't solve the problem.
>   * It looks like external agents can perform out-of-kernel
> frequency scaling on x86 (at least I see options for it on IBM
> blades). So there must already exist some mechanism for a kernel
> to be informed that the TSC frequency has been changed.
>   

I don't know if that is scaled transparently to the host OS or just at 
boot time.  Keep in mind too, modern Intel processors have fixed 
frequency TSCs so it's possible that that's only an option for those 
processors.

Regards,

Anthony Liguori

>> 3) a PV clock can support stolen time calculation which there really 
>> isn't a concept of with emulation.
>> 
>
> This is true, and I know other platforms support this functionality. I
> think it's mostly useful for process time accounting. Is that actually
> supported in this patch?
>
>   


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Hollis Blanchard
On Fri, 2007-10-12 at 15:02 -0500, Anthony Liguori wrote:
> Hollis Blanchard wrote:
> > On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:
> >   
> >> +config KVM_CLOCK
> >> +   bool "KVM paravirtualized clock"
> >> +   depends on PARAVIRT && GENERIC_CLOCKEVENTS
> >> +   help
> >> + Turning on this option will allow you to run a paravirtualized 
> >> clock
> >> + when running over the KVM hypervisor. Instead of relying on a PIT
> >> + (or probably other) emulation by the underlying device model, 
> >> the host
> >> + provides the guest with timing infrastructure, as time of day, 
> >> and
> >> + timer expiration.
> >> 
> >
> > I must have missed earlier discussion on this topic, so I'm left
> > wondering... what's the point? What's wrong with PIT (et al) emulation?
> >   
> 
> There are three separate reasons, that I know of, to have a PV timer.
> 
> 1) the PIT is periodic.  a PV timer can offer a one shot timer which 
> enables dynticks.

Obviously people have figured out how to do dynticks on real x86
hardware, so I don't accept this reason. :)

> 2) the TSC would have to be used as a clocksource.  You don't know the 
> frequency which is the first problem with using the TSC but some systems 
> have a TSC that changes frequencies.  A PV time source gives you more 
> stable clocksource (although as in glommer's patch, when the TSC can be 
> used, it's better to use it).

As I understand it, the TSC is based on CPU frequency, which changes
with power management. Architectural bug.

However, PV time still doesn't help here:
  * The TSC is _user_ accessible, so PV time support in the guest
kernel doesn't solve the problem.
  * It looks like external agents can perform out-of-kernel
frequency scaling on x86 (at least I see options for it on IBM
blades). So there must already exist some mechanism for a kernel
to be informed that the TSC frequency has been changed.

> 3) a PV clock can support stolen time calculation which there really 
> isn't a concept of with emulation.

This is true, and I know other platforms support this functionality. I
think it's mostly useful for process time accounting. Is that actually
supported in this patch?

-- 
Hollis Blanchard
IBM Linux Technology Center


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Anthony Liguori
Hollis Blanchard wrote:
> On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:
>   
>> +config KVM_CLOCK
>> +   bool "KVM paravirtualized clock"
>> +   depends on PARAVIRT && GENERIC_CLOCKEVENTS
>> +   help
>> + Turning on this option will allow you to run a paravirtualized 
>> clock
>> + when running over the KVM hypervisor. Instead of relying on a PIT
>> + (or probably other) emulation by the underlying device model, the 
>> host
>> + provides the guest with timing infrastructure, as time of day, and
>> + timer expiration.
>> 
>
> I must have missed earlier discussion on this topic, so I'm left
> wondering... what's the point? What's wrong with PIT (et al) emulation?
>   

There are three separate reasons, that I know of, to have a PV timer.

1) the PIT is periodic.  a PV timer can offer a one shot timer which 
enables dynticks.

2) the TSC would have to be used as a clocksource.  You don't know the 
frequency which is the first problem with using the TSC but some systems 
have a TSC that changes frequencies.  A PV time source gives you more 
stable clocksource (although as in glommer's patch, when the TSC can be 
used, it's better to use it).

3) a PV clock can support stolen time calculation which there really 
isn't a concept of with emulation.

Regards,

Anthony Liguori

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Hollis Blanchard
On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:
> +config KVM_CLOCK
> +   bool "KVM paravirtualized clock"
> +   depends on PARAVIRT && GENERIC_CLOCKEVENTS
> +   help
> + Turning on this option will allow you to run a paravirtualized clock
> + when running over the KVM hypervisor. Instead of relying on a PIT
> + (or probably other) emulation by the underlying device model, the 
> host
> + provides the guest with timing infrastructure, as time of day, and
> + timer expiration.

I must have missed earlier discussion on this topic, so I'm left
wondering... what's the point? What's wrong with PIT (et al) emulation?

-- 
Hollis Blanchard
IBM Linux Technology Center


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Anthony Liguori
Jeremy Fitzhardinge wrote:
> Glauber de Oliveira Costa wrote:
>   
>> My next TODOs with it are:
>> * Get SMP working
>> * Try something for stolen time, as jeremy's last suggestion for anthony's 
>> patch
>> * Measure the time it takes for a hypercall, and subtract this time
>> for calculating the expiry time for the timer event.
>>   
>> 
>
>   
>> diff --git a/arch/i386/kernel/setup.c b/arch/i386/kernel/setup.c
>> index d474cd6..fd758f9 100644
>> --- a/arch/i386/kernel/setup.c
>> +++ b/arch/i386/kernel/setup.c
>> @@ -46,6 +46,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include 
>>  
>> @@ -579,6 +580,9 @@ void __init setup_arch(char **cmdline_p)
>>  vmi_init();
>>  #endif
>>  
>> +#ifdef CONFIG_KVM_CLOCK
>> +kvmclock_init();
>> +#endif
>>   
>> 
>
> Why is this necessary?  Can't you hook one of the existing pvops?
>   

Something needs to install the pvops in the first place.  Instead of 
adding a kvmclock_init(), it would probably be better to add a single 
kvm_para_init() (as I did in my previous series) and initialize the 
pv_ops for the clock there.

Regards,

Anthony Liguori

> J
>
>   


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Jeremy Fitzhardinge
Glauber de Oliveira Costa wrote:
> My next TODOs with it are:
> * Get SMP working
> * Try something for stolen time, as jeremy's last suggestion for anthony's 
> patch
> * Measure the time it takes for a hypercall, and subtract this time
> for calculating the expiry time for the timer event.
>   

I don't think there's much point in trying to do stuff like this.  The
guest can be preempted at any time, so there's an arbitrary amount of
time between deciding to set a timeout, and the time the timeout
actually happens.

In theory you can mitigate this by using an absolute rather than
relative timeout value, but in practice I don't think it makes much
difference.

> +
> +/*
> + * This is our read_clock function. The host puts an tsc timestamp each time
> + * it updates a new time, and then we can use it to derive a slightly more
> + * precise notion of elapsed time, converted to nanoseconds.
> + *
> + * If the platform provides a stable tsc, we just use it, and there is no 
> need
> + * for the host to update anything.


How would you deal with suspend/resume/migrate?  Also, do you assume
that stable_tsc also means synchronized tsc on an SMP host?

> + */
> +static cycle_t kvm_clock_read(void) {
> +
> + u64 delta, last_tsc;
> + struct timespec *now;
> +
> + if (hv_clock.stable_tsc) {
> + rdtscll(last_tsc);
> + return last_tsc;
>   

So this returns a tsc here? >

> + }
> +
> + do {
> + last_tsc = hv_clock.last_tsc;
> + rmb();
> + now = &hv_clock.now;
>   

Shouldn't this be taking a copy of now, rather than a pointer to it? 
Otherwise what's the point of this loop?

> + rmb();
> + } while (hv_clock.last_tsc != last_tsc);
>   

This won't be an atomic compare on 32-bit; it could get confused by
seeing a half-updated tsc value.

> +
> + delta = native_read_tsc() - last_tsc;
> + delta = (delta * hv_clock.tsc_mult) >> KVM_SCALE;
> +
> + return (cycle_t)now->tv_sec * NSEC_PER_SEC + now->tv_nsec + delta;
>   

---> But returns ns here?

> +}
> +
> +static void kvm_timer_set_mode(enum clock_event_mode mode,
> + struct clock_event_device *evt)
> +{
> + WARN_ON(!irqs_disabled());
> +
> + switch (mode) {
> + case CLOCK_EVT_MODE_ONESHOT:
> + /* this is what we want */
> + break;
> + case CLOCK_EVT_MODE_RESUME:
> + break;
> + case CLOCK_EVT_MODE_PERIODIC:
> + WARN_ON(1);
> + break;
> + case CLOCK_EVT_MODE_UNUSED:
> + case CLOCK_EVT_MODE_SHUTDOWN:
> + kvm_hypercall0(KVM_HCALL_STOP_ONESHOT);
> + break;
> + default:
> + break;
> + }
> +}
> +
> +/*
> + * Programming the next event is just a matter of asking the host
> + * to generate us an interrupt when the time expires. We pass the
> + * delta on, and hypervisor will do all remaining tricks. For a more
> + * precise timing, we can just subtract the time spent by the hypercall
>   

Not worthwhile.  It would be better to make the hypercall take an
absolute time, and pass it now+delta.  At least then if you get
preempted past the timeout period you can return -ETIME, and the clock
subsystem will know what to do.

> + */
> +static int kvm_timer_next_event(unsigned long delta,
> + struct clock_event_device *evt)
> +{
> + WARN_ON(evt->mode != CLOCK_EVT_MODE_ONESHOT);
> + kvm_hypercall1(KVM_HCALL_SET_ALARM, delta);
> + return 0;
> +}
> +
> diff --git a/arch/i386/kernel/setup.c b/arch/i386/kernel/setup.c
> index d474cd6..fd758f9 100644
> --- a/arch/i386/kernel/setup.c
> +++ b/arch/i386/kernel/setup.c
> @@ -46,6 +46,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -579,6 +580,9 @@ void __init setup_arch(char **cmdline_p)
>   vmi_init();
>  #endif
>  
> +#ifdef CONFIG_KVM_CLOCK
> + kvmclock_init();
> +#endif
>   

Why is this necessary?  Can't you hook one of the existing pvops?


J

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Anthony Liguori
Glauber de Oliveira Costa wrote:
> Hi,
>
> Attached is a first draft to a paravirt implementation for a timer to
> KVM. It is inspired in anthony's last patch about it, but not that
> much based on it.
>
> I'm not using hypercalls to get the current time, but rather,
> registering an address that will get timer updates once in a while.
>
> Also, it includes a clockevent oneshot implementation (which is the
> very thing of this patch), that will allow us interest things like
> dynticks.
>
> It's still not yet working on SMP, and I'm currently not sure why (ok,
> ok, if you actually read the patch, it will become obvious the why: it
> only delivers interrupt for vector 0x20, but I'm further with it, this
> patch is just a snapshot)
>
> My next TODOs with it are:
> * Get SMP working
> * Try something for stolen time, as jeremy's last suggestion for anthony's 
> patch
> * Measure the time it takes for a hypercall, and subtract this time
> for calculating the expiry time for the timer event.
> * Testing and fixing bugs: I'm sure they exist!
>
> Meanwhile, all your suggestions are welcome.
>
>   



> +
> +void __init kvmclock_init(void)
> +{
> +
> + unsigned long shared_page = (unsigned long)&hv_clock;
> + /*
> +  * If we can't use the paravirt clock, just go with
> +  * the usual timekeeping
> +  */
> + if (!kvm_para_available() || no_kvmclock)
> + return;
>   

You should also check kvm_para_has_feature() and define a feature flag 
for the clock.

> + if (kvm_hypercall1(KVM_HCALL_SET_SHARED_PAGE, shared_page))
> + return;
> +
> + paravirt_ops.get_wallclock = kvm_get_wallclock;
> + paravirt_ops.set_wallclock = kvm_set_wallclock;
> + paravirt_ops.sched_clock = kvm_sched_clock;
> + paravirt_ops.time_init = kvm_time_init;
> + /*
> +  * If we let the normal APIC initialization code run, they will
> +  * override our event handler, relying that the APIC will deliver
> +  * the interrupts in the LOCAL_TIMER_VECTOR. The easy solution is
> +  * keep the PIT running until then
> +  */
> + paravirt_ops.setup_boot_clock = kvm_disable_pit;
> +}


> diff --git a/drivers/kvm/irq.c b/drivers/kvm/irq.c
> index 0f663fe..7baf798 100644
> --- a/drivers/kvm/irq.c
> +++ b/drivers/kvm/irq.c
> @@ -32,6 +32,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>  {
>   struct kvm_pic *s;
>  
> + if (v->timer_vector != -1)
> + return 1;
>   if (kvm_apic_has_interrupt(v) == -1) {  /* LAPIC */
>   if (kvm_apic_accept_pic_intr(v)) {
>   s = pic_irqchip(v->kvm);/* PIC */
> @@ -43,6 +45,12 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>  }
>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
>  
> +static int kvm_get_pvclock_interrupt(struct kvm_vcpu *v)
> +{
> + int ret = v->timer_vector;
> + v->timer_vector = -1;
> + return  ret;
> +}
>  /*
>   * Read pending interrupt vector and intack.
>   */
> @@ -51,7 +59,9 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>   struct kvm_pic *s;
>   int vector;
>  
> - vector = kvm_get_apic_interrupt(v); /* APIC */
> + vector = kvm_get_pvclock_interrupt(v);
> + if (vector == -1)
> + vector = kvm_get_apic_interrupt(v); /* APIC */
>   

It might be better to just rely on the in-kernel APIC to inject an 
interrupt for the clock (via kvm_pic_set_irq()).

Regards,

Anthony LIguori

>   if (vector == -1) {
>   if (kvm_apic_accept_pic_intr(v)) {
>   s = pic_irqchip(v->kvm)

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel