Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2012-02-21 Thread Peter Lieven

On 15.12.2010 12:53, Ulrich Obergfell wrote:

[...]


I am currently analyzing a performance regression together with Gleb where a
Windows 7 / Win2008R2 VM hammers the pmtimer approx. 15000 times/s during
I/O. Performance is therefore very bad and the CPU is at 100%.

Has anyone done any further work on the in-kernel PM Timer, or a full
implementation?

Would it be possible to rebase this old experimental patch to see if it
helps with the performance regression we came across?

Thank you,
Peter


Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-15 Thread Avi Kivity

On 12/14/2010 06:04 PM, Anthony Liguori wrote:

On 12/14/2010 09:38 AM, Avi Kivity wrote:
Fortunately, we have a very good bytecode interpreter that's 
accelerated in the kernel called KVM ;-)


We have exactly the same bytecode interpreter under a different name, 
it's called userspace.


If you can afford to make the transition back to the guest for 
emulation, you might as well transition to userspace.


If you re-entered the guest and set up a stack that had the RIP of the 
source of the exit, then there's no additional need to exit the 
guest.  The handler can just do an iret.  Or am I missing something?


I didn't even consider an iret-to-guest, to be honest.  Let's consider 
the options:


 - iret-to-guest (a la tpr patching) - need to have an executable page 
in the guest virtual address space and some stack space (on 64-bit, can 
rely on iretq switching the stack).  That is probably impossible to do 
in a generic way without guest cooperation.  If we rely on guest 
cooperation, we might as well have the guest patch the IN instruction 
itself (no exits at all).


- architectural SMM - no need to find a virtual mapping, or even a 
physical page, since we're in our own physical address space.  However, 
the RSM instruction will trap, and on Intel, at least the first few 
instructions need to be emulated since SMM starts in big real mode.  
Also needs a tlb flush.


- kvm-specific SMM (probably what you referred to as paravirt SMM, but 
if the guest OS is not involved, it's not really paravirt) - can switch 
to our own cr3 so no problem with finding a virtual mapping; however 
still needs a tlb flush, and on pre-NPT/EPT machines, switching cr3 back 
will involve an exit.




We already have a virtual address space that works for most guests 
thanks to the TPR optimization.


It only works for Windows XP and Windows XP with the /3GB extension.


Is this a fundamental limitation or just a statement of today's 
heuristics?  Does any guest not keep the BIOS in virtual memory in a 
static location?


If you're looking for a fundamental limitation, then yes, a guest need 
not map the BIOS at all.  Practically, I believe all common guests do map 
the BIOS, but IIRC modern guests use non-executable mappings.


--
error compiling committee.c: too many arguments to function



Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-15 Thread Ulrich Obergfell

- Anthony Liguori anth...@codemonkey.ws wrote:

 On 12/14/2010 06:09 AM, Ulrich Obergfell wrote:

[...]

  Parts 1 thru 4 of this RFC contain experimental source code which
  I recently used to investigate the performance benefit. In a Linux
  guest, I was running a program that calls gettimeofday() 'n' times
  in a loop (the PM Timer register is read during each call). With
  in-kernel PM Timer, I observed a significant reduction of program
  execution time.
 
 
 I've played with this in the past.  Can you post real numbers, 
 preferably, with a real work load?


Anthony,

I only experimented with a gettimeofday() loop. With this test scenario
I observed that in-kernel PM Timer reduced the program execution time to
roughly half of the execution time that it takes with userspace PM Timer.
Please find some example results below (these results were obtained while
the host was not busy). The relative difference of in-kernel PM Timer
versus userspace PM Timer is high, whereas the absolute difference per
call appears to be low. So, the benefit depends very much on how frequently
gettimeofday() is called in a real work load. I don't have any numbers
from a real work load. When I began working on this, I was motivated by
the fact that the Linux kernel itself provides an optimization for the
gettimeofday() call ('vxtime'). So, from this I presumed that there
would be real work loads which would benefit from the optimization of
the gettimeofday() call (otherwise, why would we have 'vxtime'?).
Of course, 'vxtime' is not related to PM based time keeping. However,
the experimental code shows an approach to optimize gettimeofday() in
KVM virtual machines.


Regards,

Uli


- host:

# grep 'model name' /proc/cpuinfo | sort | uniq -c
  8 model name : Intel(R) Core(TM) i7 CPU   Q 740  @ 1.73GHz

# uname -r
2.6.37-rc4


- guest:

# grep 'model name' /proc/cpuinfo | sort | uniq -c
  4 model name : QEMU Virtual CPU version 0.13.50


- test program ('gtod.c'):

#include <sys/time.h>
#include <stdlib.h>

struct timeval tv;

int main(int argc, char *argv[])
{
	int i = atoi(argv[1]);

	while (i-- > 0)
		gettimeofday(&tv, NULL);

	return 0;
}
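
- building and running it (the gcc invocation is an assumption, not part
  of the original posting):

# gcc -o gtod gtod.c
# ./gtod 25000000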


- example results with in-kernel PM Timer:

# for i in 1 2 3
> do
> time ./gtod 25000000
> done

real	0m44.302s
user	0m1.090s
sys	0m43.163s

real	0m44.509s
user	0m1.100s
sys	0m43.393s

real	0m45.290s
user	0m1.160s
sys	0m44.123s

# for i in 10000000 50000000 100000000
> do
> time ./gtod $i
> done

real	0m17.981s
user	0m0.810s
sys	0m17.157s

real	1m27.253s
user	0m1.930s
sys	1m25.307s

real	2m51.801s
user	0m3.359s
sys	2m48.384s


- example results with userspace PM Timer:

# for i in 1 2 3
> do
> time ./gtod 25000000
> done

real	1m24.185s
user	0m2.000s
sys	1m22.168s

real	1m23.508s
user	0m1.750s
sys	1m21.738s

real	1m24.437s
user	0m1.900s
sys	1m22.517s

# for i in 10000000 50000000 100000000
> do
> time ./gtod $i
> done

real	0m33.479s
user	0m0.680s
sys	0m32.785s

real	2m50.831s
user	0m3.389s
sys	2m47.405s

real	5m42.304s
user	0m7.319s
sys	5m34.919s


[RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Ulrich Obergfell

Hi,

This is an RFC through which I would like to get feedback on how the
idea of in-kernel PM Timer would be received.

The current implementation of PM Timer emulation is 'heavy-weight'
because the code resides in qemu userspace. Guest operating systems
that use PM Timer as a clock source (for example, older versions of
Linux that do not have paravirtualized clock) would benefit from an
in-kernel PM Timer emulation.

Parts 1 thru 4 of this RFC contain experimental source code which
I recently used to investigate the performance benefit. In a Linux
guest, I was running a program that calls gettimeofday() 'n' times
in a loop (the PM Timer register is read during each call). With
in-kernel PM Timer, I observed a significant reduction of program
execution time.

The experimental code emulates the PM Timer register in the KVM kernel.
All other components of ACPI PM remain in qemu userspace. Also, the
'timer carry interrupt' feature is not implemented in-kernel. If a
guest operating system needs to enable the 'timer carry interrupt',
the code takes care that PM Timer emulation falls back to userspace.
However, I think the design of the code has sufficient flexibility,
so that anyone who would want to add the 'timer carry interrupt'
feature in-kernel could try to do so later on.
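
For context: the ACPI PM Timer is a free-running counter that increments
at a fixed 3.579545 MHz and is sampled by the guest with a single port
read (commonly port 0xb008 with the PIIX4 ACPI model), so the in-kernel
read path amounts to little more than the sketch below. The identifiers
(kvm_pmtmr, pmtmr_ioport_read) are illustrative, not the names used in
these patches:

#define PMTMR_FREQ_HZ	3579545ULL	/* fixed rate per the ACPI spec */
#define NSEC_PER_SEC	1000000000ULL

/* illustrative device state; the real layout in the patches may differ */
struct kvm_pmtmr {
	u64  base_ns;		/* guest time at which the counter read zero */
	bool in_kernel_active;	/* false: accesses trap out to qemu userspace */
};

/*
 * Handle an IN from the PM Timer port without exiting to qemu: scale
 * elapsed guest time to 3.579545 MHz ticks and truncate to the 24-bit
 * counter width.  A real implementation must guard the multiplication
 * against 64-bit overflow, e.g. with a muldiv64()-style helper.
 */
static int pmtmr_ioport_read(struct kvm_pmtmr *tmr, u64 now_ns, u32 *val)
{
	u64 ticks = (now_ns - tmr->base_ns) * PMTMR_FREQ_HZ / NSEC_PER_SEC;

	*val = (u32)(ticks & 0xffffff);	/* 24-bit counter */
	return 0;
}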

Please review and please comment.


Regards,

Uli Obergfell


Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Avi Kivity

On 12/14/2010 02:09 PM, Ulrich Obergfell wrote:

[...]



What is the motivation for this?  Are there any important guests that 
use the pmtimer?


If anything I'd expect hpet or the Microsoft synthetic timers to be a 
lot more important.


--
error compiling committee.c: too many arguments to function



Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Glauber Costa
On Tue, 2010-12-14 at 15:34 +0200, Avi Kivity wrote:
 On 12/14/2010 02:09 PM, Ulrich Obergfell wrote:
  [...]

 What is the motivation for this?  Are there any important guests that 
 use the pmtimer?
Avi,

All older RHEL and Windows, for example, would benefit from this.

 If anything I'd expect hpet or the Microsoft synthetic timers to be a 
 lot more important.

True. But also a lot more work.
Implementing just the PM Timer counter - not the whole of it - in the
kernel gives us a lot of gain for not very much effort. The patch is
pretty simple, as you can see, and most of it is even code to turn it
on/off, etc.





Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Avi Kivity

On 12/14/2010 03:40 PM, Glauber Costa wrote:


  What is the motivation for this?  Are there any important guests that
  use the pmtimer?
Avi,

All older RHEL and Windows, for example, would benefit from this.


They only benefit from it because we don't provide HPET.  If we did, the 
guests would use HPET in preference to pmtimer, since HPET is so much 
better than pmtimer (yet still sucks in an absolute sense).



  If anything I'd expect hpet or the Microsoft synthetic timers to be a
  lot more important.

True. But also a lot more work.
Implementing just the PM Timer counter - not the whole of it - in the
kernel gives us a lot of gain for not very much effort. The patch is
pretty simple, as you can see, and most of it is even code to turn it
on/off, etc.



Partial emulation is not something I like since it causes a fuzzy 
kernel/user boundary.  In this case, transitioning to userspace when 
interrupts are enabled doesn't look so hot.  Are you sure all guests 
that benefit from this don't enable the pmtimer interrupt?  What about 
the transition?  Will we have a time discontinuity when that happens?


What I'd really like to see is this stuff implemented in bytecode, 
unfortunately that's a lot of work which will be very hard to upstream.


--
error compiling committee.c: too many arguments to function



Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Gleb Natapov
On Tue, Dec 14, 2010 at 03:49:37PM +0200, Avi Kivity wrote:
 [...]

 What I'd really like to see is this stuff implemented in bytecode,
 unfortunately that's a lot of work which will be very hard to
 upstream.

<joke>
Just use ACPI bytecode. It is upstream already.
</joke>

--
Gleb.


Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Ulrich Obergfell

- Avi Kivity a...@redhat.com wrote:

 [...]

 Partial emulation is not something I like since it causes a fuzzy 
 kernel/user boundary.  In this case, transitioning to userspace when 
 interrupts are enabled doesn't look so hot.  Are you sure all guests 
 that benefit from this don't enable the pmtimer interrupt?  What about
 the transition?  Will we have a time discontinuity when that happens?

Avi,

the idea is to use the '-kvm-pmtmr' option (in code part 4) only
with guests that do not enable the 'timer carry interrupt'. Guests
that need to enable the 'timer carry interrupt' should rather use
the PM Timer emulation in qemu userspace (i.e. they should not be
started with this option). If a guest is accidentally started with
this option, the in-kernel PM Timer (in code part 1) detects if
the guest attempts to enable the 'timer carry interrupt' and falls
back to PM Timer emulation in qemu userspace (in-kernel PM Timer
disables itself automatically). So, this is not a combination of
in-kernel PM Timer register emulation and qemu userspace PM Timer
interrupt emulation.
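
To make the fallback concrete: the timer carry interrupt is gated by the
TMR_EN bit (bit 0) of the PM1 event enable register, so the in-kernel
device only has to watch guest writes to that register. A minimal sketch,
reusing the illustrative struct kvm_pmtmr from the sketch further up the
thread (function and field names are mine, not taken from the patches):

#define ACPI_PM1_TMR_EN	(1 << 0)	/* timer carry interrupt enable */

/*
 * Called on a guest write to the PM1 enable register.  If the guest
 * turns on the timer carry interrupt, the in-kernel device disables
 * itself, so subsequent PM Timer accesses trap out to the full
 * emulation in qemu userspace.
 */
static void pmtmr_pm1_enable_write(struct kvm_pmtmr *tmr, u16 val)
{
	if (val & ACPI_PM1_TMR_EN)
		tmr->in_kernel_active = false;	/* fall back to userspace */
}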

Regards,

Uli


Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Avi Kivity

On 12/14/2010 04:44 PM, Ulrich Obergfell wrote:


[...]


We really try to avoid guest specific parameters.  Having to decide if 
the guest has virtio is bad enough, but going into low level details 
like that is really bad.  The host admin might not even know what 
operating systems its guests run.


A guest might even dual boot two different operating systems.

--
error compiling committee.c: too many arguments to function



Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Anthony Liguori

On 12/14/2010 06:09 AM, Ulrich Obergfell wrote:

[...]

Parts 1 thru 4 of this RFC contain experimental source code which
I recently used to investigate the performance benefit. In a Linux
guest, I was running a program that calls gettimeofday() 'n' times
in a loop (the PM Timer register is read during each call). With
in-kernel PM Timer, I observed a significant reduction of program
execution time.


I've played with this in the past.  Can you post real numbers, 
preferably, with a real work load?


Regards,

Anthony Liguori




Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Anthony Liguori

On 12/14/2010 07:49 AM, Avi Kivity wrote:

[...]

What I'd really like to see is this stuff implemented in bytecode, 
unfortunately that's a lot of work which will be very hard to upstream.


Fortunately, we have a very good bytecode interpreter that's accelerated 
in the kernel called KVM ;-)


Why not have the equivalent of a paravirtual SMM mode where we can 
reflect IO exits back to the guest in a well defined way?  It could then 
implement PM timer in terms of HPET or something like that.


We already have a virtual address space that works for most guests 
thanks to the TPR optimization.


Regards,

Anthony Liguori




Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Avi Kivity

On 12/14/2010 05:32 PM, Anthony Liguori wrote:


[...]

Fortunately, we have a very good bytecode interpreter that's 
accelerated in the kernel called KVM ;-)


We have exactly the same bytecode interpreter under a different name, 
it's called userspace.


If you can afford to make the transition back to the guest for 
emulation, you might as well transition to userspace.




Why not have the equivalent of a paravirtual SMM mode where we can 
reflect IO exits back to the guest in a well defined way?  It could 
then implement PM timer in terms of HPET or something like that.


More exits.



We already have a virtual address space that works for most guests 
thanks to the TPR optimization.


It only works for Windows XP and Windows XP with the /3GB extension.

--
error compiling committee.c: too many arguments to function



Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Anthony Liguori

On 12/14/2010 09:38 AM, Avi Kivity wrote:
Fortunately, we have a very good bytecode interpreter that's 
accelerated in the kernel called KVM ;-)


We have exactly the same bytecode interpreter under a different name, 
it's called userspace.


If you can afford to make the transition back to the guest for 
emulation, you might as well transition to userspace.


If you re-entered the guest and set up a stack that had the RIP of the 
source of the exit, then there's no additional need to exit the guest.  
The handler can just do an iret.  Or am I missing something?




Why not have the equivalent of a paravirtual SMM mode where we can 
reflect IO exits back to the guest in a well defined way?  It could 
then implement PM timer in terms of HPET or something like that.


More exits.


Yeah, I should have said, implement in terms of kvmclock so no 
additional exits.




We already have a virtual address space that works for most guests 
thanks to the TPR optimization.


It only works for Windows XP and Windows XP with the /3GB extension.


Is this a fundamental limitation or just a statement of today's 
heuristics?  Does any guest not keep the BIOS in virtual memory in a 
static location?


Regards,

Anthony Liguori




Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread David S. Ahern


On 12/14/10 08:29, Anthony Liguori wrote:

 I recently used to investigate the performance benefit. In a Linux
 guest, I was running a program that calls gettimeofday() 'n' times
 in a loop (the PM Timer register is read during each call). With
 in-kernel PM Timer, I observed a significant reduction of program
 execution time.

 
 I've played with this in the past.  Can you post real numbers,
 preferably, with a real work load?

2 years ago I posted relative comparisons of the time sources for older
RHEL guests:
http://www.mail-archive.com/kvm@vger.kernel.org/msg07231.html

What's the relative speed of the in-kernel pmtimer compared to the PIT?

David


Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Anthony Liguori

On 12/14/2010 12:00 PM, David S. Ahern wrote:


On 12/14/10 08:29, Anthony Liguori wrote:

   

I recently used to investigate the performance benefit. In a Linux
guest, I was running a program that calls gettimeofday() 'n' times
in a loop (the PM Timer register is read during each call). With
in-kernel PM Timer, I observed a significant reduction of program
execution time.

   

I've played with this in the past.  Can you post real numbers,
preferably, with a real work load?
 

2 years ago I posted relative comparisons of the time sources for older
RHEL guests:
http://www.mail-archive.com/kvm@vger.kernel.org/msg07231.html
   


Any time you write a program in userspace that effectively equates to a 
single PIO operation that is easy to emulate, it's going to be 
remarkably faster to implement that PIO emulation in the kernel than in 
userspace, because vmexit cost dominates the execution path.


But that doesn't tell you what the impact is in real world workloads.  
Before we start pushing all device emulation into the kernel, we need to 
quantify how often gettimeofday() is really called in real workloads.
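
One way to quantify that without touching the guest (a suggestion of
mine, assuming a host kernel with the kvm tracepoints, not something
from this thread):

# count PIO exits - pmtimer reads land here - over a 10 second window
perf stat -e kvm:kvm_pio -a sleep 10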


Regards,

Anthony Liguori




Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread David S. Ahern


On 12/14/10 12:49, Anthony Liguori wrote:
 But that doesn't tell you what the impact is in real world workloads. 
 Before we start pushing all device emulation into the kernel, we need to
 quantify how often gettimeofday() is really called in real workloads.

The workload that inspired that example program at its current max load
calls gtod upwards of 1000 times per second. The overhead of
gettimeofday was the biggest factor when comparing performance to bare
metal and ESX. That's why I wrote the test program --- it boils a complex
product/program down to a single system call.

David

 


Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread Anthony Liguori

On 12/14/2010 01:54 PM, David S. Ahern wrote:


On 12/14/10 12:49, Anthony Liguori wrote:
   

But that doesn't tell you what the impact is in real world workloads.
Before we start pushing all device emulation into the kernel, we need to
quantify how often gettimeofday() is really called in real workloads.
 

The workload that inspired that example program at its current max load
calls gtod upwards of 1000 times per second. The overhead of
gettimeofday was the biggest factor when comparing performance to bare
metal and ESX. That's why I wrote the test program --- it boils a complex
product/program down to a single system call.
   


So the absolute performance impact was on the order of what?

The difference in CPU time of a lightweight vs. heavyweight exit 
should be something like 2-3us.  That would mean 2-3ms of CPU time at a 
rate of 1000 per second.


That should be pretty much in the noise.
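
Spelled out, using Anthony's own (estimated) figures:

2.5 us saved per exit x 1000 exits/s = 2.5 ms/s, i.e. roughly 0.25% of one CPU.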

There are possibly second-order effects that might make a large impact, 
such as contention on the qemu_mutex.  It's worth doing 
experimentation to see whether a non-mutex-acquiring fast path in 
userspace would also result in a significant performance boost.


Regards,

Anthony Liguori




Re: [RFC 0/4] KVM in-kernel PM Timer implementation

2010-12-14 Thread David S. Ahern


On 12/14/10 14:46, Anthony Liguori wrote:
 On 12/14/2010 01:54 PM, David S. Ahern wrote:

 On 12/14/10 12:49, Anthony Liguori wrote:
   
 But that doesn't tell you what the impact is in real world workloads.
 Before we start pushing all device emulation into the kernel, we need to
 quantify how often gettimeofday() is really called in real workloads.
  
 The workload that inspired that example program at its current max load
 calls gtod upwards of 1000 times per second. The overhead of
 gettimeofday was the biggest factor when comparing performance to bare
 metal and ESX. That's why I wrote the test program --- it boils a complex
 product/program down to a single system call.

 
 So the absolute performance impact was on the order of what?

At the time I did the investigations (18-24 months ago) KVM was on the
order of 15-20% worse for a RHEL4-based workload, and the overhead
appeared to be due to the PIT or PM timer as the clock source. Switching
the clock to the TSC brought the performance on par with bare metal, but
that route has other issues.

 
 The difference in CPU time of a lightweight vs. heavyweight exit
 should be something like 2-3us.  That would mean 2-3ms of CPU time at a
 rate of 1000 per second.

The PIT causes 3 VMEXITs for each gettimeofday (get_offset_pit in RHEL4):

/* timer count may underflow right here */
outb_p(0x00, PIT_MODE); /* latch the count ASAP */
...
count = inb_p(PIT_CH0); /* read the latched count */
...
count |= inb_p(PIT_CH0) << 8;
...
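
For contrast, an ACPI PM Timer read is in principle a single 32-bit IN
(one VMEXIT per reading instead of three), though some kernels re-read
the counter to guard against buggy chipsets. A sketch (mine, not the
RHEL4 source; pmtmr_ioport is the port advertised in the FADT):

u32 pmtmr_read(u16 pmtmr_ioport)
{
	return inl(pmtmr_ioport) & 0xffffff;	/* mask to 24-bit counter width */
}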


David


 