Re: [libvirt] TSC scaling interface to management

2012-09-28 Thread Marcelo Tosatti
On Tue, Sep 25, 2012 at 11:08:58AM +0100, Daniel P. Berrange wrote:
 On Wed, Sep 12, 2012 at 12:39:39PM -0300, Marcelo Tosatti wrote:
  
  
  HW TSC scaling is a feature of AMD processors that allows a
  multiplier to be specified to the TSC frequency exposed to the guest.
  
  KVM also contains provision to trap TSC (KVM: Infrastructure for
  software and hardware based TSC rate scaling cc578287e3224d0da)
  or advance TSC frequency.
  
  This is useful when migrating to a host with different frequency and
  the guest is possibly using direct RDTSC instructions for purposes
  other than measuring cycles (that is, it previously calculated
  cycles-per-second, and uses that information which is stale after
  migration).
  
  qemu-x86: Set tsc_khz in kvm when supported (e7429073ed1a76518)
  added support for tsc_khz= option in QEMU.
  
  I am proposing the following changes so that management applications
  can work with this:
  
  1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
  option). Host means that QEMU is responsible for retrieving the 
  TSC frequency of the host processor and use that.
  Management application does not have to deal with the burden.
 
 FYI, libvirt already has support for expressing a number of different
 TSC related config options, for support of Xen and VMWare's capabilities
 in this area. What we currently allow for is
 
    <timer name='tsc' frequency='NNN' mode='auto|native|emulate|smpsafe'/>
 
 In this context the frequency attribute provides the HZ value to
 provide to the guest.
 
   - auto == Emulate if TSC is unstable, else allow native TSC access
   - native == Always allow native TSC access
   - emulate == Always emulate TSC
   - smpsafe == Always emulate TSC, and interlock SMP

These options can be mapped into KVM if necessary (they can map to
tsc_khz=XXX or to the module options (unfortunately not per-guest ATM)).

  Therefore it appears that this tsc_khz=auto option can be specified
  only if the user specifies so (it can be a per-guest flag hidden
  in the management configuration/manual).
  
  Sending this email to gather suggestions (or objections)
  to this interface.
 
 
 Daniel
 -- 
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Karen had the suggestion to remove the burden of choice from the user,
which we can achieve by knowing whether or not the guest is using
a paravirtual clock.

The problem is that this opens a can of races: did migration happen before or
after the guest boot process enabled the paravirtual clock, and so on.

I suppose leaving the option to the user is fine: if you run an obscure
operating system which does not support a paravirtual clock, then it
must be dealt with specially (it's in the manual, no big deal).

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [libvirt] TSC scaling interface to management

2012-09-25 Thread Daniel P. Berrange
On Wed, Sep 12, 2012 at 12:39:39PM -0300, Marcelo Tosatti wrote:
 
 
 HW TSC scaling is a feature of AMD processors that allows a
 multiplier to be specified to the TSC frequency exposed to the guest.
 
 KVM also contains provision to trap TSC (KVM: Infrastructure for
 software and hardware based TSC rate scaling cc578287e3224d0da)
 or advance TSC frequency.
 
 This is useful when migrating to a host with different frequency and
 the guest is possibly using direct RDTSC instructions for purposes
 other than measuring cycles (that is, it previously calculated
 cycles-per-second, and uses that information which is stale after
 migration).
 
 qemu-x86: Set tsc_khz in kvm when supported (e7429073ed1a76518)
 added support for tsc_khz= option in QEMU.
 
 I am proposing the following changes so that management applications
 can work with this:
 
 1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
 option). Host means that QEMU is responsible for retrieving the 
 TSC frequency of the host processor and use that.
 Management application does not have to deal with the burden.

FYI, libvirt already has support for expressing a number of different
TSC related config options, for support of Xen and VMWare's capabilities
in this area. What we currently allow for is

   <timer name='tsc' frequency='NNN' mode='auto|native|emulate|smpsafe'/>

In this context the frequency attribute gives the value, in Hz, to
expose to the guest.

  - auto == Emulate if TSC is unstable, else allow native TSC access
  - native == Always allow native TSC access
  - emulate == Always emulate TSC
  - smpsafe == Always emulate TSC, and interlock SMP
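
As a rough illustration of the element described above, here is a minimal
Python sketch that builds such a timer element and rejects unknown modes.
The helper name tsc_timer_xml and its parameters are made up for this
example and are not part of libvirt; only the attribute names and the four
mode values come from the message.

```python
import xml.etree.ElementTree as ET

# Mode names exactly as listed in the message above.
TSC_MODES = ("auto", "native", "emulate", "smpsafe")


def tsc_timer_xml(frequency_hz=None, mode=None):
    """Return a <timer name='tsc' .../> element as a string.

    Hypothetical helper for illustration only; it is not a libvirt API.
    """
    if mode is not None and mode not in TSC_MODES:
        raise ValueError(f"unknown TSC mode: {mode!r}")
    timer = ET.Element("timer", {"name": "tsc"})
    if frequency_hz is not None:
        # The frequency attribute carries a value in Hz.
        timer.set("frequency", str(int(frequency_hz)))
    if mode is not None:
        timer.set("mode", mode)
    return ET.tostring(timer, encoding="unicode")


if __name__ == "__main__":
    print(tsc_timer_xml(frequency_hz=2_400_000_000, mode="native"))
```

Both attributes are optional in this sketch, so a caller can pin a frequency
without forcing a mode, or the other way round.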

 Therefore it appears that this tsc_khz=auto option can be specified
 only if the user specifies so (it can be a per-guest flag hidden
 in the management configuration/manual).
 
 Sending this email to gather suggestions (or objections)
 to this interface.


Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|


Re: [libvirt] TSC scaling interface to management

2012-09-23 Thread Dor Laor

On 09/23/2012 04:06 AM, Marcelo Tosatti wrote:

On Fri, Sep 21, 2012 at 11:30:31PM +0300, Dor Laor wrote:

On 09/21/2012 05:51 AM, Marcelo Tosatti wrote:

On Fri, Sep 21, 2012 at 12:02:46AM +0300, Dor Laor wrote:

On 09/12/2012 06:39 PM, Marcelo Tosatti wrote:



HW TSC scaling is a feature of AMD processors that allows a
multiplier to be specified to the TSC frequency exposed to the guest.

KVM also contains provision to trap TSC (KVM: Infrastructure for
software and hardware based TSC rate scaling cc578287e3224d0da)
or advance TSC frequency.

This is useful when migrating to a host with different frequency and
the guest is possibly using direct RDTSC instructions for purposes
other than measuring cycles (that is, it previously calculated
cycles-per-second, and uses that information which is stale after
migration).

qemu-x86: Set tsc_khz in kvm when supported (e7429073ed1a76518)
added support for tsc_khz= option in QEMU.

I am proposing the following changes so that management applications
can work with this:

1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
option). Host means that QEMU is responsible for retrieving the
TSC frequency of the host processor and use that.
Management application does not have to deal with the burden.

2) New subsection with tsc_khz value. Destination host should consult
supported features of running kernel and fail if feature is unsupported.


It is not necessary to use this tsc_khz setting with modern guests
using paravirtual clocks, or when it's known that applications make
proper use of the time interface provided by operating systems.

On the other hand, legacy applications or setups which require no
modification and correct operation while virtualized and make
use of RDTSC might need this.

Therefore it appears that this tsc_khz=auto option can be specified
only if the user specifies so (it can be a per-guest flag hidden
in the management configuration/manual).

Sending this email to gather suggestions (or objections)
to this interface.


I'm not sure I understand the exact difference between the offers.
We can define these 3 options:

1. Qemu/kvm won't make use of tsc scaling feature at all.
2. tsc scaling is used and we take the value either from the host or
from the live migration data that overrides the later for incoming.
As you've said, it should be passed through a sub section.
3. Manual setting of the value (uncommon).

Is there another option worth considering?
The questions is what should be the default. IMHO #2 is more
appropriate to serve as a default since we do expect tsc to change
between hosts.


Option 1. is more appropriate to serve as a default given that
modern guests make use of paravirt, as you have observed.


but you also observed that legacy applications that use rdtsc (even
over pv kernel) will still be affected by the physical tsc
frequency. Since I'm not aware of downside for using scaling, I
rather pick opt #2 as a default.


The downside is that, if your destination host does not support tsc
scaling, two possibilities arise:

1) destination tsc frequency > source tsc frequency: TSC trap
2) destination tsc frequency < source tsc frequency: TSC catchup

TSC trapping is not wanted, because it is slow.
This is the downside.

Note Intel does not support tsc scaling.


TSC scaling should not happen by default on CPU models that don't support
it. As you mention, it's too costly to emulate. At least that should be
the default: TSC scaling on when the processor supports it, and off
otherwise.


Eventually, if the processor supports scaling we should enable it and 
send the sub section on migration.



That is, tsc scaling is only required if the guest does direct RDTSC
on the expectation that the value won't change.


Cheers,
Dor




--
libvir-list mailing list
libvir-l...@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list





Re: [libvirt] TSC scaling interface to management

2012-09-22 Thread Marcelo Tosatti
On Fri, Sep 21, 2012 at 11:30:31PM +0300, Dor Laor wrote:
 On 09/21/2012 05:51 AM, Marcelo Tosatti wrote:
 On Fri, Sep 21, 2012 at 12:02:46AM +0300, Dor Laor wrote:
 On 09/12/2012 06:39 PM, Marcelo Tosatti wrote:
 
 
 HW TSC scaling is a feature of AMD processors that allows a
 multiplier to be specified to the TSC frequency exposed to the guest.
 
 KVM also contains provision to trap TSC (KVM: Infrastructure for
 software and hardware based TSC rate scaling cc578287e3224d0da)
 or advance TSC frequency.
 
 This is useful when migrating to a host with different frequency and
 the guest is possibly using direct RDTSC instructions for purposes
 other than measuring cycles (that is, it previously calculated
 cycles-per-second, and uses that information which is stale after
 migration).
 
 qemu-x86: Set tsc_khz in kvm when supported (e7429073ed1a76518)
 added support for tsc_khz= option in QEMU.
 
 I am proposing the following changes so that management applications
 can work with this:
 
 1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
 option). Host means that QEMU is responsible for retrieving the
 TSC frequency of the host processor and use that.
 Management application does not have to deal with the burden.
 
 2) New subsection with tsc_khz value. Destination host should consult
 supported features of running kernel and fail if feature is unsupported.
 
 
 It is not necessary to use this tsc_khz setting with modern guests
 using paravirtual clocks, or when it's known that applications make
 proper use of the time interface provided by operating systems.
 
 On the other hand, legacy applications or setups which require no
 modification and correct operation while virtualized and make
 use of RDTSC might need this.
 
 Therefore it appears that this tsc_khz=auto option can be specified
 only if the user specifies so (it can be a per-guest flag hidden
 in the management configuration/manual).
 
 Sending this email to gather suggestions (or objections)
 to this interface.
 
 I'm not sure I understand the exact difference between the offers.
 We can define these 3 options:
 
 1. Qemu/kvm won't make use of tsc scaling feature at all.
 2. tsc scaling is used and we take the value either from the host or
 from the live migration data that overrides the later for incoming.
 As you've said, it should be passed through a sub section.
 3. Manual setting of the value (uncommon).
 
 Is there another option worth considering?
 The questions is what should be the default. IMHO #2 is more
 appropriate to serve as a default since we do expect tsc to change
 between hosts.
 
 Option 1. is more appropriate to serve as a default given that
 modern guests make use of paravirt, as you have observed.
 
 but you also observed that legacy applications that use rdtsc (even
 over pv kernel) will still be affected by the physical tsc
 frequency. Since I'm not aware of downside for using scaling, I
 rather pick opt #2 as a default.

The downside is that, if your destination host does not support tsc
scaling, two possibilities arise:

1) destination tsc frequency > source tsc frequency: TSC trap
2) destination tsc frequency < source tsc frequency: TSC catchup

TSC trapping is not wanted, because it is slow.
This is the downside.

Note Intel does not support tsc scaling.
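
The trade-off above can be sketched as a small decision function. This is a
simplified model, not KVM's actual code: the function name tsc_strategy and
its return labels are invented for the example, and it assumes that a
destination faster than the source needs trapping while a slower one needs
catchup.

```python
def tsc_strategy(source_khz, dest_khz, hw_scaling):
    """Pick how a destination host could present source_khz to the guest.

    Hypothetical helper: a simplified model of the cases discussed in
    this thread, not a copy of KVM's logic.
    """
    if dest_khz == source_khz:
        return "native"        # nothing to correct
    if hw_scaling:
        return "hw-scaling"    # hardware multiplier, no trapping needed
    if dest_khz > source_khz:
        # Assumption: a faster host cannot be slowed down, so RDTSC is
        # trapped and emulated, which is the slow path.
        return "trap"
    # Assumption: a slower host lets the guest TSC be advanced to catch up.
    return "catchup"


if __name__ == "__main__":
    for dest in (2_000_000, 3_000_000, 1_500_000):
        print(dest, tsc_strategy(2_000_000, dest, hw_scaling=False))
```

Only the "trap" outcome carries the performance cost described above, which
is why enabling a fixed tsc_khz by default is a concern on hosts without
hardware scaling.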

 That is, tsc scaling is only required if the guest does direct RDTSC
 on the expectation that the value won't change.
 
 Cheers,
 Dor
 


Re: [libvirt] TSC scaling interface to management

2012-09-21 Thread Dor Laor

On 09/21/2012 05:51 AM, Marcelo Tosatti wrote:

On Fri, Sep 21, 2012 at 12:02:46AM +0300, Dor Laor wrote:

On 09/12/2012 06:39 PM, Marcelo Tosatti wrote:



HW TSC scaling is a feature of AMD processors that allows a
multiplier to be specified to the TSC frequency exposed to the guest.

KVM also contains provision to trap TSC (KVM: Infrastructure for
software and hardware based TSC rate scaling cc578287e3224d0da)
or advance TSC frequency.

This is useful when migrating to a host with different frequency and
the guest is possibly using direct RDTSC instructions for purposes
other than measuring cycles (that is, it previously calculated
cycles-per-second, and uses that information which is stale after
migration).

qemu-x86: Set tsc_khz in kvm when supported (e7429073ed1a76518)
added support for tsc_khz= option in QEMU.

I am proposing the following changes so that management applications
can work with this:

1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
option). Host means that QEMU is responsible for retrieving the
TSC frequency of the host processor and use that.
Management application does not have to deal with the burden.

2) New subsection with tsc_khz value. Destination host should consult
supported features of running kernel and fail if feature is unsupported.


It is not necessary to use this tsc_khz setting with modern guests
using paravirtual clocks, or when it's known that applications make
proper use of the time interface provided by operating systems.

On the other hand, legacy applications or setups which require no
modification and correct operation while virtualized and make
use of RDTSC might need this.

Therefore it appears that this tsc_khz=auto option can be specified
only if the user specifies so (it can be a per-guest flag hidden
in the management configuration/manual).

Sending this email to gather suggestions (or objections)
to this interface.


I'm not sure I understand the exact difference between the offers.
We can define these 3 options:

1. Qemu/kvm won't make use of tsc scaling feature at all.
2. tsc scaling is used and we take the value either from the host or
from the live migration data that overrides the later for incoming.
As you've said, it should be passed through a sub section.
3. Manual setting of the value (uncommon).

Is there another option worth considering?
The questions is what should be the default. IMHO #2 is more
appropriate to serve as a default since we do expect tsc to change
between hosts.


Option 1. is more appropriate to serve as a default given that
modern guests make use of paravirt, as you have observed.


But you also observed that legacy applications that use rdtsc (even over
a pv kernel) will still be affected by the physical tsc frequency. Since
I'm not aware of a downside to using scaling, I'd rather pick option #2
as the default.




That is, tsc scaling is only required if the guest does direct RDTSC
on the expectation that the value won't change.


Cheers,
Dor






Re: [libvirt] TSC scaling interface to management

2012-09-20 Thread Dor Laor

On 09/12/2012 06:39 PM, Marcelo Tosatti wrote:



HW TSC scaling is a feature of AMD processors that allows a
multiplier to be specified to the TSC frequency exposed to the guest.

KVM also contains provision to trap TSC (KVM: Infrastructure for
software and hardware based TSC rate scaling cc578287e3224d0da)
or advance TSC frequency.

This is useful when migrating to a host with different frequency and
the guest is possibly using direct RDTSC instructions for purposes
other than measuring cycles (that is, it previously calculated
cycles-per-second, and uses that information which is stale after
migration).

qemu-x86: Set tsc_khz in kvm when supported (e7429073ed1a76518)
added support for tsc_khz= option in QEMU.

I am proposing the following changes so that management applications
can work with this:

1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
option). Host means that QEMU is responsible for retrieving the
TSC frequency of the host processor and use that.
Management application does not have to deal with the burden.

2) New subsection with tsc_khz value. Destination host should consult
supported features of running kernel and fail if feature is unsupported.


It is not necessary to use this tsc_khz setting with modern guests
using paravirtual clocks, or when it's known that applications make
proper use of the time interface provided by operating systems.

On the other hand, legacy applications or setups which require no
modification and correct operation while virtualized and make
use of RDTSC might need this.

Therefore it appears that this tsc_khz=auto option can be specified
only if the user specifies so (it can be a per-guest flag hidden
in the management configuration/manual).

Sending this email to gather suggestions (or objections)
to this interface.


I'm not sure I understand the exact difference between the offers.
We can define these 3 options:

1. Qemu/kvm won't make use of tsc scaling feature at all.
2. tsc scaling is used and we take the value either from the host or
   from the live migration data, which overrides the host value on an
   incoming migration. As you've said, it should be passed through a
   sub section.
3. Manual setting of the value (uncommon).

Is there another option worth considering?
The question is what the default should be. IMHO #2 is more appropriate
to serve as a default since we do expect tsc to change between hosts.
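
To make the three options concrete, here is a minimal Python sketch of how
the effective guest frequency could be resolved under each one. The policy
names "off", "auto" and "manual" and the function effective_tsc_khz are
labels invented for this example; they are not QEMU options.

```python
def effective_tsc_khz(policy, host_khz, migrated_khz=None, manual_khz=None):
    """Resolve the guest TSC frequency in kHz, or None for no scaling.

    Hypothetical helper mirroring the three options listed above:
      "off"    -> option 1, scaling is not used at all
      "auto"   -> option 2, host value unless migration data overrides it
      "manual" -> option 3, explicit value supplied by the user
    """
    if policy == "off":
        return None
    if policy == "auto":
        # Incoming migration data takes precedence over the host value.
        return migrated_khz if migrated_khz is not None else host_khz
    if policy == "manual":
        if manual_khz is None:
            raise ValueError("manual policy needs an explicit frequency")
        return manual_khz
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    print(effective_tsc_khz("auto", host_khz=2_400_000))
    print(effective_tsc_khz("auto", host_khz=2_400_000, migrated_khz=2_000_000))
```

The open question in the thread is only which of these policies should be
the default, not how each one behaves.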


Cheers,
Dor


Re: [libvirt] TSC scaling interface to management

2012-09-20 Thread Marcelo Tosatti
On Fri, Sep 21, 2012 at 12:02:46AM +0300, Dor Laor wrote:
 On 09/12/2012 06:39 PM, Marcelo Tosatti wrote:
 
 
 HW TSC scaling is a feature of AMD processors that allows a
 multiplier to be specified to the TSC frequency exposed to the guest.
 
 KVM also contains provision to trap TSC (KVM: Infrastructure for
 software and hardware based TSC rate scaling cc578287e3224d0da)
 or advance TSC frequency.
 
 This is useful when migrating to a host with different frequency and
 the guest is possibly using direct RDTSC instructions for purposes
 other than measuring cycles (that is, it previously calculated
 cycles-per-second, and uses that information which is stale after
 migration).
 
 qemu-x86: Set tsc_khz in kvm when supported (e7429073ed1a76518)
 added support for tsc_khz= option in QEMU.
 
 I am proposing the following changes so that management applications
 can work with this:
 
 1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
 option). Host means that QEMU is responsible for retrieving the
 TSC frequency of the host processor and use that.
 Management application does not have to deal with the burden.
 
 2) New subsection with tsc_khz value. Destination host should consult
 supported features of running kernel and fail if feature is unsupported.
 
 
 It is not necessary to use this tsc_khz setting with modern guests
 using paravirtual clocks, or when it's known that applications make
 proper use of the time interface provided by operating systems.
 
 On the other hand, legacy applications or setups which require no
 modification and correct operation while virtualized and make
 use of RDTSC might need this.
 
 Therefore it appears that this tsc_khz=auto option can be specified
 only if the user specifies so (it can be a per-guest flag hidden
 in the management configuration/manual).
 
 Sending this email to gather suggestions (or objections)
 to this interface.
 
 I'm not sure I understand the exact difference between the offers.
 We can define these 3 options:
 
 1. Qemu/kvm won't make use of tsc scaling feature at all.
 2. tsc scaling is used and we take the value either from the host or
from the live migration data that overrides the later for incoming.
As you've said, it should be passed through a sub section.
 3. Manual setting of the value (uncommon).
 
 Is there another option worth considering?
 The questions is what should be the default. IMHO #2 is more
 appropriate to serve as a default since we do expect tsc to change
 between hosts.

Option 1. is more appropriate to serve as a default given that
modern guests make use of paravirt, as you have observed.

That is, tsc scaling is only required if the guest does direct RDTSC
on the expectation that the value won't change.
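
A small worked example of why that expectation matters: if the guest
calibrated cycles-per-second on the source host and keeps using that number
after migration, every interval it derives from raw RDTSC deltas is off by
the ratio of the two frequencies. The function name and the frequencies
below are made up for illustration.

```python
def apparent_seconds(real_seconds, calibrated_khz, actual_khz):
    """Seconds a guest computes from raw TSC deltas with a stale calibration.

    Hypothetical helper: the TSC really ticks at actual_khz, but the guest
    still divides by the calibrated_khz it measured before migration.
    """
    cycles = real_seconds * actual_khz * 1000     # cycles that really elapsed
    return cycles / (calibrated_khz * 1000)       # guest's stale conversion


if __name__ == "__main__":
    # Calibrated on a 2.0 GHz source, now running on a 3.0 GHz destination.
    print(apparent_seconds(10, calibrated_khz=2_000_000, actual_khz=3_000_000))
```

With matching frequencies the function returns the real interval unchanged,
which is the situation a fixed tsc_khz is meant to preserve.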

 Cheers,
 Dor