Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-12 Thread Jan Beulich
>>> On 12.06.17 at 16:02,  wrote:
> My original statement was "if the guest uses LBR/LER, then migration
> needs to be restricted to hardware with an identical LBR format".
> 
> You countered that, saying we could emulate LBR/LER as an alternative. 
> The implication here is that we could alter the LBR format via
> emulation, by cooking the value observed when the guest reads the LBR MSRs.
> 
> For the record, the formats are:
> 
> Software should query an architectural MSR IA32_PERF_CAPABILITIES[5:0]
> about the format of the address that is stored in the LBR stack. Four
> formats are defined by the following encoding:
> * 00B (32-bit record format) — Stores 32-bit offset in current CS of
> respective source/destination,
> * 01B (64-bit LIP record format) — Stores 64-bit linear address of
> respective source/destination,
> * 10B (64-bit EIP record format) — Stores 64-bit offset (effective
> address) of respective source/destination.
> * 11B (64-bit EIP record format) and Flags — Stores 64-bit offset
> (effective address) of respective source/destination. Misprediction info
> is reported in the upper bit of 'FROM' registers in the LBR stack. See
> LBR stack details below for flag support and definition.
> * 000100B (64-bit EIP record format), Flags and TSX — Stores 64-bit
> offset (effective address) of respective source/destination.
> Misprediction and TSX info are reported in the upper bits of ‘FROM’
> registers in the LBR stack.
> * 000101B (64-bit EIP record format), Flags, TSX, LBR_INFO — Stores
> 64-bit offset (effective address) of respective source/destination.
> Misprediction, TSX, and elapsed cycles since the last LBR update are
> reported in the LBR_INFO MSR stack.
> * 000110B (64-bit EIP record format), Flags, Cycles — Stores 64-bit
> linear address (CS.Base + effective address) of respective
> source/destination. Misprediction info is reported in the upper bits of
> 'FROM' registers in the LBR stack. Elapsed cycles since the
> last LBR update are reported in the upper 16 bits of the 'TO' registers
> in the LBR stack (see Section 17.6).
> 
> In general, I don't see any sensible way of being able to convert
> between these formats at the point of an RDMSR.

Hmm, I don't see a problem converting formats 3..6 to formats 0
or 2. I also don't think any misbehavior can possibly result when
converting 2 to 3 by simply always loading a fixed value into the
mis-prediction bit. Whether 2 can be converted sensibly to 4..6
would need to be determined. Format 1 clearly is the odd one out,
conversion to/from which would only be reasonable if we assumed
flat addressing everywhere (which obviously we can assume as
long as a guest stays in 64-bit mode).
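
A minimal sketch of the "only adjust the top bits" conversions discussed
above, in C. It assumes (my reading of the record layouts, not something
stated in this thread) that the branch address sits in bits 47:0, that
everything above it is sign extension, flags or a cycle count, and that the
misprediction flag is bit 63 of the 'FROM' register. The helper names are
hypothetical, and for format 6 the result only equals an effective address
under flat addressing.

```c
#include <stdint.h>

#define LBR_ADDR_BITS 48                          /* assumed: 4-level paging */
#define LBR_ADDR_MASK ((1ULL << LBR_ADDR_BITS) - 1)
#define LBR_MISPRED   (1ULL << 63)                /* assumed flag position */

/*
 * Formats 3..6 -> format 2: drop whatever is stored above bit 47 and
 * rebuild a canonical address by sign-extending bit 47.
 */
static uint64_t lbr_to_plain_eip(uint64_t raw)
{
    uint64_t addr = raw & LBR_ADDR_MASK;

    if (addr & (1ULL << (LBR_ADDR_BITS - 1)))
        addr |= ~LBR_ADDR_MASK;

    return addr;
}

/*
 * Format 2 -> format 3: the source record carries no misprediction info,
 * so a fixed value is loaded into the flag bit of the 'FROM' register.
 */
static uint64_t lbr_from_add_mispred(uint64_t eip, int fixed_mispred)
{
    uint64_t val = eip & ~LBR_MISPRED;

    return fixed_mispred ? (val | LBR_MISPRED) : val;
}
```

An RDMSR intercept would apply one of these to the hardware value before
handing it to the guest; the TSX flags and cycle counts of formats 4..6 are
simply lost in the first direction.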

It is also clear that format 6 won't survive the addition of 5-level
page tables, as there aren't enough bits to store a meaningful
cycle count.
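
The arithmetic behind that remark, as a small sketch. The only input I am
adding is that 5-level paging widens linear addresses from 48 to 57 bits;
the helper name is hypothetical.

```c
#include <stdint.h>

/* Bits left above a linear address of the given width in a 64-bit MSR. */
static unsigned int lbr_spare_bits(unsigned int linear_addr_bits)
{
    return 64 - linear_addr_bits;
}
```

With 48-bit addresses this leaves the 16 bits that format 6 uses for the
cycle count in the 'TO' registers; with 57-bit addresses only 7 remain.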

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-12 Thread Andrew Cooper
On 12/06/17 14:42, Jan Beulich wrote:
>>>> On 12.06.17 at 15:36,  wrote:
>> On 12/06/17 14:29, Jan Beulich wrote:
>>>>>> On 12.06.17 at 15:07,  wrote:
>>>> On 08/06/17 14:47, Jan Beulich wrote:
>>>>>>>> On 08.06.17 at 15:12,  wrote:
>>>>>> The `disable_migrate` field shall be dropped.  The concept of
>>>>>> migrateability is not boolean; it is a large spectrum, all of which
>>>>>> needs to be managed by the toolstack.  The simple case is picking the
>>>>>> common subset of features between the source and destination.  This
>>>>>> becomes more complicated e.g. if the guest uses LBR/LER, at which point
>>>>>> the toolstack needs to consider hardware with the same LBR/LER format
>>>>>> in addition to just the plain features.
>>>>> Not sure about this - by intercepting the MSR accesses to the involved
>>>>> MSRs, it would be possible to mimic the LBR/LER format expected by
>>>>> the guest even if different from that of the host.
>>>> LER yes, but how would you emulate LBR?
>>>>
>>>> You could set DBG_CTL.BTF/EFLAGS.TF and intercept #DB, but this would be
>>>> visible to the guest via pushf/popf.  It would also interfere with a
>>>> guest trying to single-step itself.
>>> I don't understand: LBR is an MSR just like LER, and hence the
>>> guest can't avoid using RDMSR to read its contents. If we
>>> intercept that read, we can give them whatever format is
>>> needed, without a need to intercept anything else. But maybe
>>> I'm not seeing what you're getting at.
>> To emulate it, we need to sample state at the point that the last
>> exception or branch happened.
>>
>> You can't reverse the current value in hardware at the point of the
>> guest reading the LBR MSR to the value it should have been under a
>> different format.
> Aren't we talking about correct (or at least unproblematic) top
> bits of the value only? In which case the actual address bits
> can be taken as is, and only the top bits need adjustment.

I'm completely confused.

My original statement was "if the guest uses LBR/LER, then migration
needs to be restricted to hardware with an identical LBR format".

You countered that, saying we could emulate LBR/LER as an alternative. 
The implication here is that we could alter the LBR format via
emulation, by cooking the value observed when the guest reads the LBR MSRs.

For the record, the formats are:

Software should query an architectural MSR IA32_PERF_CAPABILITIES[5:0]
about the format of the address that is stored in the LBR stack. Four
formats are defined by the following encoding:
* 00B (32-bit record format) — Stores 32-bit offset in current CS of
respective source/destination,
* 01B (64-bit LIP record format) — Stores 64-bit linear address of
respective source/destination,
* 10B (64-bit EIP record format) — Stores 64-bit offset (effective
address) of respective source/destination.
* 11B (64-bit EIP record format) and Flags — Stores 64-bit offset
(effective address) of respective source/destination. Misprediction info
is reported in the upper bit of 'FROM' registers in the LBR stack. See
LBR stack details below for flag support and definition.
* 000100B (64-bit EIP record format), Flags and TSX — Stores 64-bit
offset (effective address) of respective source/destination.
Misprediction and TSX info are reported in the upper bits of ‘FROM’
registers in the LBR stack.
* 000101B (64-bit EIP record format), Flags, TSX, LBR_INFO — Stores
64-bit offset (effective address) of respective source/destination.
Misprediction, TSX, and elapsed cycles since the last LBR update are
reported in the LBR_INFO MSR stack.
* 000110B (64-bit EIP record format), Flags, Cycles — Stores 64-bit
linear address (CS.Base + effective address) of respective
source/destination. Misprediction info is reported in the upper bits of
'FROM' registers in the LBR stack. Elapsed cycles since the
last LBR update are reported in the upper 16 bits of the 'TO' registers
in the LBR stack (see Section 17.6).

In general, I don't see any sensible way of being able to convert
between these formats at the point of an RDMSR.

~Andrew



Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-12 Thread Jan Beulich
>>> On 12.06.17 at 15:36,  wrote:
> On 12/06/17 14:29, Jan Beulich wrote:
>>>>> On 12.06.17 at 15:07,  wrote:
>>> On 08/06/17 14:47, Jan Beulich wrote:
>>>>>>> On 08.06.17 at 15:12,  wrote:
>>>>> The `disable_migrate` field shall be dropped.  The concept of
>>>>> migrateability is not boolean; it is a large spectrum, all of which
>>>>> needs to be managed by the toolstack.  The simple case is picking the
>>>>> common subset of features between the source and destination.  This
>>>>> becomes more complicated e.g. if the guest uses LBR/LER, at which point
>>>>> the toolstack needs to consider hardware with the same LBR/LER format
>>>>> in addition to just the plain features.
>>>> Not sure about this - by intercepting the MSR accesses to the involved
>>>> MSRs, it would be possible to mimic the LBR/LER format expected by
>>>> the guest even if different from that of the host.
>>> LER yes, but how would you emulate LBR?
>>>
>>> You could set DBG_CTL.BTF/EFLAGS.TF and intercept #DB, but this would be
>>> visible to the guest via pushf/popf.  It would also interfere with a
>>> guest trying to single-step itself.
>> I don't understand: LBR is an MSR just like LER, and hence the
>> guest can't avoid using RDMSR to read its contents. If we
>> intercept that read, we can give them whatever format is
>> needed, without a need to intercept anything else. But maybe
>> I'm not seeing what you're getting at.
> 
> To emulate it, we need to sample state at the point that the last
> exception or branch happened.
> 
> You can't reverse the current value in hardware at the point of the
> guest reading the LBR MSR to the value it should have been under a
> different format.

Aren't we talking about correct (or at least unproblematic) top
bits of the value only? In which case the actual address bits
can be taken as is, and only the top bits need adjustment.

Jan




Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-12 Thread Andrew Cooper
On 12/06/17 14:29, Jan Beulich wrote:
>>>> On 12.06.17 at 15:07,  wrote:
>> On 08/06/17 14:47, Jan Beulich wrote:
>>>>>> On 08.06.17 at 15:12,  wrote:
>>>> The `disable_migrate` field shall be dropped.  The concept of
>>>> migrateability is not boolean; it is a large spectrum, all of which
>>>> needs to be managed by the toolstack.  The simple case is picking the
>>>> common subset of features between the source and destination.  This
>>>> becomes more complicated e.g. if the guest uses LBR/LER, at which point
>>>> the toolstack needs to consider hardware with the same LBR/LER format
>>>> in addition to just the plain features.
>>> Not sure about this - by intercepting the MSR accesses to the involved
>>> MSRs, it would be possible to mimic the LBR/LER format expected by
>>> the guest even if different from that of the host.
>> LER yes, but how would you emulate LBR?
>>
>> You could set DBG_CTL.BTF/EFLAGS.TF and intercept #DB, but this would be
>> visible to the guest via pushf/popf.  It would also interfere with a
>> guest trying to single-step itself.
> I don't understand: LBR is an MSR just like LER, and hence the
> guest can't avoid using RDMSR to read its contents. If we
> intercept that read, we can give them whatever format is
> needed, without a need to intercept anything else. But maybe
> I'm not seeing what you're getting at.

To emulate it, we need to sample state at the point that the last
exception or branch happened.

You can't reverse the current value in hardware at the point of the
guest reading the LBR MSR to the value it should have been under a
different format.

~Andrew



Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-12 Thread Jan Beulich
>>> On 12.06.17 at 15:07,  wrote:
> On 08/06/17 14:47, Jan Beulich wrote:
>>>>> On 08.06.17 at 15:12,  wrote:
>>> The `disable_migrate` field shall be dropped.  The concept of migrateability
>>> is not boolean; it is a large spectrum, all of which needs to be managed by
>>> the toolstack.  The simple case is picking the common subset of features
>>> between the source and destination.  This becomes more complicated e.g. if 
>>> the
>>> guest uses LBR/LER, at which point the toolstack needs to consider hardware
>>> with the same LBR/LER format in addition to just the plain features.
>> Not sure about this - by intercepting the MSR accesses to the involved
>> MSRs, it would be possible to mimic the LBR/LER format expected by
>> the guest even if different from that of the host.
> 
> LER yes, but how would you emulate LBR?
> 
> You could set DBG_CTL.BTF/EFLAGS.TF and intercept #DB, but this would be
> visible to the guest via pushf/popf.  It would also interfere with a
> guest trying to single-step itself.

I don't understand: LBR is an MSR just like LER, and hence the
guest can't avoid using RDMSR to read its contents. If we
intercept that read, we can give them whatever format is
needed, without a need to intercept anything else. But maybe
I'm not seeing what you're getting at.

Jan




Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-12 Thread Andrew Cooper
On 09/06/17 13:24, Anshul Makkar wrote:
> On 08/06/2017 14:12, Andrew Cooper wrote:
>> Presented herewith is a plan for the final part of CPUID work, which
>> primarily covers better Xen/Toolstack interaction for configuring the
>> guests
>> CPUID policy.
>>
>> A PDF version of this document is available from:
>>
>> http://xenbits.xen.org/people/andrewcoop/cpuid-part-3.pdf
>>
>> There are a number of still-open questions, which I would appreciate
>> views
>> on.
>>
>> ~Andrew
>>
>>
>> # Proposal
>>
>> First and foremost, split the current **max\_policy** notion into
>> separate
>> **max** and **default** policies.  This allows for the provision of
>> features
>> which are unused by default, but may be opted in to, both at the
>> hypervisor
>> level and the toolstack level.
>>
>> At the hypervisor level, **max** constitutes all the features Xen can
>> use on
>> the current hardware, while **default** is the subset thereof which are
>> supported features, the features which the user has explicitly opted
>> in to,
>> and excluding any features the user has explicitly opted out of.
>>
>> A new `cpuid=` command line option shall be introduced, whose
>> internals are
>> generated automatically from the featureset ABI.  This means that all
>> features
>> added to `include/public/arch-x86/cpufeatureset.h` automatically gain
>> command
>> line control.  (RFC: The same top level option can probably be used for
>> non-feature CPUID data control, although I can't currently think of
>> any cases
>> where this would be used.  Also find a sensible way to express
>> 'available but
>> not to be used by Xen', as per the current `smep` and `smap` options.)
>>
>>
>> At the guest level, **max** constitutes all the features which can be
>> offered
>> to each type of guest on this hardware.  Derived from Xen's **default**
>> policy, it includes the supported features and explicitly opted in to
>> features, which are appropriate for the guest.
>>
>> The guests **default** policy is then derived from its **max**, and
>> includes
>> the supported features which are considered migration safe.  (RFC: This
>> distinction is rather fuzzy, but for example it wouldn't include
>> things like
>> ITSC by default, as that is likely to go wrong unless special care is
>> taken.)
>>
> Just from another perspective, what happens to the features which have
> been explicitly selected and are not migration safe? Do we consider
> them in the guest's default policy?

Explicitly selected where?

Explicit selection at the Xen level is for using experimental/preview
features, while explicit selection at the toolstack level is for both
experimental/preview features, and using features which require more
care wrt migration.

>
>> All global policies (Xen and guest, max and default) shall be made
>> available
>> to the toolstack, in a manner similar to the existing
> Instead of all, do you see any harm if we expose only the default
> policies of Xen and Guest to toolstack?

The entire point of this work is to provide the toolstack with enough
information to work correctly.  Hiding the max policies is not an
option, as it prevents the toolstack from being able to work out whether
it can offer non-default features or not.
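
To illustrate the point, a minimal sketch in C of the check a toolstack
would want to make. The `featureset` layout, its length and the helper
names are hypothetical; this is not the existing featureset ABI.

```c
#include <stdbool.h>
#include <stdint.h>

#define FS_WORDS 4   /* hypothetical featureset length, in 32-bit words */

struct featureset {
    uint32_t w[FS_WORDS];
};

/* Test one feature bit; bit must be below FS_WORDS * 32. */
static bool fs_test(const struct featureset *fs, unsigned int bit)
{
    return fs->w[bit / 32] & (1u << (bit % 32));
}

/*
 * A feature can be offered as a non-default opt-in only if the guest's
 * max policy contains it and the default policy does not.
 */
static bool can_opt_in(const struct featureset *max,
                       const struct featureset *def, unsigned int bit)
{
    return fs_test(max, bit) && !fs_test(def, bit);
}
```

With only the default policy visible, the first half of that condition
cannot be evaluated at all.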

>> _XEN\_SYSCTL\_get\_cpu\_featureset_ mechanism.  This allows decisions
>> to be
>> taken which include all CPUID data, not just the feature bitmaps.
>>
>> New _XEN\_DOMCTL\_{get,set}\_cpuid\_policy_ hypercalls will be
>> introduced,
>> which allows the toolstack to query and set the cpuid policy for a
>> specific
>> domain.  It shall supersede _XEN\_DOMCTL\_set\_cpuid_, and shall fail if
>> Xen is
>> unhappy with any aspect of the policy during auditing.
>>
>> When a domain is initially created, the appropriate guests
>> **default** policy
>> is duplicated for use.  When auditing, Xen shall audit the toolstacks
>> requested policy against the guests **max** policy.  This allows
>> experimental
>> features or non-migration-safe features to be opted in to, without those
>> features being imposed upon all guests automatically.
>
>>
>> The `disable_migrate` field shall be dropped.  The concept of
>> migrateability
>> is not boolean; it is a large spectrum, all of which needs to be
>> managed by
>> the toolstack.
> Can't this large spectrum result in a bool which can then be used for
> disable_migrate? Sorry, I can't see any value added by removing
> disable_migrate.

A spectrum is by definition not a single boolean.  What is unclear about
my argument here that disable_migrate is unfit for purpose?

~Andrew

>  The simple case is picking the common subset of features
>> between the source and destination.  This becomes more complicated
>> e.g. if the
>> guest uses LBR/LER, at which point the toolstack needs to consider
>> hardware
>> with the same LBR/LER format in addition to just the plain features.
>>
>> `disable_migrate` is currently only used to expose ITSC to guests,
>> but there
>> are cases where it is perfectly safe to migrate such a guest, if

Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-12 Thread Andrew Cooper
On 08/06/17 14:47, Jan Beulich wrote:
>>>> On 08.06.17 at 15:12,  wrote:
>> # Proposal
>>
>> First and foremost, split the current **max\_policy** notion into separate
>> **max** and **default** policies.  This allows for the provision of features
>> which are unused by default, but may be opted in to, both at the hypervisor
>> level and the toolstack level.
>>
>> At the hypervisor level, **max** constitutes all the features Xen can use on
>> the current hardware, while **default** is the subset thereof which are
>> supported features, the features which the user has explicitly opted in to,
>> and excluding any features the user has explicitly opted out of.
>>
>> A new `cpuid=` command line option shall be introduced, whose internals are
>> generated automatically from the featureset ABI.  This means that all 
>> features
>> added to `include/public/arch-x86/cpufeatureset.h` automatically gain command
>> line control.  (RFC: The same top level option can probably be used for
>> non-feature CPUID data control, although I can't currently think of any cases
>> where this would be used.  Also find a sensible way to express 'available but
>> not to be used by Xen', as per the current `smep` and `smap` options.)
> Especially for disabling individual features I'm not sure "cpuid=" is
> an appropriate name. After all CPUID is only a manifestation of
> behavior elsewhere, and hence we don't really want CPUID
> behavior be controlled, but behavior which CPUID output reflects.
> I can't, however, think of an alternative name I would consider
> more suitable.

I suppose I view it a little like "information contained within cpuid"=

I'm happy to use an alternative name if we can think of a better one,
but I definitely want a way to control every feature (rather than the
controls being ad-hoc), and don't want to introduce top level booleans
for each feature.

>
>> At the guest level, **max** constitutes all the features which can be offered
>> to each type of guest on this hardware.  Derived from Xen's **default**
>> policy, it includes the supported features and explicitly opted in to
>> features, which are appropriate for the guest.
> There's no provision here at all for features which hardware doesn't
> offer, but which we can emulate in a reasonable way (UMIP being
> the example I'd be thinking of right away). While perhaps this could
> be viewed to be covered by "explicitly opted in to features", I think
> it would be nice to make this explicit.

In this case, I'd include that within "the features which can be offered".

So far, there is only a single feature we emulate to guests without
hardware support, which is x2apic mode for HVM guests.

I should call this distinction out more clearly.

>
>> The guests **default** policy is then derived from its **max**, and includes
>> the supported features which are considered migration safe.  (RFC: This
>> distinction is rather fuzzy, but for example it wouldn't include things like
>> ITSC by default, as that is likely to go wrong unless special care is 
>> taken.)
> As per above I think the delta between max and default is larger
> than just migration-unsafe pieces. Iirc for UMIP we would mean to
> have it off by default at least in the case where emulation incurs
> side effects.

There is a lot of emulation overhead for UMIP on non-UMIP-capable
hardware.  I'd advocate for it needing to be opt-in at both the
hypervisor and toolstack level.  In general, I'd expect people to be
more wary of the added emulation than the information leak.

>
>> The `disable_migrate` field shall be dropped.  The concept of migrateability
>> is not boolean; it is a large spectrum, all of which needs to be managed by
>> the toolstack.  The simple case is picking the common subset of features
>> between the source and destination.  This becomes more complicated e.g. if 
>> the
>> guest uses LBR/LER, at which point the toolstack needs to consider hardware
>> with the same LBR/LER format in addition to just the plain features.
> Not sure about this - by intercepting the MSR accesses to the involved
> MSRs, it would be possible to mimic the LBR/LER format expected by
> the guest even if different from that of the host.

LER yes, but how would you emulate LBR?

You could set DBG_CTL.BTF/EFLAGS.TF and intercept #DB, but this would be
visible to the guest via pushf/popf.  It would also interfere with a
guest trying to single-step itself.

~Andrew



Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-09 Thread Anshul Makkar

On 08/06/2017 14:12, Andrew Cooper wrote:

> Presented herewith is a plan for the final part of CPUID work, which
> primarily covers better Xen/Toolstack interaction for configuring the guest's
> CPUID policy.
>
> A PDF version of this document is available from:
>
> http://xenbits.xen.org/people/andrewcoop/cpuid-part-3.pdf
>
> There are a number of still-open questions, which I would appreciate views
> on.
>
> ~Andrew
>
>
> # Proposal
>
> First and foremost, split the current **max\_policy** notion into separate
> **max** and **default** policies.  This allows for the provision of features
> which are unused by default, but may be opted in to, both at the hypervisor
> level and the toolstack level.
>
> At the hypervisor level, **max** constitutes all the features Xen can use on
> the current hardware, while **default** is the subset thereof which are
> supported features, the features which the user has explicitly opted in to,
> and excluding any features the user has explicitly opted out of.
>
> A new `cpuid=` command line option shall be introduced, whose internals are
> generated automatically from the featureset ABI.  This means that all features
> added to `include/public/arch-x86/cpufeatureset.h` automatically gain command
> line control.  (RFC: The same top level option can probably be used for
> non-feature CPUID data control, although I can't currently think of any cases
> where this would be used.  Also find a sensible way to express 'available but
> not to be used by Xen', as per the current `smep` and `smap` options.)
>
>
> At the guest level, **max** constitutes all the features which can be offered
> to each type of guest on this hardware.  Derived from Xen's **default**
> policy, it includes the supported features and explicitly opted in to
> features, which are appropriate for the guest.
>
> The guest's **default** policy is then derived from its **max**, and includes
> the supported features which are considered migration safe.  (RFC: This
> distinction is rather fuzzy, but for example it wouldn't include things like
> ITSC by default, as that is likely to go wrong unless special care is taken.)
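
The derivation chain described in the proposal can be sketched as plain
bitmap operations. This is a minimal C illustration, not Xen code: the
`featureset` layout and every function and mask name are hypothetical, and
it ignores features which are emulated without hardware support.

```c
#include <stdint.h>

#define FS_WORDS 4   /* hypothetical featureset length, in 32-bit words */

struct featureset {
    uint32_t w[FS_WORDS];
};

/*
 * Xen default: the supported features plus explicit opt-ins (both limited
 * to what max contains), minus explicit opt-outs.
 */
static void derive_xen_default(struct featureset *def,
                               const struct featureset *max,
                               const struct featureset *supported,
                               const struct featureset *opt_in,
                               const struct featureset *opt_out)
{
    for (unsigned int i = 0; i < FS_WORDS; i++)
        def->w[i] = max->w[i] & (supported->w[i] | opt_in->w[i]) &
                    ~opt_out->w[i];
}

/* Guest max: Xen's default, limited to what suits this type of guest. */
static void derive_guest_max(struct featureset *gmax,
                             const struct featureset *xen_def,
                             const struct featureset *guest_ok)
{
    for (unsigned int i = 0; i < FS_WORDS; i++)
        gmax->w[i] = xen_def->w[i] & guest_ok->w[i];
}

/* Guest default: the supported, migration-safe part of the guest max. */
static void derive_guest_default(struct featureset *gdef,
                                 const struct featureset *gmax,
                                 const struct featureset *supported,
                                 const struct featureset *migration_safe)
{
    for (unsigned int i = 0; i < FS_WORDS; i++)
        gdef->w[i] = gmax->w[i] & supported->w[i] & migration_safe->w[i];
}
```

An opted-in experimental feature therefore reaches the guest max but never
the guest default, so it has to be requested again at the toolstack level.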

Just from another perspective, what happens to the features which have
been explicitly selected and are not migration safe? Do we consider
them in the guest's default policy?



> All global policies (Xen and guest, max and default) shall be made available
> to the toolstack, in a manner similar to the existing

Instead of all, do you see any harm if we expose only the default
policies of Xen and Guest to toolstack?

> _XEN\_SYSCTL\_get\_cpu\_featureset_ mechanism.  This allows decisions to be
> taken which include all CPUID data, not just the feature bitmaps.
>
> New _XEN\_DOMCTL\_{get,set}\_cpuid\_policy_ hypercalls will be introduced,
> which allow the toolstack to query and set the cpuid policy for a specific
> domain.  It shall supersede _XEN\_DOMCTL\_set\_cpuid_, and shall fail if Xen is
> unhappy with any aspect of the policy during auditing.
>
> When a domain is initially created, the appropriate guest's **default** policy
> is duplicated for use.  When auditing, Xen shall audit the toolstack's
> requested policy against the guest's **max** policy.  This allows experimental
> features or non-migration-safe features to be opted in to, without those
> features being imposed upon all guests automatically.




> The `disable_migrate` field shall be dropped.  The concept of migrateability
> is not boolean; it is a large spectrum, all of which needs to be managed by
> the toolstack.

Can't this large spectrum result in a bool which can then be used for
disable_migrate? Sorry, I can't see any value added by removing
disable_migrate.

> The simple case is picking the common subset of features
> between the source and destination.  This becomes more complicated e.g. if the
> guest uses LBR/LER, at which point the toolstack needs to consider hardware
> with the same LBR/LER format in addition to just the plain features.
>
> `disable_migrate` is currently only used to expose ITSC to guests, but there
> are cases where it is perfectly safe to migrate such a guest, if the destination
> host has the same TSC frequency or hardware TSC scaling support.
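
As a rough sketch of what "managed by the toolstack" could look like for
the cases listed so far, here is a per-destination check in C. It follows
the proposal text as written (identical LBR format required); all structure
and field names are hypothetical, and the single-word featureset is a toy.

```c
#include <stdbool.h>
#include <stdint.h>

struct host_info {               /* hypothetical toolstack view of a host */
    uint64_t features;           /* toy single-word featureset */
    unsigned int lbr_format;     /* IA32_PERF_CAPABILITIES[5:0] */
    uint64_t tsc_khz;
    bool tsc_scaling;            /* hardware TSC scaling available */
};

struct guest_info {              /* hypothetical per-guest record */
    uint64_t features;           /* features exposed to the guest */
    bool uses_lbr;
    unsigned int lbr_format;     /* format the guest was started with */
    bool has_itsc;
    uint64_t tsc_khz;            /* TSC frequency the guest observes */
};

static bool can_migrate(const struct guest_info *g,
                        const struct host_info *dst)
{
    /* Simple case: every guest feature must exist on the destination. */
    if (g->features & ~dst->features)
        return false;

    /* LBR/LER in use: restrict to hardware with the same record format. */
    if (g->uses_lbr && g->lbr_format != dst->lbr_format)
        return false;

    /* ITSC: needs the same TSC frequency, or hardware TSC scaling. */
    if (g->has_itsc && g->tsc_khz != dst->tsc_khz && !dst->tsc_scaling)
        return false;

    return true;
}
```

The answer depends on the destination as well as on the guest, which is
why it cannot be stored as a single per-domain flag.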

> Finally, `disable_migrate` isn't (and cannot reasonably be) used to inhibit
> state gather operations, as this interferes with debugging and monitoring
> tasks.


Thanks
Anshul




Re: [Xen-devel] DESIGN: CPUID part 3

2017-06-08 Thread Jan Beulich
>>> On 08.06.17 at 15:12,  wrote:
> # Proposal
> 
> First and foremost, split the current **max\_policy** notion into separate
> **max** and **default** policies.  This allows for the provision of features
> which are unused by default, but may be opted in to, both at the hypervisor
> level and the toolstack level.
> 
> At the hypervisor level, **max** constitutes all the features Xen can use on
> the current hardware, while **default** is the subset thereof which are
> supported features, the features which the user has explicitly opted in to,
> and excluding any features the user has explicitly opted out of.
> 
> A new `cpuid=` command line option shall be introduced, whose internals are
> generated automatically from the featureset ABI.  This means that all features
> added to `include/public/arch-x86/cpufeatureset.h` automatically gain command
> line control.  (RFC: The same top level option can probably be used for
> non-feature CPUID data control, although I can't currently think of any cases
> where this would be used.  Also find a sensible way to express 'available but
> not to be used by Xen', as per the current `smep` and `smap` options.)

Especially for disabling individual features I'm not sure "cpuid=" is
an appropriate name. After all CPUID is only a manifestation of
behavior elsewhere, and hence we don't really want CPUID
behavior be controlled, but behavior which CPUID output reflects.
I can't, however, think of an alternative name I would consider
more suitable.

> At the guest level, **max** constitutes all the features which can be offered
> to each type of guest on this hardware.  Derived from Xen's **default**
> policy, it includes the supported features and explicitly opted in to
> features, which are appropriate for the guest.

There's no provision here at all for features which hardware doesn't
offer, but which we can emulate in a reasonable way (UMIP being
the example I'd be thinking of right away). While perhaps this could
be viewed to be covered by "explicitly opted in to features", I think
it would be nice to make this explicit.

> The guests **default** policy is then derived from its **max**, and includes
> the supported features which are considered migration safe.  (RFC: This
> distinction is rather fuzzy, but for example it wouldn't include things like
> ITSC by default, as that is likely to go wrong unless special care is 
> taken.)

As per above I think the delta between max and default is larger
than just migration-unsafe pieces. Iirc for UMIP we would mean to
have it off by default at least in the case where emulation incurs
side effects.

> The `disable_migrate` field shall be dropped.  The concept of migrateability
> is not boolean; it is a large spectrum, all of which needs to be managed by
> the toolstack.  The simple case is picking the common subset of features
> between the source and destination.  This becomes more complicated e.g. if the
> guest uses LBR/LER, at which point the toolstack needs to consider hardware
> with the same LBR/LER format in addition to just the plain features.

Not sure about this - by intercepting the MSR accesses to the involved
MSRs, it would be possible to mimic the LBR/LER format expected by
the guest even if different from that of the host.

Jan

