Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Andrew Cooper
On 30/11/15 16:00, Jan Beulich wrote:
 On 30.11.15 at 16:38,  wrote:
>> On 30/11/15 15:22, Jan Beulich wrote:
>> On 30.11.15 at 14:36,  wrote:
 On 30/11/15 11:30, Jan Beulich wrote:
> It's not well defined whether YMM register presence
> correlates to AVX, or is simply flagged by the respective XSTATE
> CPUID bit (or a mixture of both).
 It is indeed not well defined, which is what makes this area of
 functionality so hard to level safely.

> The minimal (and imo more natural) dependency is just the XSTATE bit.
 But it is wrong.

 Any VEX encoded SIMD operation unconditionally works on YMM state.  In
 the case that XMM registers are encoded with a VEX prefix, the upper 128
 bits of the YMM register are zeroed (SDM Vol 2, 2.3.10).  This is
 contrary to legacy SSE instructions which preserve the upper 128 bits.

 Therefore, FMA, FMA4 and XOP do have a strict dependency on AVX.
>>> No, if you really want to express it that way, you'll need feature
>>> flags derived from the XSTATE bits.
>> What? That is absurd.
> Sorry, but no, this is not absurd, this is what you can derive from the
> SDM without much guessing. There's nowhere the SDM makes any
> connection between FMA and AVX.

Intel Vol 1 14.5.3 "Detection of FMA" states:

Hardware support for FMA is indicated by CPUID.1:ECX.FMA[bit 12]=1.
Application Software must identify that hardware supports AVX, after
that it must also detect support for FMA by
CPUID.1:ECX.FMA[bit 12].


> The only connections it makes are OSXSAVE and XCR0[2:1], neither of which is 
> formally tied to AVX.

Actually, on further reading,

Intel SDM Vol 3, 2.6, Figure 2-8 states:

XCR0.AVX (bit 2): If 1, AVX instructions can be executed and the XSAVE
feature set can be used to manage the
upper halves of the YMM registers (YMM0-YMM15 in 64-bit mode; otherwise
YMM0-YMM7).

This means that bit 2 has dual meaning, and is not just YMM state.  This
does IMO provide a formal tie between AVX and XCRO[2].


I admit that the AMD manuals are far less prescriptive than the Intel. 
However, AMD Vol 3 1.9 "Encoding using the VEX and XOP Prefixes" draws
several conclusions, including:

VEX opcode maps 1–3 are also used to encode the FMA4 and FMA instructions

while the FMA/FMA4 instruction description states:

The destination is either an XMM register or a YMM register, as
determined by VEX.L. When the
destination is an XMM register (L = 0), bits [255:128] of the
corresponding YMM register are
cleared.

and also states that a #UD will occur if XCR0[2:1] != '11b', which is
sufficient indication of FMA/FMA4 having a direct link to AVX.


As for XOP, AMD Vol 4, 1.2.2.1 "XMM Register Destinations" states again
that either all YMM is specified, or the the upper 128 bits are cleared
if an XMM register is encoded, as well as each instruction description
specifying a #UD if XCR0[2:1] != '11b'.  This logically follows from the
history, where XOP ended up being all the SSE5 instructions which didn't
overlap with AVX.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Jan Beulich
>>> On 30.11.15 at 16:38,  wrote:
> On 30/11/15 15:22, Jan Beulich wrote:
> On 30.11.15 at 14:36,  wrote:
>>> On 30/11/15 11:30, Jan Beulich wrote:
 It's not well defined whether YMM register presence
 correlates to AVX, or is simply flagged by the respective XSTATE
 CPUID bit (or a mixture of both).
>>> It is indeed not well defined, which is what makes this area of
>>> functionality so hard to level safely.
>>>
 The minimal (and imo more natural) dependency is just the XSTATE bit.
>>> But it is wrong.
>>>
>>> Any VEX encoded SIMD operation unconditionally works on YMM state.  In
>>> the case that XMM registers are encoded with a VEX prefix, the upper 128
>>> bits of the YMM register are zeroed (SDM Vol 2, 2.3.10).  This is
>>> contrary to legacy SSE instructions which preserve the upper 128 bits.
>>>
>>> Therefore, FMA, FMA4 and XOP do have a strict dependency on AVX.
>> No, if you really want to express it that way, you'll need feature
>> flags derived from the XSTATE bits.
> 
> What? That is absurd.

Sorry, but no, this is not absurd, this is what you can derive from the
SDM without much guessing. There's nowhere the SDM makes any
connection between FMA and AVX. The only connections it makes
are OSXSAVE and XCR0[2:1], neither of which is formally tied to AVX.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Andrew Cooper
On 30/11/15 15:22, Jan Beulich wrote:
 On 30.11.15 at 14:36,  wrote:
>> On 30/11/15 11:30, Jan Beulich wrote:
>>> It's not well defined whether YMM register presence
>>> correlates to AVX, or is simply flagged by the respective XSTATE
>>> CPUID bit (or a mixture of both).
>> It is indeed not well defined, which is what makes this area of
>> functionality so hard to level safely.
>>
>>> The minimal (and imo more natural) dependency is just the XSTATE bit.
>> But it is wrong.
>>
>> Any VEX encoded SIMD operation unconditionally works on YMM state.  In
>> the case that XMM registers are encoded with a VEX prefix, the upper 128
>> bits of the YMM register are zeroed (SDM Vol 2, 2.3.10).  This is
>> contrary to legacy SSE instructions which preserve the upper 128 bits.
>>
>> Therefore, FMA, FMA4 and XOP do have a strict dependency on AVX.
> No, if you really want to express it that way, you'll need feature
> flags derived from the XSTATE bits.

What? That is absurd.

They depend on AVX.  I am not introducing a synthetic feature when the
real feature bits are correct.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Jan Beulich
>>> On 30.11.15 at 14:36,  wrote:
> On 30/11/15 11:30, Jan Beulich wrote:
>> It's not well defined whether YMM register presence
>> correlates to AVX, or is simply flagged by the respective XSTATE
>> CPUID bit (or a mixture of both).
> 
> It is indeed not well defined, which is what makes this area of
> functionality so hard to level safely.
> 
>> The minimal (and imo more natural) dependency is just the XSTATE bit.
> 
> But it is wrong.
> 
> Any VEX encoded SIMD operation unconditionally works on YMM state.  In
> the case that XMM registers are encoded with a VEX prefix, the upper 128
> bits of the YMM register are zeroed (SDM Vol 2, 2.3.10).  This is
> contrary to legacy SSE instructions which preserve the upper 128 bits.
> 
> Therefore, FMA, FMA4 and XOP do have a strict dependency on AVX.

No, if you really want to express it that way, you'll need feature
flags derived from the XSTATE bits.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Andrew Cooper
On 30/11/15 11:30, Jan Beulich wrote:
 On 30.11.15 at 12:10,  wrote:
>> On 30/11/15 11:08, Jan Beulich wrote:
>> On 30.11.15 at 11:46,  wrote:
 On 30/11/15 10:01, Jan Beulich wrote:
 On 27.11.15 at 16:05,  wrote:
>> On 27/11/15 11:05, Jan Beulich wrote:
>>> ... or when the guest has the XSAVE feature hidden by CPUID policy.
>>> Not doing so is at best confusing to guests.
>>>
>>> Signed-off-by: Jan Beulich 
>> These changes here are an improvement (so I don't object to taking them
>> ahead of my fullblown levelling series), but they are incomplete.
>>
>> xsaveopt, xsavec, xsetbv1, xsaves, avx and mpx depend on xsave.
>> fma, fma4, f16c, avx2 and xop depend on avx.
> I think the dependencies here are a little fuzzy, and hence I'd
> prefer us to not enforce more strict rules than are truly necessary:
>
> FMA: Neither Intel's SDM nor AMD's PM state any dependency on AVX.
>
> FMA4, XOP: AMD's PM doesn't state any dependency on AVX.
>
> AVX2: While Intel's SDM doesn't say so here either, I agree in this case.
>
> I.e. my view is that FMA{,4} and XOP are all pretty much
> independent of AVX, and they e.g. imply by themselves presence of
> YMM registers.
 The AVX feature means several things, and in this case support for VEX
 encoded instructions.
>>> Yet I don't think "VEX encoding" == "AVX". See the various VEX-
>>> encoded non-SIMD instructions.
>>>
 Per SDM Vol 2, Table 2-18, any VEX encoded instruction will
 unconditionally #UD fault if XCR0[2:1] != '11b' or CR0.OSXSAVE = 0.
>>> I.e. a dependency on XSAVE, but not on AVX.
>> XCR0[2:1] are the AVX and SSE bits.
> You mean the YMM and XMM ones (the latter one is  mis-named in
> our code base).

True - I will put it on my TODO list when making XSAVE state safe for
migration (which requires other changes in this area).

> It's not well defined whether YMM register presence
> correlates to AVX, or is simply flagged by the respective XSTATE
> CPUID bit (or a mixture of both).

It is indeed not well defined, which is what makes this area of
functionality so hard to level safely.

> The minimal (and imo more natural) dependency is just the XSTATE bit.

But it is wrong.

Any VEX encoded SIMD operation unconditionally works on YMM state.  In
the case that XMM registers are encoded with a VEX prefix, the upper 128
bits of the YMM register are zeroed (SDM Vol 2, 2.3.10).  This is
contrary to legacy SSE instructions which preserve the upper 128 bits.

Therefore, FMA, FMA4 and XOP do have a strict dependency on AVX.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Jan Beulich
>>> On 30.11.15 at 12:10,  wrote:
> On 30/11/15 11:08, Jan Beulich wrote:
> On 30.11.15 at 11:46,  wrote:
>>> On 30/11/15 10:01, Jan Beulich wrote:
>>> On 27.11.15 at 16:05,  wrote:
> On 27/11/15 11:05, Jan Beulich wrote:
>> ... or when the guest has the XSAVE feature hidden by CPUID policy.
>> Not doing so is at best confusing to guests.
>>
>> Signed-off-by: Jan Beulich 
> These changes here are an improvement (so I don't object to taking them
> ahead of my fullblown levelling series), but they are incomplete.
>
> xsaveopt, xsavec, xsetbv1, xsaves, avx and mpx depend on xsave.
> fma, fma4, f16c, avx2 and xop depend on avx.
 I think the dependencies here are a little fuzzy, and hence I'd
 prefer us to not enforce more strict rules than are truly necessary:

 FMA: Neither Intel's SDM nor AMD's PM state any dependency on AVX.

 FMA4, XOP: AMD's PM doesn't state any dependency on AVX.

 AVX2: While Intel's SDM doesn't say so here either, I agree in this case.

 I.e. my view is that FMA{,4} and XOP are all pretty much
 independent of AVX, and they e.g. imply by themselves presence of
 YMM registers.
>>> The AVX feature means several things, and in this case support for VEX
>>> encoded instructions.
>> Yet I don't think "VEX encoding" == "AVX". See the various VEX-
>> encoded non-SIMD instructions.
>>
>>> Per SDM Vol 2, Table 2-18, any VEX encoded instruction will
>>> unconditionally #UD fault if XCR0[2:1] != '11b' or CR0.OSXSAVE = 0.
>> I.e. a dependency on XSAVE, but not on AVX.
> 
> XCR0[2:1] are the AVX and SSE bits.

You mean the YMM and XMM ones (the latter one is  mis-named in
our code base). It's not well defined whether YMM register presence
correlates to AVX, or is simply flagged by the respective XSTATE
CPUID bit (or a mixture of both). The minimal (and imo more natural)
dependency is just the XSTATE bit.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Andrew Cooper
On 30/11/15 11:08, Jan Beulich wrote:
 On 30.11.15 at 11:46,  wrote:
>> On 30/11/15 10:01, Jan Beulich wrote:
>> On 27.11.15 at 16:05,  wrote:
 On 27/11/15 11:05, Jan Beulich wrote:
> ... or when the guest has the XSAVE feature hidden by CPUID policy.
> Not doing so is at best confusing to guests.
>
> Signed-off-by: Jan Beulich 
 These changes here are an improvement (so I don't object to taking them
 ahead of my fullblown levelling series), but they are incomplete.

 xsaveopt, xsavec, xsetbv1, xsaves, avx and mpx depend on xsave.
 fma, fma4, f16c, avx2 and xop depend on avx.
>>> I think the dependencies here are a little fuzzy, and hence I'd
>>> prefer us to not enforce more strict rules than are truly necessary:
>>>
>>> FMA: Neither Intel's SDM nor AMD's PM state any dependency on AVX.
>>>
>>> FMA4, XOP: AMD's PM doesn't state any dependency on AVX.
>>>
>>> AVX2: While Intel's SDM doesn't say so here either, I agree in this case.
>>>
>>> I.e. my view is that FMA{,4} and XOP are all pretty much
>>> independent of AVX, and they e.g. imply by themselves presence of
>>> YMM registers.
>> The AVX feature means several things, and in this case support for VEX
>> encoded instructions.
> Yet I don't think "VEX encoding" == "AVX". See the various VEX-
> encoded non-SIMD instructions.
>
>> Per SDM Vol 2, Table 2-18, any VEX encoded instruction will
>> unconditionally #UD fault if XCR0[2:1] != '11b' or CR0.OSXSAVE = 0.
> I.e. a dependency on XSAVE, but not on AVX.

XCR0[2:1] are the AVX and SSE bits.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Jan Beulich
>>> On 30.11.15 at 11:46,  wrote:
> On 30/11/15 10:01, Jan Beulich wrote:
> On 27.11.15 at 16:05,  wrote:
>>> On 27/11/15 11:05, Jan Beulich wrote:
 ... or when the guest has the XSAVE feature hidden by CPUID policy.
 Not doing so is at best confusing to guests.

 Signed-off-by: Jan Beulich 
>>> These changes here are an improvement (so I don't object to taking them
>>> ahead of my fullblown levelling series), but they are incomplete.
>>>
>>> xsaveopt, xsavec, xsetbv1, xsaves, avx and mpx depend on xsave.
>>> fma, fma4, f16c, avx2 and xop depend on avx.
>> I think the dependencies here are a little fuzzy, and hence I'd
>> prefer us to not enforce more strict rules than are truly necessary:
>>
>> FMA: Neither Intel's SDM nor AMD's PM state any dependency on AVX.
>>
>> FMA4, XOP: AMD's PM doesn't state any dependency on AVX.
>>
>> AVX2: While Intel's SDM doesn't say so here either, I agree in this case.
>>
>> I.e. my view is that FMA{,4} and XOP are all pretty much
>> independent of AVX, and they e.g. imply by themselves presence of
>> YMM registers.
> 
> The AVX feature means several things, and in this case support for VEX
> encoded instructions.

Yet I don't think "VEX encoding" == "AVX". See the various VEX-
encoded non-SIMD instructions.

> Per SDM Vol 2, Table 2-18, any VEX encoded instruction will
> unconditionally #UD fault if XCR0[2:1] != '11b' or CR0.OSXSAVE = 0.

I.e. a dependency on XSAVE, but not on AVX.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Andrew Cooper
On 30/11/15 10:01, Jan Beulich wrote:
 On 27.11.15 at 16:05,  wrote:
>> On 27/11/15 11:05, Jan Beulich wrote:
>>> ... or when the guest has the XSAVE feature hidden by CPUID policy.
>>> Not doing so is at best confusing to guests.
>>>
>>> Signed-off-by: Jan Beulich 
>> These changes here are an improvement (so I don't object to taking them
>> ahead of my fullblown levelling series), but they are incomplete.
>>
>> xsaveopt, xsavec, xsetbv1, xsaves, avx and mpx depend on xsave.
>> fma, fma4, f16c, avx2 and xop depend on avx.
> I think the dependencies here are a little fuzzy, and hence I'd
> prefer us to not enforce more strict rules than are truly necessary:
>
> FMA: Neither Intel's SDM nor AMD's PM state any dependency on AVX.
>
> FMA4, XOP: AMD's PM doesn't state any dependency on AVX.
>
> AVX2: While Intel's SDM doesn't say so here either, I agree in this case.
>
> I.e. my view is that FMA{,4} and XOP are all pretty much
> independent of AVX, and they e.g. imply by themselves presence of
> YMM registers.

The AVX feature means several things, and in this case support for VEX
encoded instructions.

Per SDM Vol 2, Table 2-18, any VEX encoded instruction will
unconditionally #UD fault if XCR0[2:1] != '11b' or CR0.OSXSAVE = 0.

FMA, FMA4 and XOP may only be VEX encoded, so do explicitly depend on
AVX support.

(Both the Intel and the AMD manuals are poor at explicitly listing
dependencies.  I have spent far too much time attempting to reverse
engineer the real dependency trees from the information provided.)

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-30 Thread Jan Beulich
>>> On 27.11.15 at 16:05,  wrote:
> On 27/11/15 11:05, Jan Beulich wrote:
>> ... or when the guest has the XSAVE feature hidden by CPUID policy.
>> Not doing so is at best confusing to guests.
>>
>> Signed-off-by: Jan Beulich 
> 
> These changes here are an improvement (so I don't object to taking them
> ahead of my fullblown levelling series), but they are incomplete.
> 
> xsaveopt, xsavec, xsetbv1, xsaves, avx and mpx depend on xsave.
> fma, fma4, f16c, avx2 and xop depend on avx.

I think the dependencies here are a little fuzzy, and hence I'd
prefer us to not enforce more strict rules than are truly necessary:

FMA: Neither Intel's SDM nor AMD's PM state any dependency on AVX.

FMA4, XOP: AMD's PM doesn't state any dependency on AVX.

AVX2: While Intel's SDM doesn't say so here either, I agree in this case.

I.e. my view is that FMA{,4} and XOP are all pretty much
independent of AVX, and they e.g. imply by themselves presence of
YMM registers.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-27 Thread Andrew Cooper
On 27/11/15 11:05, Jan Beulich wrote:
> ... or when the guest has the XSAVE feature hidden by CPUID policy.
> Not doing so is at best confusing to guests.
>
> Signed-off-by: Jan Beulich 

These changes here are an improvement (so I don't object to taking them
ahead of my fullblown levelling series), but they are incomplete.

xsaveopt, xsavec, xsetbv1, xsaves, avx and mpx depend on xsave.
fma, fma4, f16c, avx2 and xop depend on avx.

My levelling series actually introduces a dependency tree to properly
evaluate the knockon effects of disabling certain features.  I am unsure
whether it is worth your time to split xsave and avx here, but I
certainly wouldn't recommend making it any finer-grained at this stage.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

2015-11-27 Thread Jan Beulich
... or when the guest has the XSAVE feature hidden by CPUID policy.
Not doing so is at best confusing to guests.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -827,18 +827,22 @@ void pv_cpuid(struct cpu_user_regs *regs
 uint32_t a, b, c, d;
 struct vcpu *curr = current;
 struct domain *currd = curr->domain;
+bool_t guest_has_xsave = cpu_has_xsave;
 
 a = regs->eax;
-b = regs->ebx;
 c = regs->ecx;
-d = regs->edx;
 
 if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
 {
 unsigned int cpuid_leaf = a, sub_leaf = c;
 
-if ( !cpuid_hypervisor_leaves(a, c, &a, &b, &c, &d) )
-domain_cpuid(currd, a, c, &a, &b, &c, &d);
+if ( guest_has_xsave )
+{
+domain_cpuid(currd, 1, 0, &d, &d, &c, &d);
+if ( !(c & cpufeat_mask(X86_FEATURE_XSAVE)) )
+guest_has_xsave = 0;
+}
+domain_cpuid(currd, a, sub_leaf, &a, &b, &c, &d);
 
 switch ( cpuid_leaf )
 {
@@ -860,13 +864,12 @@ void pv_cpuid(struct cpu_user_regs *regs
 b = _eax + _ebx;
 }
 }
-goto xstate;
+break;
 }
 }
-goto out;
 }
-
-cpuid_count(a, c, &a, &b, &c, &d);
+else
+cpuid_count(a, c, &a, &b, &c, &d);
 
 if ( (regs->eax & 0x7fff) == 0x0001 )
 {
@@ -907,11 +910,11 @@ void pv_cpuid(struct cpu_user_regs *regs
 __clear_bit(X86_FEATURE_PDCM % 32, &c);
 __clear_bit(X86_FEATURE_PCID % 32, &c);
 __clear_bit(X86_FEATURE_DCA % 32, &c);
-if ( !cpu_has_xsave )
-{
-__clear_bit(X86_FEATURE_XSAVE % 32, &c);
-__clear_bit(X86_FEATURE_AVX % 32, &c);
-}
+if ( !guest_has_xsave )
+c &= ~(cpufeat_mask(X86_FEATURE_XSAVE) |
+   cpufeat_mask(X86_FEATURE_OSXSAVE) |
+   cpufeat_mask(X86_FEATURE_AVX) |
+   cpufeat_mask(X86_FEATURE_F16C));
 if ( !cpu_has_apic )
__clear_bit(X86_FEATURE_X2APIC % 32, &c);
 __set_bit(X86_FEATURE_HYPERVISOR % 32, &c);
@@ -921,7 +924,7 @@ void pv_cpuid(struct cpu_user_regs *regs
 if ( regs->_ecx == 0 )
 b &= (cpufeat_mask(X86_FEATURE_BMI1) |
   cpufeat_mask(X86_FEATURE_HLE)  |
-  cpufeat_mask(X86_FEATURE_AVX2) |
+  (guest_has_xsave ? cpufeat_mask(X86_FEATURE_AVX2) : 0) |
   cpufeat_mask(X86_FEATURE_BMI2) |
   cpufeat_mask(X86_FEATURE_ERMS) |
   cpufeat_mask(X86_FEATURE_RTM)  |
@@ -934,8 +937,7 @@ void pv_cpuid(struct cpu_user_regs *regs
 break;
 
 case XSTATE_CPUID:
-xstate:
-if ( !cpu_has_xsave )
+if ( !guest_has_xsave )
 goto unsupported;
 if ( regs->_ecx == 1 )
 {
@@ -966,6 +968,9 @@ void pv_cpuid(struct cpu_user_regs *regs
 __clear_bit(X86_FEATURE_SKINIT % 32, &c);
 __clear_bit(X86_FEATURE_WDT % 32, &c);
 __clear_bit(X86_FEATURE_LWP % 32, &c);
+if ( !guest_has_xsave )
+c &= ~(cpufeat_mask(X86_FEATURE_XOP) |
+   cpufeat_mask(X86_FEATURE_FMA4));
 __clear_bit(X86_FEATURE_NODEID_MSR % 32, &c);
 __clear_bit(X86_FEATURE_TOPOEXT % 32, &c);
 __clear_bit(X86_FEATURE_MWAITX % 32, &c);
@@ -989,7 +994,6 @@ void pv_cpuid(struct cpu_user_regs *regs
 break;
 }
 
- out:
 /* VPMU may decide to modify some of the leaves */
 vpmu_do_cpuid(regs->eax, &a, &b, &c, &d);
 



x86/PV: hide features dependent on XSAVE when booted with "no-xsave"

... or when the guest has the XSAVE feature hidden by CPUID policy.
Not doing so is at best confusing to guests.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -827,18 +827,22 @@ void pv_cpuid(struct cpu_user_regs *regs
 uint32_t a, b, c, d;
 struct vcpu *curr = current;
 struct domain *currd = curr->domain;
+bool_t guest_has_xsave = cpu_has_xsave;
 
 a = regs->eax;
-b = regs->ebx;
 c = regs->ecx;
-d = regs->edx;
 
 if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
 {
 unsigned int cpuid_leaf = a, sub_leaf = c;
 
-if ( !cpuid_hypervisor_leaves(a, c, &a, &b, &c, &d) )
-domain_cpuid(currd, a, c, &a, &b, &c, &d);
+if ( guest_has_xsave )
+{
+domain_cpuid(currd, 1, 0, &d, &d, &c, &d);
+if ( !(c & cpufeat_mask(X86_FEATURE_XSAVE)) )
+guest_has_xsave = 0;
+}
+domain_cpuid(currd, a, sub_leaf, &a, &b, &c, &d);
 
 switch ( cpuid_leaf )
 {
@@ -860,13 +864,12 @@ void pv_cpuid(struct cpu_user_regs *regs
 b = _eax + _ebx;
 }
 }
-goto xstate;
+break;
 }
 }
-goto out;
 }
-
-cpuid_cou