On 16/05/2023 8:58 am, Jan Beulich wrote:
> On 15.05.2023 16:42, Andrew Cooper wrote:
>> Right now, dom0's feature configuration is split between between the common
>> path and a dom0-specific one.  This mostly is by accident, and causes some
>> very subtle bugs.
>>
>> First, start by clearly defining init_dom0_cpuid_policy() to be the domain
>> that Xen builds automatically.  The late hwdom case is still constructed in a
>> mostly normal way, with the control domain having full discretion over the 
>> CPU
>> policy.
>>
>> Identifying this highlights a latent bug - the two halves of the 
>> MSR_ARCH_CAPS
>> bodge are asymmetric with respect to the hardware domain.  This means that
>> shim, or a control-only dom0 sees the MSR_ARCH_CAPS CPUID bit but none of the
>> MSR content.  This in turn declares the hardware to be retpoline-safe by
>> failing to advertise the {R,}RSBA bits appropriately.  Restrict this logic to
>> the hardware domain, although the special case will cease to exist shortly.
>>
>> For the CPUID Faulting adjustment, the comment in ctxt_switch_levelling()
>> isn't actually relevant.  Provide a better explanation.
>>
>> Move the recalculate_cpuid_policy() call outside of the dom0-cpuid= case.
>> This is no change for now, but will become necessary shortly.
>>
>> Finally, place the second half of the MSR_ARCH_CAPS bodge after the
>> recalculate_cpuid_policy() call.  This is necessary to avoid transiently
>> breaking the hardware domain's view while the handling is cleaned up.  This
>> special case will cease to exist shortly.
>>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
> Reviewed-by: Jan Beulich <jbeul...@suse.com>

Thanks.

> with one question / suggestion:
>
>> @@ -858,7 +839,7 @@ void __init init_dom0_cpuid_policy(struct domain *d)
>>       * so dom0 can turn off workarounds as appropriate.  Temporary, until 
>> the
>>       * domain policy logic gains a better understanding of MSRs.
>>       */
>> -    if ( cpu_has_arch_caps )
>> +    if ( is_hardware_domain(d) && cpu_has_arch_caps )
>>          p->feat.arch_caps = true;
> As a result of this, ...
>
>> @@ -876,8 +857,32 @@ void __init init_dom0_cpuid_policy(struct domain *d)
>>          }
>>  
>>          x86_cpu_featureset_to_policy(fs, p);
>> +    }
>> +
>> +    /*
>> +     * PV Control domains used to require unfiltered CPUID.  This was fixed 
>> in
>> +     * Xen 4.13, but there is an cmdline knob to restore the prior 
>> behaviour.
>> +     *
>> +     * If the domain is getting unfiltered CPUID, don't let the guest kernel
>> +     * play with CPUID faulting either, as Xen's CPUID path won't cope.
>> +     */
>> +    if ( !opt_dom0_cpuid_faulting && is_control_domain(d) && 
>> is_pv_domain(d) )
>> +        p->platform_info.cpuid_faulting = false;
>>  
>> -        recalculate_cpuid_policy(d);
>> +    recalculate_cpuid_policy(d);
>> +
>> +    if ( is_hardware_domain(d) && cpu_has_arch_caps )
> ... it would feel slightly more logical if p->feat.arch_caps was used here.
> Whether that's to replace the entire condition or merely the right side of
> the && depends on what the subsequent changes require (which I haven't
> looked at yet).

I'd really prefer to leave it as-is.

You're likely right, but this entire block is deleted in patch 6 and it
has been a giant pain already making this series bisectable WRT all our
special cases.

For the sake of a few patches, it's safer to go with it in the
pre-existing form.

~Andrew

Reply via email to