On 05/06/2023 8:50 am, Jan Beulich wrote:
> On 05.06.2023 09:48, Jan Beulich wrote:
>> On 02.06.2023 15:57, Andrew Cooper wrote:
>>> On 02/06/2023 10:56 am, Jan Beulich wrote:
>>>> On 01.06.2023 16:48, Andrew Cooper wrote:
>>>>> @@ -593,15 +596,85 @@ static bool __init retpoline_calculations(void)
>>>>>          return false;
>>>>>  
>>>>>      /*
>>>>> -     * RSBA may be set by a hypervisor to indicate that we may move to a
>>>>> -     * processor which isn't retpoline-safe.
>>>>> +     * The meaning of the RSBA and RRSBA bits have evolved over time.  
>>>>> The
>>>>> +     * agreed upon meaning at the time of writing (May 2023) is thus:
>>>>> +     *
>>>>> +     * - RSBA (RSB Alternative) means that an RSB may fall back to an
>>>>> +     *   alternative predictor on underflow.  Skylake uarch and later 
>>>>> all have
>>>>> +     *   this property.  Broadwell too, when running microcode versions 
>>>>> prior
>>>>> +     *   to Jan 2018.
>>>>> +     *
>>>>> +     * - All eIBRS-capable processors suffer RSBA, but eIBRS also 
>>>>> introduces
>>>>> +     *   tagging of predictions with the mode in which they were 
>>>>> learned.  So
>>>>> +     *   when eIBRS is active, RSBA becomes RRSBA (Restricted RSBA).
>>>>> +     *
>>>>> +     * - CPUs are not expected to enumerate both RSBA and RRSBA.
>>>>> +     *
>>>>> +     * Some parts (Broadwell) are not expected to ever enumerate this
>>>>> +     * behaviour directly.  Other parts have differing enumeration with
>>>>> +     * microcode version.  Fix up Xen's idea, so we can advertise them 
>>>>> safely
>>>>> +     * to guests, and so toolstacks can level a VM safety for migration.
>>>>> +     *
>>>>> +     * The following states exist:
>>>>> +     *
>>>>> +     * |   | RSBA | EIBRS | RRSBA | Notes              | Action        |
>>>>> +     * |---+------+-------+-------+--------------------+---------------|
>>>>> +     * | 1 |    0 |     0 |     0 | OK (older parts)   | Maybe +RSBA   |
>>>>> +     * | 2 |    0 |     0 |     1 | Broken             | +RSBA, -RRSBA |
>>>>> +     * | 3 |    0 |     1 |     0 | OK (pre-Aug ucode) | +RRSBA        |
>>>>> +     * | 4 |    0 |     1 |     1 | OK                 |               |
>>>>> +     * | 5 |    1 |     0 |     0 | OK                 |               |
>>>>> +     * | 6 |    1 |     0 |     1 | Broken             | -RRSBA        |
>>>>> +     * | 7 |    1 |     1 |     0 | Broken             | -RSBA, +RRSBA |
>>>>> +     * | 8 |    1 |     1 |     1 | Broken             | -RSBA         |
>>>>>       *
>>>>> +     * However, we doesn't need perfect adherence to the spec.  Identify 
>>>>> the
>>>>> +     * broken cases (so we stand a chance of spotting violated 
>>>>> assumptions),
>>>>> +     * and fix up Rows 1 and 3 so Xen can use RSBA || RRSBA to identify
>>>>> +     * "alternative predictors potentially in use".
>>>> Considering that it's rows 2, 6, 7, and 8 which are broken, I find this
>>>> comment a little misleading. To me it doesn't become clear whether them
>>>> subsequently being left alone (and merely a log message issued) is
>>>> intentional.
>>> It is intentional.
>>>
>>> I don't know if these combinations exist in practice anywhere or not. 
>>> Intel think they oughtn't to, and it's quite possible that the printk()
>>> is unreachable, but given the complexity and shifting meanings over time
>>> here, I think it would be unwise to simply assume this to be true.
>> I agree.
> Thinking of it - would we perhaps want to go a step further an taint the
> system in such a case? I would then view this as kind of "Xen not
> (security) supported on this hardware." Until we manage to fix (or work
> around) the issue.

'S' for out-of-spec seems like it fits the bill.

In fact, that can also be used for the CET vs MSR_SPEC_CTRL cross-check
at the start of init_speculation_mitigations().

~Andrew

Reply via email to