On 10/07/18 12:21, Richard Biener wrote:
> On Tue, Jul 10, 2018 at 12:53 PM Richard Earnshaw (lists)
> <richard.earns...@arm.com> wrote:
>>
>> On 10/07/18 11:10, Richard Biener wrote:
>>> On Tue, Jul 10, 2018 at 10:39 AM Richard Earnshaw (lists)
>>> <richard.earns...@arm.com> wrote:
>>>>
>>>> On 10/07/18 08:19, Richard Biener wrote:
>>>>> On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw
>>>>> <richard.earns...@arm.com> wrote:
>>>>>>
>>>>>>
>>>>>> The patches I posted earlier this year for mitigating against
>>>>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>>>>>> which it became obvious that a rethink was needed.  This mail, and the
>>>>>> following patches attempt to address that feedback and present a new
>>>>>> approach to mitigating against this form of attack surface.
>>>>>>
>>>>>> There were two major issues with the original approach:
>>>>>>
>>>>>> - The speculation bounds were too tightly constrained - essentially
>>>>>>   they had to represent and upper and lower bound on a pointer, or a
>>>>>>   pointer offset.
>>>>>> - The speculation constraints could only cover the immediately preceding
>>>>>>   branch, which often did not fit well with the structure of the existing
>>>>>>   code.
>>>>>>
>>>>>> An additional criticism was that the shape of the intrinsic did not
>>>>>> fit particularly well with systems that used a single speculation
>>>>>> barrier that essentially had to wait until all preceding speculation
>>>>>> had to be resolved.
>>>>>>
>>>>>> To address all of the above, these patches adopt a new approach, based
>>>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>>>> but which we have extended to deal with inter-function speculation.
>>>>>> The patches divide the problem into two halves.
>>>>>>
>>>>>> The first half is some target-specific code to track the speculation
>>>>>> condition through the generated code to provide an internal variable
>>>>>> which can tell us whether or not the CPU's control flow speculation
>>>>>> matches the data flow calculations.  The idea is that the internal
>>>>>> variable starts with the value TRUE and if the CPU's control flow
>>>>>> speculation ever causes a jump to the wrong block of code the variable
>>>>>> becomes false until such time as the incorrect control flow
>>>>>> speculation gets unwound.
>>>>>>
>>>>>> The second half is that a new intrinsic function is introduced that is
>>>>>> much simpler than we had before.  The basic version of the intrinsic
>>>>>> is now simply:
>>>>>>
>>>>>>       T var = __builtin_speculation_safe_value (T unsafe_var);
>>>>>>
>>>>>> Full details of the syntax can be found in the documentation patch, in
>>>>>> patch 1.  In summary, when not speculating the intrinsic returns
>>>>>> unsafe_var; when speculating then if it can be shown that the
>>>>>> speculative flow has diverged from the intended control flow then zero
>>>>>> is returned.  An optional second argument can be used to return an
>>>>>> alternative value to zero.  The builtin may cause execution to pause
>>>>>> until the speculation state is resolved.
>>>>>
>>>>> So a trivial target implementation would be to emit a barrier and then
>>>>> it would always return unsafe_var and never zero.  What I don't understand
>>>>> fully is what users should do here, thus what the value of ever returning
>>>>> "unsafe" is.  Also I wonder why the API is forcing you to single-out a
>>>>> special value instead of doing
>>>>>
>>>>>  bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
>>>>>  if (!safe)
>>>>>    /* what now? */
>>>>>
>>>>> I'm only guessing that the correct way to handle "unsafe" is basically
>>>>>
>>>>>  while (__builtin_speculation_safe_value (val) == 0)
>>>>>     ;
>>>>>
>>>>> use val, it's now safe
>>>>
>>>> No, a safe version of val is returned, not a bool telling you it is now
>>>> safe to use the original.
>>>
>>> OK, so making the old value dead is required to preserve the desired
>>> dataflow.
>>>
>>> But how should I use the special value that signaled "failure"?
>>>
>>> Obviously the user isn't supposed to simply replace 'val' with
>>>
>>>  val = __builtin_speculation_safe_value (val);
>>>
>>> to make it speculation-proof.  So - how should the user _use_ this
>>> builtin?  The docs do not say anything about this but says the
>>> very confusing
>>>
>>> +The function may use target-dependent speculation tracking state to cause
>>> +@var{failval} to be returned when it is known that speculative
>>> +execution has incorrectly predicted a conditional branch operation.
>>>
>>> because speculation is about executing instructions as if they were
>>> supposed to be executed.  Once it is known the prediciton was wrong
>>> no more "wrong" instructions will be executed but a previously
>>> speculated instruction cannot know it was "falsely" speculated.
>>>
>>> Does the above try to say that the function may return failval if the
>>> instruction is currently executed speculatively instead?  That would
>>> make sense to me.  And return failval independent of if the speculation
>>> later turns out to be correct or not.
>>>
>>>>  You must use the sanitized version in future,
>>>> not the unprotected version.
>>>>
>>>>
>>>> So the usage is going to be more like:
>>>>
>>>> val = __builtin_speculation_safe_value (val);  // Overwrite val with a
>>>> sanitized version.
>>>>
>>>> You have to use the cleaned up version, the unclean version is still
>>>> vulnerable to incorrect speculation.
>>>
>>> But then where does failval come into play?  The above cannot be correct
>>> if failval matters.
>>
>> failval only comes into play if the CPU is speculating incorrectly.
>> It's just a safe value that some CPUs might use until such time as the
>> speculation gets unwound.  In most cases it can be zero.  Perhaps a more
>> concrete example would be something like.
>>
>>
>> void **mem;
>>
>> void* f(unsigned untrusted)
>> {
>>   if (untrusted < 100)
>>     return mem[untrusted];
>>   return NULL;
>> }
>>
>> This can be rewritten as either:
>>
>> void *mem;
>>
>> void* f(unsigned untrusted)
>> {
>>   if (untrusted < 100)
>>     return mem[__builtin_speculation_safe_value (untrusted)];
>>   return NULL;
>> }
>>
>> if leaking mem[0] is not a problem; or if you're really paranoid:
>>
>> void* f(unsigned untrusted)
>> {
>>   if (untrusted < 100)
>>     return *__builtin_speculation_safe_value (mem + untrusted);
>>   return NULL;
>> }
>>
>> which becomes a NULL pointer dereference if the speculation is
>> incorrect.  The programmer doesn't need to test this code (any test
>> would likely be useless anyway as the CPU would predict the outcome and
>> speculate based on such a prediction).  They do need to think a bit
>> about what might leak if the CPU does miss-speculate, hence the two
>> examples above.
>>
>> A more complex case, where you might want a different failure value can
>> come up if you have a mask operation.  A slightly convoluted example
>> (derived from a real example we found in the Linux kernel) might be
>>
>> if (mode32)
>>   mask_from = 0x100000000;
>> else
>>   mask_from = 0;
>>
>> ... // some code
>>
>> return mem[untrusted_val & (mask_from - 1)];
>>
>> It would probably be better to place the speculation cleanup nearer the
>> initialization of mask_from, especially if mask_from could be used more
>> than once; but in that case 0 is less safe than any other (since 0-1 is
>> all bits 1).  In this case you might write:
>>
>> if (mode32)
>>   mask_from = 0x100000000;
>> else
>>   mask_from = 0;
>>
>> mask_from = __builtin_speculation_safe_value (mask_from, 1);
>>
>> ... // some code
>>
>> return mem[untrusted_val & (mask_from - 1)];
>>
>> And in this case if the speculation is incorrect the (mask_from - 1)
>> expression becomes 0 and the whole range is protected.
>>
>> Note that speculation is fine if it is correct, we don't want to hold
>> things up in that case since it will happen anyway.  It's only a problem
>> if the speculation is incorrect :-)
> 
> Ok, please amend at least the user-level documentation with one or two
> examples like above.
> 
>>
>>>
>>> Does the builtin try to say that iff the current instruction is speculated
>>> then it will return failval, otherwise it will return val and thus following
>>> code needs to handle 'failval' in a way that is safe?
>>
>> No.  From an abstract machine sense the builtin only ever returns its
>> first argument.  The fail value only appears if the CPU guesses the
>> direction of a branch and guesses *incorrectly* (incorrect speculation)
> 
> I understand how it works when the implementation issues a fence.
> How does the actual implementation (assembly) look like that makes
> the CPU know whether it speculated wrongly or not at the time the
> value is needed for further speculation given the *__bsfs (mem + offset)
> instruction _is_ speculated as well.
> 

If you've not already read it, then I suggest you read
https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/download-the-whitepaper
and the link I posted in the original mail to Chandler's posting on this
subject.

In essence, we start with a register (tracker) initialized to ~0.  At
every conditional branch we change

        b.cond tgt
        ...
tgt:
        ...

into

        b.cond tgt
        // tracker == ~cond ? tracker, 0
        csel    tracker, tracker, 0, ~cond
        ...
tgt:
        // tracker == cond ? tracker, 0
        csel    tracker, tracker, 0, cond


Now when we need to inhibit speculation we emit

        and     safe_val, unsafe_val, tracker
        CSDB

Assuming we want the default result of 0 if speculating unsafely.  CSDB
is a light-weight barrier that ensures that all preceding
/data-processing/ instructions are using real results from earlier
instructions: it will resolve any data-flow speculation, but not
necessarily any control-flow speculation.

Obviously the inserted instructions only appear on the edges between the
blocks, but we take care of that during the insertion process.

R.

> Richard.
> 
>>>
>>> Again, the wording "when it is known that speculative execution has
>>> incorrectly predicted a conditional branch operation" is very confusing...
>>>
>>> Richard.
>>>
>>>> R.
>>>>
>>>>>
>>>>> that is, the return value is only interesting in sofar as to whether it 
>>>>> is equal
>>>>> to val or the special value?
>>>>>
>>>>> That said, I wonder why we don't hide that details from the user and
>>>>> provide a predicate instead.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> There are seven patches in this set, as follows.
>>>>>>
>>>>>> 1) Introduces the new intrinsic __builtin_sepculation_safe_value.
>>>>>> 2) Adds a basic hard barrier implementation for AArch32 (arm) state.
>>>>>> 3) Adds a basic hard barrier implementation for AArch64 state.
>>>>>> 4) Adds a new command-line option -mtrack-speculation (currently a 
>>>>>> no-op).
>>>>>> 5) Disables CB[N]Z and TB[N]Z when -mtrack-speculation.
>>>>>> 6) Adds the new speculation tracking pass for AArch64
>>>>>> 7) Uses the new speculation tracking pass to generate CSDB-based barrier
>>>>>>    sequences
>>>>>>
>>>>>> I haven't added a speculation-tracking pass for AArch32 at this time.
>>>>>> It is possible to do this, but would require quite a lot of rework for
>>>>>> the arm backend due to the limited number of registers that are
>>>>>> available.
>>>>>>
>>>>>> Although patch 6 is AArch64 specific, I'd appreciate a review from
>>>>>> someone more familiar with the branch edge code than myself.  There
>>>>>> appear to be a number of tricky issues with more complex edges so I'd
>>>>>> like a second opinion on that code in case I've missed an important
>>>>>> case.
>>>>>>
>>>>>> R.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Richard Earnshaw (7):
>>>>>>   Add __builtin_speculation_safe_value
>>>>>>   Arm - add speculation_barrier pattern
>>>>>>   AArch64 - add speculation barrier
>>>>>>   AArch64 - Add new option -mtrack-speculation
>>>>>>   AArch64 - disable CB[N]Z TB[N]Z when tracking speculation
>>>>>>   AArch64 - new pass to add conditional-branch speculation tracking
>>>>>>   AArch64 - use CSDB based sequences if speculation tracking is enabled
>>>>>>
>>>>>>  gcc/builtin-types.def                     |   6 +
>>>>>>  gcc/builtins.c                            |  57 ++++
>>>>>>  gcc/builtins.def                          |  20 ++
>>>>>>  gcc/c-family/c-common.c                   | 143 +++++++++
>>>>>>  gcc/c-family/c-cppbuiltin.c               |   5 +-
>>>>>>  gcc/config.gcc                            |   2 +-
>>>>>>  gcc/config/aarch64/aarch64-passes.def     |   1 +
>>>>>>  gcc/config/aarch64/aarch64-protos.h       |   3 +-
>>>>>>  gcc/config/aarch64/aarch64-speculation.cc | 494 
>>>>>> ++++++++++++++++++++++++++++++
>>>>>>  gcc/config/aarch64/aarch64.c              |  88 +++++-
>>>>>>  gcc/config/aarch64/aarch64.md             | 140 ++++++++-
>>>>>>  gcc/config/aarch64/aarch64.opt            |   4 +
>>>>>>  gcc/config/aarch64/iterators.md           |   3 +
>>>>>>  gcc/config/aarch64/t-aarch64              |  10 +
>>>>>>  gcc/config/arm/arm.md                     |  21 ++
>>>>>>  gcc/config/arm/unspecs.md                 |   1 +
>>>>>>  gcc/doc/cpp.texi                          |   4 +
>>>>>>  gcc/doc/extend.texi                       |  29 ++
>>>>>>  gcc/doc/invoke.texi                       |  10 +-
>>>>>>  gcc/doc/md.texi                           |  15 +
>>>>>>  gcc/doc/tm.texi                           |  20 ++
>>>>>>  gcc/doc/tm.texi.in                        |   2 +
>>>>>>  gcc/target.def                            |  23 ++
>>>>>>  gcc/targhooks.c                           |  27 ++
>>>>>>  gcc/targhooks.h                           |   2 +
>>>>>>  gcc/testsuite/gcc.dg/spec-barrier-1.c     |  40 +++
>>>>>>  gcc/testsuite/gcc.dg/spec-barrier-2.c     |  19 ++
>>>>>>  gcc/testsuite/gcc.dg/spec-barrier-3.c     |  13 +
>>>>>>  28 files changed, 1191 insertions(+), 11 deletions(-)
>>>>>>  create mode 100644 gcc/config/aarch64/aarch64-speculation.cc
>>>>>>  create mode 100644 gcc/testsuite/gcc.dg/spec-barrier-1.c
>>>>>>  create mode 100644 gcc/testsuite/gcc.dg/spec-barrier-2.c
>>>>>>  create mode 100644 gcc/testsuite/gcc.dg/spec-barrier-3.c
>>>>
>>

Reply via email to