Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-11 Thread Richard Earnshaw (lists)
On 11/07/18 21:46, Jeff Law wrote:
> On 07/10/2018 10:43 AM, Richard Earnshaw (lists) wrote:
>> On 10/07/18 16:42, Jeff Law wrote:
>>> On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote:
>>>> On 10/07/18 00:13, Jeff Law wrote:
>>>>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>>>>>
>>>>>> To address all of the above, these patches adopt a new approach, based
>>>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>>>> but which we have extended to deal with inter-function speculation.
>>>>>> The patches divide the problem into two halves.
>>>>> We're essentially turning the control dependency into a value that we
>>>>> can then use to munge the pointer or the resultant data.
>>>>>
>>>>>>
>>>>>> The first half is some target-specific code to track the speculation
>>>>>> condition through the generated code to provide an internal variable
>>>>>> which can tell us whether or not the CPU's control flow speculation
>>>>>> matches the data flow calculations.  The idea is that the internal
>>>>>> variable starts with the value TRUE and if the CPU's control flow
>>>>>> speculation ever causes a jump to the wrong block of code the variable
>>>>>> becomes false until such time as the incorrect control flow
>>>>>> speculation gets unwound.
>>>>> Right.
>>>>>
>>>>> So one of the things that comes immediately to mind is you have to run
>>>>> this early enough that you can still get to all the control flow and
>>>>> build your predicates.  Otherwise you have to undo stuff like
>>>>> conditional move generation.
>>>>
>>>> No, the opposite, in fact.  We want to run this very late, at least on
>>>> Arm systems (AArch64 or AArch32).  Conditional move instructions are
>>>> fine - they're data-flow operations, not control flow (in fact, that's
>>>> exactly what the control flow tracker instructions are).  By running it
>>>> late we avoid disrupting any of the earlier optimization passes as well.
>>> Ack.  I looked at the aarch64 implementation after sending my message
>>> and it clearly runs very late.
>>>
>>> I haven't convinced myself that all the work the generic parts of the
>>> compiler do to rewrite and eliminate conditionals is safe.  But even if it
>>> isn't, you're probably getting enough coverage to drastically reduce the
>>> attack surface.  I'm going to have to think about the early
>>> transformations we make and how they interact here harder.  But I think
>>> the general approach can dramatically reduce the attack surface.
>>
>> My argument here would be that we are concerned about speculation that
>> the CPU does with the generated program.  We're not particularly
>> bothered about the abstract machine description it's based upon.  As
>> long as the earlier transforms lead to a valid translation (it hasn't
>> removed a necessary bounds check) then running late is fine.
> I'm thinking about obfuscation of the bounds check or the pointer or
> turning branchy into straightline code, possibly doing some speculation
> in the process, if-conversion and the like.
> 
> For example hoist_adjacent_loads which results in speculative loads and
> likely a conditional move to select between the two loaded values.
> 
> Or what if we've done something like
> 
> if (x < maxval)
>    res = *p;
> 
> And we've turned that into
> 
> 
> t = *p;
> res = (x < maxval) ? t : res;

Hmm, interesting.  But for that to be safe, the compiler would have to
be able to prove that dereferencing p was safe even if x >= maxval,
otherwise the run-time code could fault (so if there's any chance that
it could point to something vulnerable, then there must also be a chance
that it points to unmapped memory).  Given that requirement, I don't
think this case can be a specific concern, since the requirement implies
that p must already be within some known bounds for the type of object
it points to.

R.

> 
> 
> That may be implemented as a conditional move at the RTL level, so
> protecting that may be nontrivial.
> 
> In those examples the compiler itself has introduced the speculation.
> 
> I can't find the conditional obfuscation I was looking for, so it's hard
> to rule it in or out as potentially problematic.
> 
> WRT pointer obfuscation, we no longer propagate conditional equivalences
> very aggressively, so it may be a non-issue in the end.
> 
> But again, even with these concerns I think what you're doing cuts down
> the attack surface in meaningful ways.
> 
> 
> 
>>
>> I can't currently conceive a situation where the compiler would be able
>> to remove a /necessary/ bounds check that could lead to unsafe
>> speculation later on.  A redundant bounds check removal shouldn't be a
>> problem as the non-redundant check should remain and that will still get
>> tracking code added.
> It's less about removal and more about either compiler-generated
> speculation or obfuscation of the patterns you're looking for.
> 
> 
> jeff
> 
> 
> 
> 



Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-11 Thread Jeff Law
On 07/10/2018 10:43 AM, Richard Earnshaw (lists) wrote:
> On 10/07/18 16:42, Jeff Law wrote:
>> On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote:
>>> On 10/07/18 00:13, Jeff Law wrote:
>>>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>>>>
>>>>> To address all of the above, these patches adopt a new approach, based
>>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>>> but which we have extended to deal with inter-function speculation.
>>>>> The patches divide the problem into two halves.
>>>> We're essentially turning the control dependency into a value that we
>>>> can then use to munge the pointer or the resultant data.
>>>>
>>>>>
>>>>> The first half is some target-specific code to track the speculation
>>>>> condition through the generated code to provide an internal variable
>>>>> which can tell us whether or not the CPU's control flow speculation
>>>>> matches the data flow calculations.  The idea is that the internal
>>>>> variable starts with the value TRUE and if the CPU's control flow
>>>>> speculation ever causes a jump to the wrong block of code the variable
>>>>> becomes false until such time as the incorrect control flow
>>>>> speculation gets unwound.
>>>> Right.
>>>>
>>>> So one of the things that comes immediately to mind is you have to run
>>>> this early enough that you can still get to all the control flow and
>>>> build your predicates.  Otherwise you have to undo stuff like
>>>> conditional move generation.
>>>
>>> No, the opposite, in fact.  We want to run this very late, at least on
>>> Arm systems (AArch64 or AArch32).  Conditional move instructions are
>>> fine - they're data-flow operations, not control flow (in fact, that's
>>> exactly what the control flow tracker instructions are).  By running it
>>> late we avoid disrupting any of the earlier optimization passes as well.
>> Ack.  I looked at the aarch64 implementation after sending my message
>> and it clearly runs very late.
>>
>> I haven't convinced myself that all the work the generic parts of the
>> compiler do to rewrite and eliminate conditionals is safe.  But even if it
>> isn't, you're probably getting enough coverage to drastically reduce the
>> attack surface.  I'm going to have to think about the early
>> transformations we make and how they interact here harder.  But I think
>> the general approach can dramatically reduce the attack surface.
> 
> My argument here would be that we are concerned about speculation that
> the CPU does with the generated program.  We're not particularly
> bothered about the abstract machine description it's based upon.  As
> long as the earlier transforms lead to a valid translation (it hasn't
> removed a necessary bounds check) then running late is fine.
I'm thinking about obfuscation of the bounds check or the pointer or
turning branchy into straightline code, possibly doing some speculation
in the process, if-conversion and the like.

For example hoist_adjacent_loads which results in speculative loads and
likely a conditional move to select between the two loaded values.

Or what if we've done something like

if (x < maxval)
   res = *p;

And we've turned that into


t = *p;
res = (x < maxval) ? t : res;


That may be implemented as a conditional move at the RTL level, so
protecting that may be nontrivial.
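
To make the transformation above concrete, here is a minimal sketch in
plain C of the branchy form and its if-converted counterpart.  The
function names and the self-check are invented for illustration and are
not part of the patches.  The sketch assumes p is always safe to
dereference, which is exactly the property the compiler would have to
prove before hoisting the load.

```c
#include <assert.h>

/* Branchy form: *p is only read when the bounds check passes. */
static int load_branchy(unsigned x, unsigned maxval, const int *p, int res)
{
    if (x < maxval)
        res = *p;
    return res;
}

/* If-converted form: the load is hoisted above the check and a select
   (typically a conditional move at the RTL level) picks the result.
   This is only a valid rewrite if *p can be read for every x. */
static int load_if_converted(unsigned x, unsigned maxval, const int *p, int res)
{
    int t = *p;                     /* unconditional load */
    res = (x < maxval) ? t : res;   /* data-flow select, no branch needed */
    return res;
}

/* Hypothetical self-check: both forms agree for in-range and
   out-of-range values of x. */
static void check_if_conversion(void)
{
    int value = 42;
    assert(load_branchy(1, 4, &value, -1) == load_if_converted(1, 4, &value, -1));
    assert(load_branchy(9, 4, &value, -1) == load_if_converted(9, 4, &value, -1));
}
```

In the second form there is no branch left on which a late pass could
hang tracking code, so the load is no longer guarded by control flow at
all.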

In those examples the compiler itself has introduced the speculation.

I can't find the conditional obfuscation I was looking for, so it's hard
to rule it in or out as potentially problematic.

WRT pointer obfuscation, we no longer propagate conditional equivalences
very aggressively, so it may be a non-issue in the end.

But again, even with these concerns I think what you're doing cuts down
the attack surface in meaningful ways.



> 
> I can't currently conceive a situation where the compiler would be able
> to remove a /necessary/ bounds check that could lead to unsafe
> speculation later on.  A redundant bounds check removal shouldn't be a
> problem as the non-redundant check should remain and that will still get
> tracking code added.
It's less about removal and more about either compiler-generated
speculation or obfuscation of the patterns you're looking for.


jeff






Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Earnshaw (lists)
On 10/07/18 16:42, Jeff Law wrote:
> On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote:
>> On 10/07/18 00:13, Jeff Law wrote:
>>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:

>>>> To address all of the above, these patches adopt a new approach, based
>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>> but which we have extended to deal with inter-function speculation.
>>>> The patches divide the problem into two halves.
>>> We're essentially turning the control dependency into a value that we
>>> can then use to munge the pointer or the resultant data.
>>>
>>>>
>>>> The first half is some target-specific code to track the speculation
>>>> condition through the generated code to provide an internal variable
>>>> which can tell us whether or not the CPU's control flow speculation
>>>> matches the data flow calculations.  The idea is that the internal
>>>> variable starts with the value TRUE and if the CPU's control flow
>>>> speculation ever causes a jump to the wrong block of code the variable
>>>> becomes false until such time as the incorrect control flow
>>>> speculation gets unwound.
>>> Right.
>>>
>>> So one of the things that comes immediately to mind is you have to run
>>> this early enough that you can still get to all the control flow and
>>> build your predicates.  Otherwise you have to undo stuff like
>>> conditional move generation.
>>
>> No, the opposite, in fact.  We want to run this very late, at least on
>> Arm systems (AArch64 or AArch32).  Conditional move instructions are
>> fine - they're data-flow operations, not control flow (in fact, that's
>> exactly what the control flow tracker instructions are).  By running it
>> late we avoid disrupting any of the earlier optimization passes as well.
> Ack.  I looked at the aarch64 implementation after sending my message
> and it clearly runs very late.
> 
> I haven't convinced myself that all the work the generic parts of the
> compiler do to rewrite and eliminate conditionals is safe.  But even if it
> isn't, you're probably getting enough coverage to drastically reduce the
> attack surface.  I'm going to have to think about the early
> transformations we make and how they interact here harder.  But I think
> the general approach can dramatically reduce the attack surface.

My argument here would be that we are concerned about speculation that
the CPU does with the generated program.  We're not particularly
bothered about the abstract machine description it's based upon.  As
long as the earlier transforms lead to a valid translation (it hasn't
removed a necessary bounds check) then running late is fine.

I can't currently conceive a situation where the compiler would be able
to remove a /necessary/ bounds check that could lead to unsafe
speculation later on.  A redundant bounds check removal shouldn't be a
problem as the non-redundant check should remain and that will still get
tracking code added.
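
As a rough model of what the tracking code computes, the following C
sketch follows the description earlier in the thread: the tracker starts
as TRUE (all ones), is cleared when execution is on an arm whose
condition does not actually hold, and is then used as a mask.  All names
here are hypothetical.  A real implementation emits conditional-select
instructions late in the back end rather than C like this, and an
optimizing compiler would be free to simplify this source form, so treat
it purely as an illustration of the data flow.

```c
#include <assert.h>
#include <stdint.h>

#define TRACKER_TRUE  (~(uintptr_t)0)   /* following the correct path */
#define TRACKER_FALSE ((uintptr_t)0)    /* on a mis-speculated path */

/* Update performed at the start of a branch arm.  'arm_condition' is the
   condition that must hold for this arm to be architecturally correct.
   On real hardware this would be a conditional select, so it is a
   data-flow operation and not another branch. */
static uintptr_t tracker_update(uintptr_t tracker, int arm_condition)
{
    return arm_condition ? tracker : TRACKER_FALSE;
}

/* Model the CPU entering the "in bounds" arm of `if (idx < limit)`,
   whether or not that prediction was right, and return the index a
   dependent load would then use. */
static uintptr_t index_seen_in_taken_arm(uintptr_t idx, uintptr_t limit)
{
    uintptr_t tracker = TRACKER_TRUE;               /* starts as TRUE */
    tracker = tracker_update(tracker, idx < limit); /* re-evaluate condition */
    return idx & tracker;                           /* mask with the tracker */
}

/* Hypothetical self-check of the model. */
static void check_tracker_model(void)
{
    assert(index_seen_in_taken_arm(3, 8) == 3);     /* correct path: unchanged */
    assert(index_seen_in_taken_arm(100, 8) == 0);   /* wrong path: forced to 0 */
}
```

Because the update depends only on the branch condition present in the
generated code, it does not matter which earlier transforms produced
that branch.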

> 
> With running very late, as you noted, the big concern is edge
> insertions.  I'm going to have to re-familiarize myself with all the
> rules there :-)  I did note you stumbled on some of the issues in that
> space (what to do with calls that throw exceptions).
> 
> Placement before the final bbro pass probably avoids a lot of pain.  So
> the basic placement seems reasonable.  And again, if we're missing
> something due to the effects of earlier passes, I still think you're
> reducing the attack surface in a meaningful way.
> 
> 
> 
>>
>>>
>>> On the flip side, the earlier you do this mitigation, the more you have
>>> to worry about what the optimizers are going to do to the code later in
>>> the pipeline.  It's almost guaranteed a naive implementation is going to
>>> muck this up since we can propagate the state of the condition into the
>>> arms which will make the predicate state a compile time constant.
>>>
>>> In fact this seems to be running into the area of pointer provenance and
>>> some discussions we had around atomic a few years back.
>>>
>>> I also wonder if this could be combined with taint analysis to produce a
>>> much lower overhead solution in cases where developers have done analysis
>>> and know what objects are potentially under attacker control.  So
>>> instead of analyzing everything, we can have a much narrower focus.
>>
>> Automatic application of the tracker to vulnerable variables would be
>> nice, but I haven't attempted to go there yet: at present I still rely
>> on the user to annotate code with the new intrinsic.
> ACK.  My sense is we are going to want taint analysis.  I think it'd be
> useful here and in other contexts.  However, I don't think it
> necessarily needs to be a requirement to go forward.
> 
> I'm going to review the atomic discussion we had a while back with the
> kernel folks as well as some pointer provenance discussions I've had
> with Martin S.  I can't put my finger on it yet, but I still 

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Jeff Law
On 07/10/2018 04:53 AM, Richard Earnshaw (lists) wrote:
> On 10/07/18 11:10, Richard Biener wrote:
>> On Tue, Jul 10, 2018 at 10:39 AM Richard Earnshaw (lists)
>>  wrote:
>>>
>>> On 10/07/18 08:19, Richard Biener wrote:
>>>> On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw
>>>> wrote:
>>>>>
>>>>>
>>>>> The patches I posted earlier this year for mitigating against
>>>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>>>>> which it became obvious that a rethink was needed.  This mail, and the
>>>>> following patches attempt to address that feedback and present a new
>>>>> approach to mitigating against this form of attack surface.
>>>>>
>>>>> There were two major issues with the original approach:
>>>>>
>>>>> - The speculation bounds were too tightly constrained - essentially
>>>>>   they had to represent an upper and lower bound on a pointer, or a
>>>>>   pointer offset.
>>>>> - The speculation constraints could only cover the immediately preceding
>>>>>   branch, which often did not fit well with the structure of the existing
>>>>>   code.
>>>>>
>>>>> An additional criticism was that the shape of the intrinsic did not
>>>>> fit particularly well with systems that used a single speculation
>>>>> barrier that essentially had to wait until all preceding speculation
>>>>> had been resolved.
>>>>>
>>>>> To address all of the above, these patches adopt a new approach, based
>>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>>> but which we have extended to deal with inter-function speculation.
>>>>> The patches divide the problem into two halves.
>>>>>
>>>>> The first half is some target-specific code to track the speculation
>>>>> condition through the generated code to provide an internal variable
>>>>> which can tell us whether or not the CPU's control flow speculation
>>>>> matches the data flow calculations.  The idea is that the internal
>>>>> variable starts with the value TRUE and if the CPU's control flow
>>>>> speculation ever causes a jump to the wrong block of code the variable
>>>>> becomes false until such time as the incorrect control flow
>>>>> speculation gets unwound.
>>>>>
>>>>> The second half is that a new intrinsic function is introduced that is
>>>>> much simpler than we had before.  The basic version of the intrinsic
>>>>> is now simply:
>>>>>
>>>>>   T var = __builtin_speculation_safe_value (T unsafe_var);
>>>>>
>>>>> Full details of the syntax can be found in the documentation patch, in
>>>>> patch 1.  In summary, when not speculating the intrinsic returns
>>>>> unsafe_var; when speculating then if it can be shown that the
>>>>> speculative flow has diverged from the intended control flow then zero
>>>>> is returned.  An optional second argument can be used to return an
>>>>> alternative value to zero.  The builtin may cause execution to pause
>>>>> until the speculation state is resolved.
>>>>
>>>> So a trivial target implementation would be to emit a barrier and then
>>>> it would always return unsafe_var and never zero.  What I don't understand
>>>> fully is what users should do here, thus what the value of ever returning
>>>> "unsafe" is.  Also I wonder why the API is forcing you to single-out a
>>>> special value instead of doing
>>>>
>>>>   bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
>>>>   if (!safe)
>>>>     /* what now? */
>>>>
>>>> I'm only guessing that the correct way to handle "unsafe" is basically
>>>>
>>>>   while (__builtin_speculation_safe_value (val) == 0)
>>>>     ;
>>>>
>>>> use val, it's now safe
>>>
>>> No, a safe version of val is returned, not a bool telling you it is now
>>> safe to use the original.
>>
>> OK, so making the old value dead is required to preserve the desired
>> dataflow.
>>
>> But how should I use the special value that signaled "failure"?
>>
>> Obviously the user isn't supposed to simply replace 'val' with
>>
>>  val = __builtin_speculation_safe_value (val);
>>
>> to make it speculation-proof.  So - how should the user _use_ this
>> builtin?  The docs do not say anything about this but says the
>> very confusing
>>
>> +The function may use target-dependent speculation tracking state to cause
>> +@var{failval} to be returned when it is known that speculative
>> +execution has incorrectly predicted a conditional branch operation.
>>
>> because speculation is about executing instructions as if they were
>> supposed to be executed.  Once it is known the prediction was wrong
>> no more "wrong" instructions will be executed but a previously
>> speculated instruction cannot know it was "falsely" speculated.
>>
>> Does the above try to say that the function may return failval if the
>> instruction is currently executed speculatively instead?  That would
>> make sense to me.  And return failval independent of if the speculation
>> later turns out to be correct or not.
>>
>>>  You must use the sanitized 

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Jeff Law
On 07/10/2018 08:14 AM, Richard Earnshaw (lists) wrote:
> On 10/07/18 14:48, Bill Schmidt wrote:
>>
>>> On Jul 10, 2018, at 3:49 AM, Richard Earnshaw (lists) 
>>>  wrote:
>>>
>>> On 10/07/18 00:13, Jeff Law wrote:
>>>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>>>>
>>>>> The patches I posted earlier this year for mitigating against
>>>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>>>>> which it became obvious that a rethink was needed.  This mail, and the
>>>>> following patches attempt to address that feedback and present a new
>>>>> approach to mitigating against this form of attack surface.
>>>>>
>>>>> There were two major issues with the original approach:
>>>>>
>>>>> - The speculation bounds were too tightly constrained - essentially
>>>>>  they had to represent an upper and lower bound on a pointer, or a
>>>>>  pointer offset.
>>>>> - The speculation constraints could only cover the immediately preceding
>>>>>  branch, which often did not fit well with the structure of the existing
>>>>>  code.
>>>>>
>>>>> An additional criticism was that the shape of the intrinsic did not
>>>>> fit particularly well with systems that used a single speculation
>>>>> barrier that essentially had to wait until all preceding speculation
>>>>> had been resolved.
>>>> Right.  I suggest the Intel and IBM reps chime in on the updated semantics.
>>>>
>>>
>>> Yes, logically, this is a boolean tracker value.  In practice we use ~0
>>> for true and 0 for false, so that we can simply use it as a mask
>>> operation later.
>>>
>>> I hope this intrinsic will be even more acceptable than the one that
>>> Bill Schmidt acked previously, it's even simpler than the version we had
>>> last time.
>>
>> Yes, I think this looks quite good.  Thanks!
>>
>> Thanks also for digging into the speculation tracking algorithm.  This
>> has good potential as a conservative opt-in approach.  The obvious
>> concern is whether performance will be acceptable even for apps
>> that really want the protection.
>>
>> We took a look at Chandler's WIP LLVM patch and ran some SPEC2006 
>> numbers on a Skylake box.  We saw geomean degradations of about
>> 42% (int) and 33% (fp).  (This was just one test, so caveat emptor.)
>> This isn't terrible given the number of potential false positives and the
>> early state of the algorithm, but it's still a lot from a customer 
>> perspective.
>> I'll be interested in whether your interprocedural improvements are
>> able to reduce the conservatism a bit.
>>
> 
> So I don't have any numbers for SPEC2006.  I have some initial numbers
> for SPEC2000 when just adding the tracking code (so not applying the
> second part of the mitigation).  In that case INT2000 is down by ~13%
> and FP2000 was by comparison almost in the noise (~2.4%).
> 
> Applying the tracker value to all memory loads would push those numbers
> up significantly, I suspect.  That's part of the reason for preferring
> the intrinsic rather than automatic mitigation: the intrinsic is much
> more targeted.
Right.  Fully automatic without any "hints" is going to be very
expensive, possibly prohibitively expensive.

Using the intrinsic or exploiting some kind of taint analysis has the
potential to drastically reduce the overhead.  At least it seems like
they should :-)

jeff


Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Jeff Law
On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote:
> On 10/07/18 00:13, Jeff Law wrote:
>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>>
>>> To address all of the above, these patches adopt a new approach, based
>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>> but which we have extended to deal with inter-function speculation.
>>> The patches divide the problem into two halves.
>> We're essentially turning the control dependency into a value that we
>> can then use to munge the pointer or the resultant data.
>>
>>>
>>> The first half is some target-specific code to track the speculation
>>> condition through the generated code to provide an internal variable
>>> which can tell us whether or not the CPU's control flow speculation
>>> matches the data flow calculations.  The idea is that the internal
>>> variable starts with the value TRUE and if the CPU's control flow
>>> speculation ever causes a jump to the wrong block of code the variable
>>> becomes false until such time as the incorrect control flow
>>> speculation gets unwound.
>> Right.
>>
>> So one of the things that comes immediately to mind is you have to run
>> this early enough that you can still get to all the control flow and
>> build your predicates.  Otherwise you have to undo stuff like
>> conditional move generation.
> 
> No, the opposite, in fact.  We want to run this very late, at least on
> Arm systems (AArch64 or AArch32).  Conditional move instructions are
> fine - they're data-flow operations, not control flow (in fact, that's
> exactly what the control flow tracker instructions are).  By running it
> late we avoid disrupting any of the earlier optimization passes as well.
Ack.  I looked at the aarch64 implementation after sending my message
and it clearly runs very late.

I haven't convinced myself that all the work the generic parts of the
compiler do to rewrite and eliminate conditionals is safe.  But even if it
isn't, you're probably getting enough coverage to drastically reduce the
attack surface.  I'm going to have to think about the early
transformations we make and how they interact here harder.  But I think
the general approach can dramatically reduce the attack surface.

With running very late, as you noted, the big concern is edge
insertions.  I'm going to have to re-familiarize myself with all the
rules there :-)  I did note you stumbled on some of the issues in that
space (what to do with calls that throw exceptions).

Placement before the final bbro pass probably avoids a lot of pain.  So
the basic placement seems reasonable.  And again, if we're missing
something due to the effects of earlier passes, I still think you're
reducing the attack surface in a meaningful way.



> 
>>
>> On the flip side, the earlier you do this mitigation, the more you have
>> to worry about what the optimizers are going to do to the code later in
>> the pipeline.  It's almost guaranteed a naive implementation is going to
>> muck this up since we can propagate the state of the condition into the
>> arms which will make the predicate state a compile time constant.
>>
>> In fact this seems to be running into the area of pointer provenance and
>> some discussions we had around atomic a few years back.
>>
>> I also wonder if this could be combined with taint analysis to produce a
>> much lower overhead solution in cases where developers have done analysis
>> and know what objects are potentially under attacker control.  So
>> instead of analyzing everything, we can have a much narrower focus.
> 
> Automatic application of the tracker to vulnerable variables would be
> nice, but I haven't attempted to go there yet: at present I still rely
> on the user to annotate code with the new intrinsic.
ACK.  My sense is we are going to want taint analysis.  I think it'd be
useful here and in other contexts.  However, I don't think it
necessarily needs to be a requirement to go forward.

I'm going to review the atomic discussion we had a while back with the
kernel folks as well as some pointer provenance discussions I've had
with Martin S.  I can't put my finger on it yet, but I still have the
sense there's some interactions here we want to at least be aware of.

> 
> That doesn't mean that we couldn't extend the overall approach later to
> include automatic tracking.
Absolutely.

> 
>>
>> The pointer munging could well run afoul of alias analysis engines that
>> don't expect to be seeing those kind of operations.
> 
> I think the pass runs late enough that it isn't a problem.
Yea, I think you're right.


> 
>>
>> Anyway, just some initial high level thoughts.  I'm sure there'll be
>> more as I read the implementation.
>>
> 
> Thanks for starting to look at this so quickly.
NP.  Your timing to come back to this is good.

Jeff


Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Earnshaw (lists)
On 10/07/18 14:48, Bill Schmidt wrote:
> 
>> On Jul 10, 2018, at 3:49 AM, Richard Earnshaw (lists) 
>>  wrote:
>>
>> On 10/07/18 00:13, Jeff Law wrote:
>>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:

>>>> The patches I posted earlier this year for mitigating against
>>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>>>> which it became obvious that a rethink was needed.  This mail, and the
>>>> following patches attempt to address that feedback and present a new
>>>> approach to mitigating against this form of attack surface.
>>>>
>>>> There were two major issues with the original approach:
>>>>
>>>> - The speculation bounds were too tightly constrained - essentially
>>>>  they had to represent an upper and lower bound on a pointer, or a
>>>>  pointer offset.
>>>> - The speculation constraints could only cover the immediately preceding
>>>>  branch, which often did not fit well with the structure of the existing
>>>>  code.
>>>>
>>>> An additional criticism was that the shape of the intrinsic did not
>>>> fit particularly well with systems that used a single speculation
>>>> barrier that essentially had to wait until all preceding speculation
>>>> had been resolved.
>>> Right.  I suggest the Intel and IBM reps chime in on the updated semantics.
>>>
>>
>> Yes, logically, this is a boolean tracker value.  In practice we use ~0
>> for true and 0 for false, so that we can simply use it as a mask
>> operation later.
>>
>> I hope this intrinsic will be even more acceptable than the one that
>> Bill Schmidt acked previously, it's even simpler than the version we had
>> last time.
> 
> Yes, I think this looks quite good.  Thanks!
> 
> Thanks also for digging into the speculation tracking algorithm.  This
> has good potential as a conservative opt-in approach.  The obvious
> concern is whether performance will be acceptable even for apps
> that really want the protection.
> 
> We took a look at Chandler's WIP LLVM patch and ran some SPEC2006 
> numbers on a Skylake box.  We saw geomean degradations of about
> 42% (int) and 33% (fp).  (This was just one test, so caveat emptor.)
> This isn't terrible given the number of potential false positives and the
> early state of the algorithm, but it's still a lot from a customer 
> perspective.
> I'll be interested in whether your interprocedural improvements are
> able to reduce the conservatism a bit.
> 

So I don't have any numbers for SPEC2006.  I have some initial numbers
for SPEC2000 when just adding the tracking code (so not applying the
second part of the mitigation).  In that case INT2000 is down by ~13%
and FP2000 was by comparison almost in the noise (~2.4%).

Applying the tracker value to all memory loads would push those numbers
up significantly, I suspect.  That's part of the reason for preferring
the intrinsic rather than automatic mitigation: the intrinsic is much
more targeted.
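
A minimal sketch of the targeted use, assuming a GCC that provides
__builtin_speculation_safe_value in the form described in the cover
letter and defines the feature-test macro __HAVE_SPECULATION_SAFE_VALUE.
The table, its size, lookup() and the self-check are hypothetical.
Without the builtin the macro below falls back to the plain value, which
gives no protection at all.

```c
#include <assert.h>

#ifdef __HAVE_SPECULATION_SAFE_VALUE
#define SAFE_VALUE(v) __builtin_speculation_safe_value(v)
#else
/* Fallback for compilers without the builtin: no mitigation is applied. */
#define SAFE_VALUE(v) (v)
#endif

#define TABLE_SIZE 16u
static int table[TABLE_SIZE];

/* Only the one index that may be attacker controlled is sanitized.  The
   bounds check itself stays as ordinary code. */
static int lookup(unsigned untrusted_index)
{
    if (untrusted_index < TABLE_SIZE)
        return table[SAFE_VALUE(untrusted_index)];
    return -1;
}

/* Hypothetical self-check: architectural behaviour is unchanged. */
static void check_lookup(void)
{
    table[3] = 7;
    assert(lookup(3) == 7);
    assert(lookup(100) == -1);
}
```

Only this one load pays for the mitigation, instead of every load in the
program.  Per the cover letter, an optional second argument can supply a
value other than zero to be returned on a mis-speculated path.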

R.


> Thanks,
> Bill
>>

>>>> To address all of the above, these patches adopt a new approach, based
>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>> but which we have extended to deal with inter-function speculation.
>>>> The patches divide the problem into two halves.
>>> We're essentially turning the control dependency into a value that we
>>> can then use to munge the pointer or the resultant data.
>>>
>>>>
>>>> The first half is some target-specific code to track the speculation
>>>> condition through the generated code to provide an internal variable
>>>> which can tell us whether or not the CPU's control flow speculation
>>>> matches the data flow calculations.  The idea is that the internal
>>>> variable starts with the value TRUE and if the CPU's control flow
>>>> speculation ever causes a jump to the wrong block of code the variable
>>>> becomes false until such time as the incorrect control flow
>>>> speculation gets unwound.
>>> Right.
>>>
>>> So one of the things that comes immediately to mind is you have to run
>>> this early enough that you can still get to all the control flow and
>>> build your predicates.  Otherwise you have to undo stuff like
>>> conditional move generation.
>>
>> No, the opposite, in fact.  We want to run this very late, at least on
>> Arm systems (AArch64 or AArch32).  Conditional move instructions are
>> fine - they're data-flow operations, not control flow (in fact, that's
>> exactly what the control flow tracker instructions are).  By running it
>> late we avoid disrupting any of the earlier optimization passes as well.
>>
>>>
>>> On the flip side, the earlier you do this mitigation, the more you have
>>> to worry about what the optimizers are going to do to the code later in
>>> the pipeline.  It's almost guaranteed a naive implementation is going to
>>> muck this up since we can propagate the state of the condition into the
>>> arms which will make the predicate state a compile time constant.
>>>

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Bill Schmidt


> On Jul 10, 2018, at 3:49 AM, Richard Earnshaw (lists) 
>  wrote:
> 
> On 10/07/18 00:13, Jeff Law wrote:
>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>> 
>>> The patches I posted earlier this year for mitigating against
>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>>> which it became obvious that a rethink was needed.  This mail, and the
>>> following patches attempt to address that feedback and present a new
>>> approach to mitigating against this form of attack surface.
>>> 
>>> There were two major issues with the original approach:
>>> 
>>> - The speculation bounds were too tightly constrained - essentially
>>>  they had to represent an upper and lower bound on a pointer, or a
>>>  pointer offset.
>>> - The speculation constraints could only cover the immediately preceding
>>>  branch, which often did not fit well with the structure of the existing
>>>  code.
>>>
>>> An additional criticism was that the shape of the intrinsic did not
>>> fit particularly well with systems that used a single speculation
>>> barrier that essentially had to wait until all preceding speculation
>>> had been resolved.
>> Right.  I suggest the Intel and IBM reps chime in on the updated semantics.
>> 
> 
> Yes, logically, this is a boolean tracker value.  In practice we use ~0
> for true and 0 for false, so that we can simply use it as a mask
> operation later.
> 
> I hope this intrinsic will be even more acceptable than the one that
> Bill Schmidt acked previously; it's even simpler than the version we had
> last time.

Yes, I think this looks quite good.  Thanks!

Thanks also for digging into the speculation tracking algorithm.  This
has good potential as a conservative opt-in approach.  The obvious
concern is whether performance will be acceptable even for apps
that really want the protection.

We took a look at Chandler's WIP LLVM patch and ran some SPEC2006 
numbers on a Skylake box.  We saw geomean degradations of about
42% (int) and 33% (fp).  (This was just one test, so caveat emptor.)
This isn't terrible given the number of potential false positives and the
early state of the algorithm, but it's still a lot from a customer perspective.
I'll be interested in whether your interprocedural improvements are
able to reduce the conservatism a bit.

Thanks,
Bill
> 
>>> 
>>> To address all of the above, these patches adopt a new approach, based
>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>> but which we have extended to deal with inter-function speculation.
>>> The patches divide the problem into two halves.
>> We're essentially turning the control dependency into a value that we
>> can then use to munge the pointer or the resultant data.
>> 
>>> 
>>> The first half is some target-specific code to track the speculation
>>> condition through the generated code to provide an internal variable
>>> which can tell us whether or not the CPU's control flow speculation
>>> matches the data flow calculations.  The idea is that the internal
>>> variable starts with the value TRUE and if the CPU's control flow
>>> speculation ever causes a jump to the wrong block of code the variable
>>> becomes false until such time as the incorrect control flow
>>> speculation gets unwound.
>> Right.
>> 
>> So one of the things that comes immediately to mind is you have to run
>> this early enough that you can still get to all the control flow and
>> build your predicates.  Otherwise you have to undo stuff like
>> conditional move generation.
> 
> No, the opposite, in fact.  We want to run this very late, at least on
> Arm systems (AArch64 or AArch32).  Conditional move instructions are
> fine - they're data-flow operations, not control flow (in fact, that's
> exactly what the control flow tracker instructions are).  By running it
> late we avoid disrupting any of the earlier optimization passes as well.
> 
>> 
>> On the flip side, the earlier you do this mitigation, the more you have
>> to worry about what the optimizers are going to do to the code later in
>> the pipeline.  It's almost guaranteed a naive implementation is going to
>> muck this up since we can propagate the state of the condition into the
>> arms which will make the predicate state a compile time constant.
>> 
>> In fact this seems to be running into the area of pointer provenance and
>> some discussions we had around atomics a few years back.
>>
>> I also wonder if this could be combined with taint analysis to produce a
>> much lower overhead solution in cases where developers have done analysis
>> and know what objects are potentially under attacker control.  So
>> instead of analyzing everything, we can have a much narrower focus.
> 
> Automatic application of the tracker to vulnerable variables would be
> nice, but I haven't attempted to go there yet: at present I still rely
> on the user to annotate code with the new intrinsic.
> 
> 

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Earnshaw (lists)
On 10/07/18 12:21, Richard Biener wrote:
> On Tue, Jul 10, 2018 at 12:53 PM Richard Earnshaw (lists)
>  wrote:
>>
>> On 10/07/18 11:10, Richard Biener wrote:
>>> On Tue, Jul 10, 2018 at 10:39 AM Richard Earnshaw (lists)
>>>  wrote:
>>>>
>>>> On 10/07/18 08:19, Richard Biener wrote:
>>>>> On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw
>>>>>  wrote:
>>>>>>
>>>>>>
>>>>>> The patches I posted earlier this year for mitigating against
>>>>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>>>>>> which it became obvious that a rethink was needed.  This mail, and the
>>>>>> following patches attempt to address that feedback and present a new
>>>>>> approach to mitigating against this form of attack surface.
>>>>>>
>>>>>> There were two major issues with the original approach:
>>>>>>
>>>>>> - The speculation bounds were too tightly constrained - essentially
>>>>>>   they had to represent an upper and lower bound on a pointer, or a
>>>>>>   pointer offset.
>>>>>> - The speculation constraints could only cover the immediately preceding
>>>>>>   branch, which often did not fit well with the structure of the existing
>>>>>>   code.
>>>>>>
>>>>>> An additional criticism was that the shape of the intrinsic did not
>>>>>> fit particularly well with systems that used a single speculation
>>>>>> barrier that essentially had to wait until all preceding speculation
>>>>>> had been resolved.
>>>>>>
>>>>>> To address all of the above, these patches adopt a new approach, based
>>>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>>>> but which we have extended to deal with inter-function speculation.
>>>>>> The patches divide the problem into two halves.
>>>>>>
>>>>>> The first half is some target-specific code to track the speculation
>>>>>> condition through the generated code to provide an internal variable
>>>>>> which can tell us whether or not the CPU's control flow speculation
>>>>>> matches the data flow calculations.  The idea is that the internal
>>>>>> variable starts with the value TRUE and if the CPU's control flow
>>>>>> speculation ever causes a jump to the wrong block of code the variable
>>>>>> becomes false until such time as the incorrect control flow
>>>>>> speculation gets unwound.
>>>>>>
>>>>>> The second half is that a new intrinsic function is introduced that is
>>>>>> much simpler than we had before.  The basic version of the intrinsic
>>>>>> is now simply:
>>>>>>
>>>>>>   T var = __builtin_speculation_safe_value (T unsafe_var);
>>>>>>
>>>>>> Full details of the syntax can be found in the documentation patch, in
>>>>>> patch 1.  In summary, when not speculating the intrinsic returns
>>>>>> unsafe_var; when speculating then if it can be shown that the
>>>>>> speculative flow has diverged from the intended control flow then zero
>>>>>> is returned.  An optional second argument can be used to return an
>>>>>> alternative value to zero.  The builtin may cause execution to pause
>>>>>> until the speculation state is resolved.
>>>>>
>>>>> So a trivial target implementation would be to emit a barrier and then
>>>>> it would always return unsafe_var and never zero.  What I don't understand
>>>>> fully is what users should do here, thus what the value of ever returning
>>>>> "unsafe" is.  Also I wonder why the API is forcing you to single-out a
>>>>> special value instead of doing
>>>>>
>>>>>  bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
>>>>>  if (!safe)
>>>>>    /* what now? */
>>>>>
>>>>> I'm only guessing that the correct way to handle "unsafe" is basically
>>>>>
>>>>>  while (__builtin_speculation_safe_value (val) == 0)
>>>>>    ;
>>>>>
>>>>> use val, it's now safe
>>>>
>>>> No, a safe version of val is returned, not a bool telling you it is now
>>>> safe to use the original.
>>>
>>> OK, so making the old value dead is required to preserve the desired
>>> dataflow.
>>>
>>> But how should I use the special value that signaled "failure"?
>>>
>>> Obviously the user isn't supposed to simply replace 'val' with
>>>
>>>  val = __builtin_speculation_safe_value (val);
>>>
>>> to make it speculation-proof.  So - how should the user _use_ this
>>> builtin?  The docs do not say anything about this but says the
>>> very confusing
>>>
>>> +The function may use target-dependent speculation tracking state to cause
>>> +@var{failval} to be returned when it is known that speculative
>>> +execution has incorrectly predicted a conditional branch operation.
>>>
>>> because speculation is about executing instructions as if they were
>>> supposed to be executed.  Once it is known the prediction was wrong
>>> no more "wrong" instructions will be executed but a previously
>>> speculated instruction cannot know it was "falsely" speculated.
>>>
>>> Does the above try to say that the function may return failval if the
>>> instruction is currently executed speculatively instead?  That 

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Biener
On Tue, Jul 10, 2018 at 12:53 PM Richard Earnshaw (lists)
 wrote:
>
> On 10/07/18 11:10, Richard Biener wrote:
> > On Tue, Jul 10, 2018 at 10:39 AM Richard Earnshaw (lists)
> >  wrote:
> >>
> >> On 10/07/18 08:19, Richard Biener wrote:
> >>> On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw
> >>>  wrote:
> >>>>
> >>>>
> >>>> The patches I posted earlier this year for mitigating against
> >>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
> >>>> which it became obvious that a rethink was needed.  This mail, and the
> >>>> following patches attempt to address that feedback and present a new
> >>>> approach to mitigating against this form of attack surface.
> >>>>
> >>>> There were two major issues with the original approach:
> >>>>
> >>>> - The speculation bounds were too tightly constrained - essentially
> >>>>   they had to represent an upper and lower bound on a pointer, or a
> >>>>   pointer offset.
> >>>> - The speculation constraints could only cover the immediately preceding
> >>>>   branch, which often did not fit well with the structure of the existing
> >>>>   code.
> >>>>
> >>>> An additional criticism was that the shape of the intrinsic did not
> >>>> fit particularly well with systems that used a single speculation
> >>>> barrier that essentially had to wait until all preceding speculation
> >>>> had been resolved.
> >>>>
> >>>> To address all of the above, these patches adopt a new approach, based
> >>>> in part on a posting by Chandler Carruth to the LLVM developers list
> >>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
> >>>> but which we have extended to deal with inter-function speculation.
> >>>> The patches divide the problem into two halves.
> >>>>
> >>>> The first half is some target-specific code to track the speculation
> >>>> condition through the generated code to provide an internal variable
> >>>> which can tell us whether or not the CPU's control flow speculation
> >>>> matches the data flow calculations.  The idea is that the internal
> >>>> variable starts with the value TRUE and if the CPU's control flow
> >>>> speculation ever causes a jump to the wrong block of code the variable
> >>>> becomes false until such time as the incorrect control flow
> >>>> speculation gets unwound.
> >>>>
> >>>> The second half is that a new intrinsic function is introduced that is
> >>>> much simpler than we had before.  The basic version of the intrinsic
> >>>> is now simply:
> >>>>
> >>>>   T var = __builtin_speculation_safe_value (T unsafe_var);
> >>>>
> >>>> Full details of the syntax can be found in the documentation patch, in
> >>>> patch 1.  In summary, when not speculating the intrinsic returns
> >>>> unsafe_var; when speculating then if it can be shown that the
> >>>> speculative flow has diverged from the intended control flow then zero
> >>>> is returned.  An optional second argument can be used to return an
> >>>> alternative value to zero.  The builtin may cause execution to pause
> >>>> until the speculation state is resolved.
> >>>
> >>> So a trivial target implementation would be to emit a barrier and then
> >>> it would always return unsafe_var and never zero.  What I don't understand
> >>> fully is what users should do here, thus what the value of ever returning
> >>> "unsafe" is.  Also I wonder why the API is forcing you to single-out a
> >>> special value instead of doing
> >>>
> >>>  bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
> >>>  if (!safe)
> >>>    /* what now? */
> >>>
> >>> I'm only guessing that the correct way to handle "unsafe" is basically
> >>>
> >>>  while (__builtin_speculation_safe_value (val) == 0)
> >>>    ;
> >>>
> >>> use val, it's now safe
> >>
> >> No, a safe version of val is returned, not a bool telling you it is now
> >> safe to use the original.
> >
> > OK, so making the old value dead is required to preserve the desired
> > dataflow.
> >
> > But how should I use the special value that signaled "failure"?
> >
> > Obviously the user isn't supposed to simply replace 'val' with
> >
> >  val = __builtin_speculation_safe_value (val);
> >
> > to make it speculation-proof.  So - how should the user _use_ this
> > builtin?  The docs do not say anything about this but says the
> > very confusing
> >
> > +The function may use target-dependent speculation tracking state to cause
> > +@var{failval} to be returned when it is known that speculative
> > +execution has incorrectly predicted a conditional branch operation.
> >
> > because speculation is about executing instructions as if they were
> > supposed to be executed.  Once it is known the prediction was wrong
> > no more "wrong" instructions will be executed but a previously
> > speculated instruction cannot know it was "falsely" speculated.
> >
> > Does the above try to say that the function may return failval if the
> > instruction is currently executed speculatively instead?  That would
> > make sense to me.  And return failval 

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Earnshaw (lists)
On 10/07/18 11:10, Richard Biener wrote:
> On Tue, Jul 10, 2018 at 10:39 AM Richard Earnshaw (lists)
>  wrote:
>>
>> On 10/07/18 08:19, Richard Biener wrote:
>>> On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw
>>>  wrote:
>>>>
>>>>
>>>> The patches I posted earlier this year for mitigating against
>>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>>>> which it became obvious that a rethink was needed.  This mail, and the
>>>> following patches attempt to address that feedback and present a new
>>>> approach to mitigating against this form of attack surface.
>>>>
>>>> There were two major issues with the original approach:
>>>>
>>>> - The speculation bounds were too tightly constrained - essentially
>>>>   they had to represent an upper and lower bound on a pointer, or a
>>>>   pointer offset.
>>>> - The speculation constraints could only cover the immediately preceding
>>>>   branch, which often did not fit well with the structure of the existing
>>>>   code.
>>>>
>>>> An additional criticism was that the shape of the intrinsic did not
>>>> fit particularly well with systems that used a single speculation
>>>> barrier that essentially had to wait until all preceding speculation
>>>> had been resolved.
>>>>
>>>> To address all of the above, these patches adopt a new approach, based
>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>> but which we have extended to deal with inter-function speculation.
>>>> The patches divide the problem into two halves.
>>>>
>>>> The first half is some target-specific code to track the speculation
>>>> condition through the generated code to provide an internal variable
>>>> which can tell us whether or not the CPU's control flow speculation
>>>> matches the data flow calculations.  The idea is that the internal
>>>> variable starts with the value TRUE and if the CPU's control flow
>>>> speculation ever causes a jump to the wrong block of code the variable
>>>> becomes false until such time as the incorrect control flow
>>>> speculation gets unwound.
>>>>
>>>> The second half is that a new intrinsic function is introduced that is
>>>> much simpler than we had before.  The basic version of the intrinsic
>>>> is now simply:
>>>>
>>>>   T var = __builtin_speculation_safe_value (T unsafe_var);
>>>>
>>>> Full details of the syntax can be found in the documentation patch, in
>>>> patch 1.  In summary, when not speculating the intrinsic returns
>>>> unsafe_var; when speculating then if it can be shown that the
>>>> speculative flow has diverged from the intended control flow then zero
>>>> is returned.  An optional second argument can be used to return an
>>>> alternative value to zero.  The builtin may cause execution to pause
>>>> until the speculation state is resolved.
>>>
>>> So a trivial target implementation would be to emit a barrier and then
>>> it would always return unsafe_var and never zero.  What I don't understand
>>> fully is what users should do here, thus what the value of ever returning
>>> "unsafe" is.  Also I wonder why the API is forcing you to single-out a
>>> special value instead of doing
>>>
>>>  bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
>>>  if (!safe)
>>>    /* what now? */
>>>
>>> I'm only guessing that the correct way to handle "unsafe" is basically
>>>
>>>  while (__builtin_speculation_safe_value (val) == 0)
>>>    ;
>>>
>>> use val, it's now safe
>>
>> No, a safe version of val is returned, not a bool telling you it is now
>> safe to use the original.
> 
> OK, so making the old value dead is required to preserve the desired
> dataflow.
> 
> But how should I use the special value that signaled "failure"?
> 
> Obviously the user isn't supposed to simply replace 'val' with
> 
>  val = __builtin_speculation_safe_value (val);
> 
> to make it speculation-proof.  So - how should the user _use_ this
> builtin?  The docs do not say anything about this but says the
> very confusing
> 
> +The function may use target-dependent speculation tracking state to cause
> +@var{failval} to be returned when it is known that speculative
> +execution has incorrectly predicted a conditional branch operation.
> 
> because speculation is about executing instructions as if they were
> supposed to be executed.  Once it is known the prediction was wrong
> no more "wrong" instructions will be executed but a previously
> speculated instruction cannot know it was "falsely" speculated.
> 
> Does the above try to say that the function may return failval if the
> instruction is currently executed speculatively instead?  That would
> make sense to me.  And return failval independent of if the speculation
> later turns out to be correct or not.
> 
>>  You must use the sanitized version in future,
>> not the unprotected version.
>>
>>
>> So the usage is going to be more like:
>>
>> val = __builtin_speculation_safe_value (val);  // 

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Biener
On Tue, Jul 10, 2018 at 10:39 AM Richard Earnshaw (lists)
 wrote:
>
> On 10/07/18 08:19, Richard Biener wrote:
> > On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw
> >  wrote:
> >>
> >>
> >> The patches I posted earlier this year for mitigating against
> >> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
> >> which it became obvious that a rethink was needed.  This mail, and the
> >> following patches attempt to address that feedback and present a new
> >> approach to mitigating against this form of attack surface.
> >>
> >> There were two major issues with the original approach:
> >>
> >> - The speculation bounds were too tightly constrained - essentially
> >>   they had to represent an upper and lower bound on a pointer, or a
> >>   pointer offset.
> >> - The speculation constraints could only cover the immediately preceding
> >>   branch, which often did not fit well with the structure of the existing
> >>   code.
> >>
> >> An additional criticism was that the shape of the intrinsic did not
> >> fit particularly well with systems that used a single speculation
> >> barrier that essentially had to wait until all preceding speculation
> >> had been resolved.
> >>
> >> To address all of the above, these patches adopt a new approach, based
> >> in part on a posting by Chandler Carruth to the LLVM developers list
> >> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
> >> but which we have extended to deal with inter-function speculation.
> >> The patches divide the problem into two halves.
> >>
> >> The first half is some target-specific code to track the speculation
> >> condition through the generated code to provide an internal variable
> >> which can tell us whether or not the CPU's control flow speculation
> >> matches the data flow calculations.  The idea is that the internal
> >> variable starts with the value TRUE and if the CPU's control flow
> >> speculation ever causes a jump to the wrong block of code the variable
> >> becomes false until such time as the incorrect control flow
> >> speculation gets unwound.
> >>
> >> The second half is that a new intrinsic function is introduced that is
> >> much simpler than we had before.  The basic version of the intrinsic
> >> is now simply:
> >>
> >>   T var = __builtin_speculation_safe_value (T unsafe_var);
> >>
> >> Full details of the syntax can be found in the documentation patch, in
> >> patch 1.  In summary, when not speculating the intrinsic returns
> >> unsafe_var; when speculating then if it can be shown that the
> >> speculative flow has diverged from the intended control flow then zero
> >> is returned.  An optional second argument can be used to return an
> >> alternative value to zero.  The builtin may cause execution to pause
> >> until the speculation state is resolved.
> >
> > So a trivial target implementation would be to emit a barrier and then
> > it would always return unsafe_var and never zero.  What I don't understand
> > fully is what users should do here, thus what the value of ever returning
> > "unsafe" is.  Also I wonder why the API is forcing you to single-out a
> > special value instead of doing
> >
> >  bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
> >  if (!safe)
> >    /* what now? */
> >
> > I'm only guessing that the correct way to handle "unsafe" is basically
> >
> >  while (__builtin_speculation_safe_value (val) == 0)
> >    ;
> >
> > use val, it's now safe
>
> No, a safe version of val is returned, not a bool telling you it is now
> safe to use the original.

OK, so making the old value dead is required to preserve the desired
dataflow.

But how should I use the special value that signaled "failure"?

Obviously the user isn't supposed to simply replace 'val' with

 val = __builtin_speculation_safe_value (val);

to make it speculation-proof.  So - how should the user _use_ this
builtin?  The docs do not say anything about this but says the
very confusing

+The function may use target-dependent speculation tracking state to cause
+@var{failval} to be returned when it is known that speculative
+execution has incorrectly predicted a conditional branch operation.

because speculation is about executing instructions as if they were
supposed to be executed.  Once it is known the prediction was wrong
no more "wrong" instructions will be executed but a previously
speculated instruction cannot know it was "falsely" speculated.

Does the above try to say that the function may return failval if the
instruction is currently executed speculatively instead?  That would
make sense to me.  And return failval independent of if the speculation
later turns out to be correct or not.

>  You must use the sanitized version in future,
> not the unprotected version.
>
>
> So the usage is going to be more like:
>
> // Overwrite val with a sanitized version.
> val = __builtin_speculation_safe_value (val);
>
> You have to use the cleaned up version, the unclean version is still
> 

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Earnshaw (lists)
On 10/07/18 00:13, Jeff Law wrote:
> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>
>> The patches I posted earlier this year for mitigating against
>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>> which it became obvious that a rethink was needed.  This mail, and the
>> following patches attempt to address that feedback and present a new
>> approach to mitigating against this form of attack surface.
>>
>> There were two major issues with the original approach:
>>
>> - The speculation bounds were too tightly constrained - essentially
>>   they had to represent an upper and lower bound on a pointer, or a
>>   pointer offset.
>> - The speculation constraints could only cover the immediately preceding
>>   branch, which often did not fit well with the structure of the existing
>>   code.
>>
>> An additional criticism was that the shape of the intrinsic did not
>> fit particularly well with systems that used a single speculation
>> barrier that essentially had to wait until all preceding speculation
>> had been resolved.
> Right.  I suggest the Intel and IBM reps chime in on the updated semantics.
> 

Yes, logically, this is a boolean tracker value.  In practice we use ~0
for true and 0 for false, so that we can simply use it as a mask
operation later.
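
The mask convention can be illustrated with a small sketch.  It is a
software model only, not code from the patches: the names TRACKER_TRUE,
TRACKER_FALSE, mask_with_tracker and select_with_tracker are hypothetical,
and the real tracker is maintained by compiler-generated code rather than
held in a C variable.

```c
#include <stdint.h>

/* Hypothetical model of the tracker encoding described above:
   all-ones for "true" (speculation matches the real control flow),
   zero for "false" (mis-speculated).  */
#define TRACKER_TRUE  (~(uintptr_t) 0)
#define TRACKER_FALSE ((uintptr_t) 0)

/* ANDing with the tracker keeps VALUE when the tracker is true and
   forces it to zero when the tracker is false, without any branch.  */
static inline uintptr_t
mask_with_tracker (uintptr_t value, uintptr_t tracker)
{
  return value & tracker;
}

/* Variant for a non-zero failure value: a branchless select between
   VALUE (tracker true) and FAILVAL (tracker false).  */
static inline uintptr_t
select_with_tracker (uintptr_t value, uintptr_t failval, uintptr_t tracker)
{
  return (value & tracker) | (failval & ~tracker);
}
```

Because both helpers are pure data-flow operations, a mis-predicted branch
cannot skip them in the way it could skip an explicit "if" check.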

I hope this intrinsic will be even more acceptable than the one that
Bill Schmidt acked previously; it's even simpler than the version we had
last time.

>>
>> To address all of the above, these patches adopt a new approach, based
>> in part on a posting by Chandler Carruth to the LLVM developers list
>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>> but which we have extended to deal with inter-function speculation.
>> The patches divide the problem into two halves.
> We're essentially turning the control dependency into a value that we
> can then use to munge the pointer or the resultant data.
> 
>>
>> The first half is some target-specific code to track the speculation
>> condition through the generated code to provide an internal variable
>> which can tell us whether or not the CPU's control flow speculation
>> matches the data flow calculations.  The idea is that the internal
>> variable starts with the value TRUE and if the CPU's control flow
>> speculation ever causes a jump to the wrong block of code the variable
>> becomes false until such time as the incorrect control flow
>> speculation gets unwound.
> Right.
> 
> So one of the things that comes immediately to mind is you have to run
> this early enough that you can still get to all the control flow and
> build your predicates.  Otherwise you have to undo stuff like
> conditional move generation.

No, the opposite, in fact.  We want to run this very late, at least on
Arm systems (AArch64 or AArch32).  Conditional move instructions are
fine - they're data-flow operations, not control flow (in fact, that's
exactly what the control flow tracker instructions are).  By running it
late we avoid disrupting any of the earlier optimization passes as well.
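
As a rough illustration of why the tracker update is a data-flow
operation, here is a C-level model.  It is only a sketch of the semantics:
track_branch and guarded_load are hypothetical names, and an optimizing
compiler would be entitled to fold the repeated comparison inside the
"if" to a constant, which is one reason the real tracking has to be
inserted late and at the instruction level rather than written like this.

```c
#include <stdint.h>

/* Hypothetical model of the update done at the start of a block that
   should only be reached when a condition holds.  COND_HOLDS is
   recomputed from the data, so it reflects the real condition even if
   the branch leading here was mis-predicted.  */
static inline uintptr_t
track_branch (uintptr_t tracker, int cond_holds)
{
  /* 1 becomes all-ones and 0 stays zero: a select comparable to a
     conditional move, with no control flow involved.  */
  uintptr_t keep = -(uintptr_t) (cond_holds != 0);
  return tracker & keep;
}

/* Example: a bounds-checked load where the index is masked by the
   tracker before being used as an address.  */
static inline int
guarded_load (const int *table, uintptr_t size, uintptr_t idx,
	      uintptr_t tracker)
{
  if (idx < size)
    {
      tracker = track_branch (tracker, idx < size);
      idx &= tracker;	/* Becomes 0 if the tracker is false.  */
      return table[idx];
    }
  return -1;
}
```

Passing a zero tracker to guarded_load models a caller that was itself
reached by mis-speculation: the index is forced to 0 regardless of its
original value.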

> 
> On the flip side, the earlier you do this mitigation, the more you have
> to worry about what the optimizers are going to do to the code later in
> the pipeline.  It's almost guaranteed a naive implementation is going to
> muck this up since we can propagate the state of the condition into the
> arms which will make the predicate state a compile time constant.
> 
> In fact this seems to be running into the area of pointer provenance and
> some discussions we had around atomics a few years back.
>
> I also wonder if this could be combined with taint analysis to produce a
> much lower overhead solution in cases where developers have done analysis
> and know what objects are potentially under attacker control.  So
> instead of analyzing everything, we can have a much narrower focus.

Automatic application of the tracker to vulnerable variables would be
nice, but I haven't attempted to go there yet: at present I still rely
on the user to annotate code with the new intrinsic.

That doesn't mean that we couldn't extend the overall approach later to
include automatic tracking.

> 
> The pointer munging could well run afoul of alias analysis engines that
> don't expect to be seeing those kind of operations.

I think the pass runs late enough that it isn't a problem.

> 
> Anyway, just some initial high level thoughts.  I'm sure there'll be
> more as I read the implementation.
> 

Thanks for starting to look at this so quickly.

R.

> 
> Jeff
> 



Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Earnshaw (lists)
On 10/07/18 08:19, Richard Biener wrote:
> On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw
>  wrote:
>>
>>
>> The patches I posted earlier this year for mitigating against
>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>> which it became obvious that a rethink was needed.  This mail, and the
>> following patches attempt to address that feedback and present a new
>> approach to mitigating against this form of attack surface.
>>
>> There were two major issues with the original approach:
>>
>> - The speculation bounds were too tightly constrained - essentially
>>   they had to represent an upper and lower bound on a pointer, or a
>>   pointer offset.
>> - The speculation constraints could only cover the immediately preceding
>>   branch, which often did not fit well with the structure of the existing
>>   code.
>>
>> An additional criticism was that the shape of the intrinsic did not
>> fit particularly well with systems that used a single speculation
>> barrier that essentially had to wait until all preceding speculation
>> had been resolved.
>>
>> To address all of the above, these patches adopt a new approach, based
>> in part on a posting by Chandler Carruth to the LLVM developers list
>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>> but which we have extended to deal with inter-function speculation.
>> The patches divide the problem into two halves.
>>
>> The first half is some target-specific code to track the speculation
>> condition through the generated code to provide an internal variable
>> which can tell us whether or not the CPU's control flow speculation
>> matches the data flow calculations.  The idea is that the internal
>> variable starts with the value TRUE and if the CPU's control flow
>> speculation ever causes a jump to the wrong block of code the variable
>> becomes false until such time as the incorrect control flow
>> speculation gets unwound.
>>
>> The second half is that a new intrinsic function is introduced that is
>> much simpler than we had before.  The basic version of the intrinsic
>> is now simply:
>>
>>   T var = __builtin_speculation_safe_value (T unsafe_var);
>>
>> Full details of the syntax can be found in the documentation patch, in
>> patch 1.  In summary, when not speculating the intrinsic returns
>> unsafe_var; when speculating then if it can be shown that the
>> speculative flow has diverged from the intended control flow then zero
>> is returned.  An optional second argument can be used to return an
>> alternative value to zero.  The builtin may cause execution to pause
>> until the speculation state is resolved.
> 
> So a trivial target implementation would be to emit a barrier and then
> it would always return unsafe_var and never zero.  What I don't understand
> fully is what users should do here, thus what the value of ever returning
> "unsafe" is.  Also I wonder why the API is forcing you to single-out a
> special value instead of doing
> 
>  bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
>  if (!safe)
>    /* what now? */
> 
> I'm only guessing that the correct way to handle "unsafe" is basically
> 
>  while (__builtin_speculation_safe_value (val) == 0)
>    ;
> 
> use val, it's now safe

No, a safe version of val is returned, not a bool telling you it is now
safe to use the original.  You must use the sanitized version in future,
not the unprotected version.


So the usage is going to be more like:

  // Overwrite val with a sanitized version.
  val = __builtin_speculation_safe_value (val);

You have to use the cleaned-up version; the unclean version is still
vulnerable to incorrect speculation.
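
As a minimal sketch of that usage, assuming a bounds-checked table lookup:
the table, TABLE_SIZE, SANITIZE and load_checked are names made up for
illustration, and the __HAVE_SPECULATION_SAFE_VALUE guard is the feature
macro as I understand it from the GCC documentation, so treat it as an
assumption rather than something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC defines __HAVE_SPECULATION_SAFE_VALUE when the builtin
   is available.  Without it we fall back to the identity, which keeps the
   code compiling but provides no protection.  */
#ifdef __HAVE_SPECULATION_SAFE_VALUE
# define SANITIZE(v) __builtin_speculation_safe_value (v)
#else
# define SANITIZE(v) (v)
#endif

#define TABLE_SIZE 4
static const int table[TABLE_SIZE] = { 10, 20, 30, 40 };  /* hypothetical data */

/* Return table[idx], or -1 if idx is out of range.  */
int
load_checked (size_t idx)
{
  if (idx < TABLE_SIZE)
    {
      /* Overwrite idx with the sanitized copy and use only that copy for
         the access; the original value is still unsafe under speculation.  */
      idx = SANITIZE (idx);
      /* Architecturally the value is unchanged, so the bound still holds.  */
      assert (idx < TABLE_SIZE);
      return table[idx];
    }
  return -1;
}
```

The point is only that the result of the builtin replaces the checked
value; nothing is looped on or compared against zero.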

R.

> 
> that is, the return value is only interesting insofar as it tells you
> whether it is equal to val or the special value?
> 
> That said, I wonder why we don't hide that detail from the user and
> provide a predicate instead.
> 
> Richard.
> 
>> There are seven patches in this set, as follows.
>>
>> 1) Introduces the new intrinsic __builtin_speculation_safe_value.
>> 2) Adds a basic hard barrier implementation for AArch32 (arm) state.
>> 3) Adds a basic hard barrier implementation for AArch64 state.
>> 4) Adds a new command-line option -mtrack-speculation (currently a no-op).
>> 5) Disables CB[N]Z and TB[N]Z when -mtrack-speculation.
>> 6) Adds the new speculation tracking pass for AArch64
>> 7) Uses the new speculation tracking pass to generate CSDB-based barrier
>>    sequences
>>
>> I haven't added a speculation-tracking pass for AArch32 at this time.
>> It is possible to do this, but would require quite a lot of rework for
>> the arm backend due to the limited number of registers that are
>> available.
>>
>> Although patch 6 is AArch64 specific, I'd appreciate a review from
>> someone more familiar with the branch edge code than myself.  There
>> appear to be a number of tricky issues with more complex edges so I'd
>> like a second opinion on that code in case I've missed an important
>> 

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-10 Thread Richard Biener
On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw wrote:
>
>
> The patches I posted earlier this year for mitigating against
> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
> which it became obvious that a rethink was needed.  This mail, and the
> following patches attempt to address that feedback and present a new
> approach to mitigating against this form of attack surface.
>
> There were two major issues with the original approach:
>
> - The speculation bounds were too tightly constrained - essentially
>   they had to represent an upper and lower bound on a pointer, or a
>   pointer offset.
> - The speculation constraints could only cover the immediately preceding
>   branch, which often did not fit well with the structure of the existing
>   code.
>
> An additional criticism was that the shape of the intrinsic did not
> fit particularly well with systems that used a single speculation
> barrier that essentially had to wait until all preceding speculation
> had been resolved.
>
> To address all of the above, these patches adopt a new approach, based
> in part on a posting by Chandler Carruth to the LLVM developers list
> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
> but which we have extended to deal with inter-function speculation.
> The patches divide the problem into two halves.
>
> The first half is some target-specific code to track the speculation
> condition through the generated code to provide an internal variable
> which can tell us whether or not the CPU's control flow speculation
> matches the data flow calculations.  The idea is that the internal
> variable starts with the value TRUE and if the CPU's control flow
> speculation ever causes a jump to the wrong block of code the variable
> becomes false until such time as the incorrect control flow
> speculation gets unwound.
>
> The second half is that a new intrinsic function is introduced that is
> much simpler than we had before.  The basic version of the intrinsic
> is now simply:
>
>   T var = __builtin_speculation_safe_value (T unsafe_var);
>
> Full details of the syntax can be found in the documentation patch, in
> patch 1.  In summary, when not speculating the intrinsic returns
> unsafe_var; when speculating then if it can be shown that the
> speculative flow has diverged from the intended control flow then zero
> is returned.  An optional second argument can be used to return an
> alternative value to zero.  The builtin may cause execution to pause
> until the speculation state is resolved.

So a trivial target implementation would be to emit a barrier and then
it would always return unsafe_var and never zero.  What I don't understand
fully is what users should do here, thus what the value of ever returning
"unsafe" is.  Also I wonder why the API is forcing you to single-out a
special value instead of doing

 bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
 if (!safe)
   /* what now? */

I'm only guessing that the correct way to handle "unsafe" is basically

 while (__builtin_speculation_safe_value (val) == 0)
   ;

use val, it's now safe

that is, the return value is only interesting insofar as it tells you
whether it is equal to val or the special value?

That said, I wonder why we don't hide that detail from the user and
provide a predicate instead.

Richard.

> There are seven patches in this set, as follows.
>
> 1) Introduces the new intrinsic __builtin_speculation_safe_value.
> 2) Adds a basic hard barrier implementation for AArch32 (arm) state.
> 3) Adds a basic hard barrier implementation for AArch64 state.
> 4) Adds a new command-line option -mtrack-speculation (currently a no-op).
> 5) Disables CB[N]Z and TB[N]Z when -mtrack-speculation.
> 6) Adds the new speculation tracking pass for AArch64
> 7) Uses the new speculation tracking pass to generate CSDB-based barrier
>    sequences
>
> I haven't added a speculation-tracking pass for AArch32 at this time.
> It is possible to do this, but would require quite a lot of rework for
> the arm backend due to the limited number of registers that are
> available.
>
> Although patch 6 is AArch64 specific, I'd appreciate a review from
> someone more familiar with the branch edge code than myself.  There
> appear to be a number of tricky issues with more complex edges so I'd
> like a second opinion on that code in case I've missed an important
> case.
>
> R.
>
>
>
> Richard Earnshaw (7):
>   Add __builtin_speculation_safe_value
>   Arm - add speculation_barrier pattern
>   AArch64 - add speculation barrier
>   AArch64 - Add new option -mtrack-speculation
>   AArch64 - disable CB[N]Z TB[N]Z when tracking speculation
>   AArch64 - new pass to add conditional-branch speculation tracking
>   AArch64 - use CSDB based sequences if speculation tracking is enabled
>
>  gcc/builtin-types.def |   6 +
>  gcc/builtins.c        |  57
>  gcc/builtins.def      |  20 ++
>  

Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-09 Thread Jeff Law
On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
> 
> The patches I posted earlier this year for mitigating against
> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
> which it became obvious that a rethink was needed.  This mail, and the
> following patches attempt to address that feedback and present a new
> approach to mitigating against this form of attack surface.
> 
> There were two major issues with the original approach:
> 
> - The speculation bounds were too tightly constrained - essentially
>   they had to represent an upper and lower bound on a pointer, or a
>   pointer offset.
> - The speculation constraints could only cover the immediately preceding
>   branch, which often did not fit well with the structure of the existing
>   code.
> 
> An additional criticism was that the shape of the intrinsic did not
> fit particularly well with systems that used a single speculation
> barrier that essentially had to wait until all preceding speculation
> had been resolved.
Right.  I suggest the Intel and IBM reps chime in on the updated semantics.

> 
> To address all of the above, these patches adopt a new approach, based
> in part on a posting by Chandler Carruth to the LLVM developers list
> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
> but which we have extended to deal with inter-function speculation.
> The patches divide the problem into two halves.
We're essentially turning the control dependency into a value that we
can then use to munge the pointer or the resultant data.

> 
> The first half is some target-specific code to track the speculation
> condition through the generated code to provide an internal variable
> which can tell us whether or not the CPU's control flow speculation
> matches the data flow calculations.  The idea is that the internal
> variable starts with the value TRUE and if the CPU's control flow
> speculation ever causes a jump to the wrong block of code the variable
> becomes false until such time as the incorrect control flow
> speculation gets unwound.
Right.

So one of the things that comes immediately to mind is you have to run
this early enough that you can still get to all the control flow and
build your predicates.  Otherwise you have to undo stuff like
conditional move generation.

On the flip side, the earlier you do this mitigation, the more you have
to worry about what the optimizers are going to do to the code later in
the pipeline.  It's almost guaranteed a naive implementation is going to
muck this up since we can propagate the state of the condition into the
arms which will make the predicate state a compile time constant.

In fact this seems to be running into the area of pointer provenance and
some discussions we had around atomics a few years back.

I also wonder if this could be combined with taint analysis to produce a
much lower overhead solution in cases where developers have done analysis
and know what objects are potentially under attacker control.  So
instead of analyzing everything, we can have a much narrower focus.

The pointer munging could well run afoul of alias analysis engines that
don't expect to be seeing those kinds of operations.

Anyway, just some initial high level thoughts.  I'm sure there'll be
more as I read the implementation.


Jeff


[PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-09 Thread Richard Earnshaw

The patches I posted earlier this year for mitigating against
CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
which it became obvious that a rethink was needed.  This mail, and the
following patches attempt to address that feedback and present a new
approach to mitigating against this form of attack surface.

There were two major issues with the original approach:

- The speculation bounds were too tightly constrained - essentially
  they had to represent an upper and lower bound on a pointer, or a
  pointer offset.
- The speculation constraints could only cover the immediately preceding
  branch, which often did not fit well with the structure of the existing
  code.

An additional criticism was that the shape of the intrinsic did not
fit particularly well with systems that used a single speculation
barrier that essentially had to wait until all preceding speculation
had been resolved.

To address all of the above, these patches adopt a new approach, based
in part on a posting by Chandler Carruth to the LLVM developers list
(https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
but which we have extended to deal with inter-function speculation.
The patches divide the problem into two halves.

The first half is some target-specific code to track the speculation
condition through the generated code to provide an internal variable
which can tell us whether or not the CPU's control flow speculation
matches the data flow calculations.  The idea is that the internal
variable starts with the value TRUE and if the CPU's control flow
speculation ever causes a jump to the wrong block of code the variable
becomes false until such time as the incorrect control flow
speculation gets unwound.
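
To make the idea concrete, here is a rough software model of such a
tracking variable.  It is only a simulation: in the patches the tracking
is done by the backend on the generated code, and a source-level version
like this would simply be folded away by the optimizers.  The names
spec_tracker, track_edge and model_safe_value are hypothetical and do not
come from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All-ones means "speculation matches the data flow" (TRUE); zero means
   "we are on a mis-speculated path" (FALSE).  */
typedef uintptr_t spec_tracker;
#define TRACKER_TRUE  (~(spec_tracker) 0)
#define TRACKER_FALSE ((spec_tracker) 0)

/* Model of the update placed on a branch edge.  COND is the value the
   branch condition really has; PATH_TAKEN is the direction the CPU
   followed.  The tracker survives only when the two agree, and once it
   is FALSE it stays FALSE along that path.  */
static spec_tracker
track_edge (spec_tracker t, bool cond, bool path_taken)
{
  return (cond == path_taken) ? t : TRACKER_FALSE;
}

/* Model of consuming the tracker as data: a pure mask-and-select with no
   branch, yielding VALUE when the tracker is TRUE and FAILVAL otherwise.  */
static uintptr_t
model_safe_value (uintptr_t value, uintptr_t failval, spec_tracker t)
{
  assert (t == TRACKER_TRUE || t == TRACKER_FALSE);
  return (value & t) | (failval & ~t);
}
```

For example, track_edge (TRACKER_TRUE, false, true) models a mispredicted
branch and yields TRACKER_FALSE, after which model_safe_value returns the
fail value instead of the unsafe one.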

The second half is that a new intrinsic function is introduced that is
much simpler than we had before.  The basic version of the intrinsic
is now simply:

  T var = __builtin_speculation_safe_value (T unsafe_var);

Full details of the syntax can be found in the documentation patch, in
patch 1.  In summary, when not speculating the intrinsic returns
unsafe_var; when speculating then if it can be shown that the
speculative flow has diverged from the intended control flow then zero
is returned.  An optional second argument can be used to return an
alternative value to zero.  The builtin may cause execution to pause
until the speculation state is resolved.
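
A hedged sketch of the two-argument form, for a case where index zero is
itself sensitive and so is a poor fallback.  The keys table, the DECOY
slot, SANITIZE_OR and read_key are invented for illustration, and the
__HAVE_SPECULATION_SAFE_VALUE guard is assumed to be the feature macro
that accompanies the builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: __HAVE_SPECULATION_SAFE_VALUE is defined when the builtin
   exists.  The fallback evaluates both arguments but gives no protection.  */
#ifdef __HAVE_SPECULATION_SAFE_VALUE
# define SANITIZE_OR(v, failval) __builtin_speculation_safe_value (v, failval)
#else
# define SANITIZE_OR(v, failval) ((void) (failval), (v))
#endif

#define NUM_KEYS 3
#define DECOY NUM_KEYS   /* extra slot holding harmless data */
static const unsigned char keys[NUM_KEYS + 1] = { 0x11, 0x22, 0x33, 0x00 };

/* Return keys[slot], or -1 if slot is out of range.  */
int
read_key (size_t slot)
{
  if (slot >= NUM_KEYS)
    return -1;
  /* On a mis-speculated path the builtin yields DECOY rather than zero,
     so the speculative load touches the harmless slot, not keys[0].  */
  slot = SANITIZE_OR (slot, (size_t) DECOY);
  /* Architecturally slot is unchanged and still in range.  */
  assert (slot < NUM_KEYS);
  return keys[slot];
}
```

When not speculating the result is identical to a plain bounds-checked
read; the second argument only matters on the incorrect path.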

There are seven patches in this set, as follows.

1) Introduces the new intrinsic __builtin_speculation_safe_value.
2) Adds a basic hard barrier implementation for AArch32 (arm) state.
3) Adds a basic hard barrier implementation for AArch64 state.
4) Adds a new command-line option -mtrack-speculation (currently a no-op).
5) Disables CB[N]Z and TB[N]Z when -mtrack-speculation.
6) Adds the new speculation tracking pass for AArch64
7) Uses the new speculation tracking pass to generate CSDB-based barrier
   sequences

I haven't added a speculation-tracking pass for AArch32 at this time.
It is possible to do this, but would require quite a lot of rework for
the arm backend due to the limited number of registers that are
available.

Although patch 6 is AArch64 specific, I'd appreciate a review from
someone more familiar with the branch edge code than myself.  There
appear to be a number of tricky issues with more complex edges so I'd
like a second opinion on that code in case I've missed an important
case.

R.

  

Richard Earnshaw (7):
  Add __builtin_speculation_safe_value
  Arm - add speculation_barrier pattern
  AArch64 - add speculation barrier
  AArch64 - Add new option -mtrack-speculation
  AArch64 - disable CB[N]Z TB[N]Z when tracking speculation
  AArch64 - new pass to add conditional-branch speculation tracking
  AArch64 - use CSDB based sequences if speculation tracking is enabled

 gcc/builtin-types.def                     |   6 +
 gcc/builtins.c                            |  57
 gcc/builtins.def                          |  20 ++
 gcc/c-family/c-common.c                   | 143 +
 gcc/c-family/c-cppbuiltin.c               |   5 +-
 gcc/config.gcc                            |   2 +-
 gcc/config/aarch64/aarch64-passes.def     |   1 +
 gcc/config/aarch64/aarch64-protos.h       |   3 +-
 gcc/config/aarch64/aarch64-speculation.cc | 494 ++
 gcc/config/aarch64/aarch64.c              |  88 +-
 gcc/config/aarch64/aarch64.md             | 140 -
 gcc/config/aarch64/aarch64.opt            |   4 +
 gcc/config/aarch64/iterators.md           |   3 +
 gcc/config/aarch64/t-aarch64              |  10 +
 gcc/config/arm/arm.md                     |  21 ++
 gcc/config/arm/unspecs.md                 |   1 +
 gcc/doc/cpp.texi                          |   4 +
 gcc/doc/extend.texi                       |  29 ++
 gcc/doc/invoke.texi                       |  10 +-
 gcc/doc/md.texi                           |  15 +
 gcc/doc/tm.texi                           |  20 ++
 gcc/doc/tm.texi.in