On Fri, Jan 5, 2018 at 11:40 AM, Richard Earnshaw (lists) <richard.earns...@arm.com> wrote: > On 05/01/18 09:44, Richard Biener wrote: >> On Thu, Jan 4, 2018 at 2:58 PM, Richard Earnshaw >> <richard.earns...@arm.com> wrote: >>> >>> Recently, Google Project Zero disclosed several classes of attack >>> against speculative execution. One of these, known as variant-1 >>> (CVE-2017-5753), allows explicit bounds checks to be bypassed under >>> speculation, providing an arbitrary read gadget. Further details can >>> be found on the GPZ blog [1] and the documentation that is included >>> with the first patch. >>> >>> This patch set adds a new builtin function for GCC to provide a >>> mechanism for limiting speculation by a CPU after a bounds-checked >>> memory access. I've tried to design this in such a way that it can be >>> used for any target where this might be necessary. The patch set >>> provides a generic implementation of the builtin and then >>> target-specific support for Arm and AArch64. Other architectures can >>> utilize the internal infrastructure as needed. >>> >>> Most of the details of the builtin and the hooks that need to be >>> implemented to support it are described in the updates to the manual, >>> but a short summary is given below. >>> >>> TYP __builtin_load_no_speculate >>> (const volatile TYP *ptr, >>> const volatile void *lower, >>> const volatile void *upper, >>> TYP failval, >>> const volatile void *cmpptr) >>> >>> Where TYP can be any integral type (signed or unsigned char, int, >>> short, long, etc) or any pointer type. >>> >>> The builtin implements the following logical behaviour: >>> >>> inline TYP __builtin_load_no_speculate >>> (const volatile TYP *ptr, >>> const volatile void *lower, >>> const volatile void *upper, >>> TYP failval, >>> const volatile void *cmpptr) >>> { >>> TYP result; >>> >>> if (cmpptr >= lower && cmpptr < upper) >>> result = *ptr; >>> else >>> result = failval; >>> return result; >>> } >>> >>> in addition the specification of the builtin ensures that future >>> speculation using *ptr may only continue iff cmpptr lies within the >>> bounds specified. >> >> I fail to see how the vulnerability doesn't affect aggregate copies >> or bitfield accesses. The vulnerability doesn't cause the loaded >> value to leak but (part of) the address by populating the CPU cache >> side-channel. >> > > It's not quite as simple as that. You'll need to read the white paper > on Arm's web site to get a full grasp of this (linked from > https://www.arm.com/security-update). > >> So why isn't this just >> >> T __builtin_load_no_speculate (T *); >> >> ? Why that "fallback" and why the lower/upper bounds? > > Because, for Arm, we need to use both CSEL and CSDB (AArch64) in tandem > to restrict subsequent speculation, the CSEL needs a condition and the > compiler must not be able to optimize it away based on logical > reachability. The fallback is used for the other operand of the CSEL > instruction.
But the condition could be just 'true' in the instruction encoding? That is, why do you do both the jump-around and the csel? Maybe I misunderstood the RTL you appear to generate. I'd have expected an UNSPEC to avoid the compiler messing up things. >> >> Did you talk with other compiler vendors (intel, llvm, ...?)? > > As stated we'll shortly be submitting similar patches for LLVM. How do you solve the aggregate load issue? I would imagine that speculating stores (by pre-fetching the affected cache line!) could be another attack vector? Not sure if any CPU speculatively does that (but I see no good reason why a CPU shouldn't do that). Does ARM have a barrier like instruction suitable for use in the kernel patches that are floating around? Richard. > R. > >> >> Richard. >> >>> Some optimizations are permitted to make the builtin easier to use. >>> The final two arguments can both be omitted (c++ style): failval will >>> default to 0 in this case and if cmpptr is omitted ptr will be used >>> for expansions of the range check. In addition either lower or upper >>> (but not both) may be a literal NULL and the expansion will then >>> ignore that boundary condition when expanding. >>> >>> The patch set is constructed as follows: >>> 1 - generic modifications to GCC providing the builtin function for all >>> architectures and expanding to an implementation that gives the >>> logical behaviour of the builtin only. A warning is generated if >>> this expansion path is used that code will execute correctly but >>> without providing protection against speculative use. >>> 2 - AArch64 support >>> 3 - AArch32 support (arm) for A32 and thumb2 states. >>> >>> These patches can be used with the header file that Arm recently >>> published here: https://github.com/ARM-software/speculation-barrier. >>> >>> Kernel patches are also being developed, eg: >>> https://lkml.org/lkml/2018/1/3/754. The intent is that eventually >>> code like this will be able to use support directly from the compiler >>> in a portable manner. >>> >>> Similar patches are also being developed for LLVM and will be posted >>> to their development lists shortly. >>> >>> [1] More information on the topic can be found here: >>> https://googleprojectzero.blogspot.co.uk/2018/01/reading-privileged-memory-with-side.html >>> Arm specific information can be found here: >>> https://www.arm.com/security-update >>> >>> >>> >>> Richard Earnshaw (3): >>> [builtins] Generic support for __builtin_load_no_speculate() >>> [aarch64] Implement support for __builtin_load_no_speculate. >>> [arm] Implement support for the de-speculation intrinsic >>> >>> gcc/builtin-types.def | 16 +++++ >>> gcc/builtins.c | 99 +++++++++++++++++++++++++ >>> gcc/builtins.def | 22 ++++++ >>> gcc/c-family/c-common.c | 164 >>> ++++++++++++++++++++++++++++++++++++++++++ >>> gcc/c-family/c-cppbuiltin.c | 5 +- >>> gcc/config/aarch64/aarch64.c | 92 ++++++++++++++++++++++++ >>> gcc/config/aarch64/aarch64.md | 28 ++++++++ >>> gcc/config/arm/arm.c | 107 +++++++++++++++++++++++++++ >>> gcc/config/arm/arm.md | 40 ++++++++++- >>> gcc/config/arm/unspecs.md | 1 + >>> gcc/doc/cpp.texi | 4 ++ >>> gcc/doc/extend.texi | 53 ++++++++++++++ >>> gcc/doc/tm.texi | 6 ++ >>> gcc/doc/tm.texi.in | 2 + >>> gcc/target.def | 20 ++++++ >>> gcc/targhooks.c | 69 ++++++++++++++++++ >>> gcc/targhooks.h | 3 + >>> 17 files changed, 729 insertions(+), 2 deletions(-) >>> >>> >>> >