llvmbot wrote:

@llvm/pr-subscribers-clang-driver

<details>
<summary>Changes</summary>

This defines the basic set of pointer authentication clang builtins (provided 
in a new header, ptrauth.h), with diagnostics and IRGen support.  The 
availability of the builtins is gated on a new flag, `-fptrauth-intrinsics`.

Note that this only includes the basic intrinsics, and notably excludes 
`ptrauth_sign_constant`, `ptrauth_type_discriminator`, and 
`ptrauth_string_discriminator`, which need extra logic to be fully supported.

This also introduces clang/docs/PointerAuthentication.rst, which describes the 
ptrauth model in general, as well as these builtins.

(Replaces https://reviews.llvm.org/D112941)
--

Patch is 93.58 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/65996.diff

30 Files Affected:

- (modified) clang/docs/LanguageExtensions.rst (+5) 
- (added) clang/docs/PointerAuthentication.rst (+548) 
- (modified) clang/include/clang/Basic/Builtins.def (+8) 
- (modified) clang/include/clang/Basic/DiagnosticGroups.td (+1) 
- (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+16) 
- (modified) clang/include/clang/Basic/Features.def (+1) 
- (modified) clang/include/clang/Basic/LangOptions.def (+2) 
- (modified) clang/include/clang/Basic/TargetInfo.h (+6) 
- (modified) clang/include/clang/Driver/Options.td (+8) 
- (modified) clang/include/clang/Sema/Sema.h (+2) 
- (modified) clang/lib/Basic/Module.cpp (+4) 
- (modified) clang/lib/Basic/TargetInfo.cpp (+4) 
- (modified) clang/lib/Basic/Targets/AArch64.cpp (+6) 
- (modified) clang/lib/Basic/Targets/AArch64.h (+2) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+67) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+5) 
- (modified) clang/lib/Frontend/CompilerInvocation.cpp (+13) 
- (modified) clang/lib/Headers/CMakeLists.txt (+1) 
- (modified) clang/lib/Headers/module.modulemap (+5) 
- (added) clang/lib/Headers/ptrauth.h (+167) 
- (modified) clang/lib/Sema/SemaChecking.cpp (+182) 
- (added) clang/test/CodeGen/ptrauth-intrinsics.c (+73) 
- (added) clang/test/Modules/Inputs/ptrauth-include-from-darwin/module.modulemap (+8) 
- (added) clang/test/Modules/Inputs/ptrauth-include-from-darwin/ptrauth.h (+1) 
- (added) clang/test/Modules/Inputs/ptrauth-include-from-darwin/stddef.h (+1) 
- (added) clang/test/Modules/ptrauth-include-from-darwin.m (+6) 
- (added) clang/test/Preprocessor/ptrauth_feature.c (+10) 
- (added) clang/test/Sema/ptrauth-intrinsics-macro.c (+34) 
- (added) clang/test/Sema/ptrauth.c (+126) 
- (modified) llvm/docs/PointerAuth.md (+3) 


<pre>
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 11cbdca7a268fc3..49a3934d9d082fc 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -13,6 +13,7 @@ Clang Language Extensions
    BlockLanguageSpec
    Block-ABI-Apple
    AutomaticReferenceCounting
+   PointerAuthentication
    MatrixTypes
 
 Introduction
@@ -4157,6 +4158,10 @@ reordering of memory accesses and side effect instructions. Other instructions
 like simple arithmetic may be reordered around the intrinsic. If you expect to
 have no reordering at all, use inline assembly instead.
 
+Pointer Authentication
+^^^^^^^^^^^^^^^^^^^^^^
+See :doc:`PointerAuthentication`.
+
 X86/X86-64 Language Extensions
 ------------------------------
 
diff --git a/clang/docs/PointerAuthentication.rst b/clang/docs/PointerAuthentication.rst
new file mode 100644
index 000000000000000..87b8f244a2e4653
--- /dev/null
+++ b/clang/docs/PointerAuthentication.rst
@@ -0,0 +1,548 @@
+Pointer Authentication
+======================
+
+.. contents::
+   :local:
+
+Introduction
+------------
+
+Pointer authentication is a technology which offers strong probabilistic 
protection against exploiting a broad class of memory bugs to take control of 
program execution.  When adopted consistently in a language ABI, it provides a 
form of relatively fine-grained control flow integrity (CFI) check that resists 
both return-oriented programming (ROP) and jump-oriented programming (JOP) 
attacks.
+
+While pointer authentication can be implemented purely in software, direct 
hardware support (e.g. as provided by ARMv8.3) can dramatically lower the 
execution speed and code size costs.  Similarly, while pointer authentication 
can be implemented on any architecture, taking advantage of the (typically) 
excess addressing range of a target with 64-bit pointers minimizes the impact 
on memory performance and can allow interoperation with existing code (by 
disabling pointer authentication dynamically).  This document will generally 
attempt to present the pointer authentication feature independent of any 
hardware implementation or ABI.  Considerations that are 
implementation-specific are clearly identified throughout.
+
+Note that there are several different terms in use:
+
+- **Pointer authentication** is a target-independent language technology.
+
+- **ARMv8.3** is an AArch64 architecture revision that provides hardware 
support for pointer authentication.  It is implemented on several shipping 
processors, including the Apple A12 and later.
+
+- **arm64e** is a specific ABI (not yet fully stable) for implementing 
pointer authentication on ARMv8.3 on certain Apple operating systems.
+
+This document serves four purposes:
+
+- It describes the basic ideas of pointer authentication.
+
+- It documents several language extensions that are useful on targets using 
pointer authentication.
+
+- It presents a theory of operation for the security mitigation, describing 
the basic requirements for correctness, various weaknesses in the mechanism, 
and ways in which programmers can strengthen its protections (including 
recommendations for language implementors).
+
+- It will eventually document the language ABIs currently used for C, C++, 
Objective-C, and Swift on arm64e, although these are not yet stable on any 
target.
+
+Basic Concepts
+--------------
+
+The simple address of an object or function is a **raw pointer**.  A raw 
pointer can be **signed** to produce a **signed pointer**.  A signed pointer 
can then be **authenticated** in order to verify that it was **validly signed** 
and extract the original raw pointer.  These terms reflect the most likely 
implementation technique: computing and storing a cryptographic signature along 
with the pointer.  The security of pointer authentication does not rely on 
attackers not being able to separately overwrite the signature.
+
+An **abstract signing key** is a name which refers to a secret key which can 
be used to sign and authenticate pointers.  The key value for a particular name is 
consistent throughout a process.
+
+A **discriminator** is an arbitrary value used to **diversify** signed 
pointers so that one validly-signed pointer cannot simply be copied over 
another.  A discriminator is simply opaque data of some implementation-defined 
size that is included in the signature as a salt.
+
+Nearly all aspects of pointer authentication use just these two primary 
operations:
+
+- ``sign(raw_pointer, key, discriminator)`` produces a signed pointer given a 
raw pointer, an abstract signing key, and a discriminator.
+
+- ``auth(signed_pointer, key, discriminator)`` produces a raw pointer given a 
signed pointer, an abstract signing key, and a discriminator.
+
+``auth(sign(raw_pointer, key, discriminator), key, discriminator)`` must 
succeed and produce ``raw_pointer``.  ``auth`` applied to a value that was 
ultimately produced in any other way is expected to immediately halt the 
program.  However, it is permitted for ``auth`` to fail to detect that a signed 
pointer was not produced in this way, in which case it may return anything; 
this is what makes pointer authentication a probabilistic mitigation rather 
than a perfect one.
+
+There are two secondary operations which are required only to implement 
certain intrinsics in ``<ptrauth.h>``:
+
+- ``strip(signed_pointer, key)`` produces a raw pointer given a signed pointer 
and a key it was presumptively signed with.  This is useful for certain kinds 
of tooling, such as crash backtraces; it should generally not be used in the 
basic language ABI except in very careful ways.
+
+- ``sign_generic(value)`` produces a cryptographic signature for arbitrary 
data, not necessarily a pointer.  This is useful for efficiently verifying that 
non-pointer data has not been tampered with.
+
+Whenever any of these operations is called for, the key value must be known 
statically.  This is because the layout of a signed pointer may vary according 
to the signing key.  (For example, in ARMv8.3, the layout of a signed pointer 
depends on whether TBI is enabled, which can be set independently for code and 
data pointers.)
+
+.. admonition:: Note for API designers and language implementors
+
+  These are the *primitive* operations of pointer authentication, provided for 
clarity of description.  They are not suitable either as high-level interfaces 
or as primitives in a compiler IR because they expose raw pointers.  Raw 
pointers require special attention in the language implementation to avoid the 
accidental creation of exploitable code sequences; see the section on 
`Attackable code sequences`_.
+
+The following details are all implementation-defined:
+
+- the nature of a signed pointer
+- the size of a discriminator
+- the number and nature of the signing keys
+- the implementation of the ``sign``, ``auth``, ``strip``, and 
``sign_generic`` operations
+
+While the use of the terms "sign" and "signed pointer" suggests the use of a 
cryptographic signature, other implementations may be possible.  See 
`Alternative implementations`_ for an exploration of implementation options.
+
+.. admonition:: Implementation example: ARMv8.3
+
+  Readers may find it helpful to know how these terms map to ARMv8.3:
+
+  - A signed pointer is a pointer with a signature stored in the 
otherwise-unused high bits.  The kernel configures the signature width based on 
the system's addressing needs, accounting for whether the AArch64 TBI feature 
is enabled for the kind of pointer (code or data).
+
+  - A discriminator is a 64-bit integer.  Constant discriminators are 16-bit 
integers.  Blending a constant discriminator into an address consists of 
replacing the top 16 bits of the address with the constant.
+
+  - There are five 128-bit signing-key registers, each of which can only be 
directly read or set by privileged code.  Of these, four are used for signing 
pointers, and the fifth is used only for ``sign_generic``.  The key data is 
simply a pepper added to the hash, not an encryption key, and so can be 
initialized using random data.
+
+  - ``sign`` computes a cryptographic hash of the pointer, discriminator, and 
signing key, and stores it in the high bits as the signature. ``auth`` removes 
the signature, computes the same hash, and compares the result with the stored 
signature.  ``strip`` removes the signature without authenticating it.  While 
ARMv8.3's ``aut*`` instructions do not themselves trap on failure, the compiler 
only ever emits them in sequences that will trap.
+
+  - ``sign_generic`` corresponds to the ``pacga`` instruction, which takes two 
64-bit values and produces a 64-bit cryptographic hash. Implementations of this 
instruction may not produce meaningful data in all bits of the result.
+
+Discriminators
+~~~~~~~~~~~~~~
+
+A discriminator is arbitrary extra data which alters the signature on a 
pointer.  When two pointers are signed differently --- either with different 
keys or with different discriminators --- an attacker cannot simply replace one 
pointer with the other.  For more information on why discriminators are 
important and how to use them effectively, see the section on `Substitution 
attacks`_.
+
+To use standard cryptographic terminology, a discriminator acts as a salt in 
the signing of a pointer, and the key data acts as a pepper.  That is, both the 
discriminator and key data are ultimately just added as inputs to the signing 
algorithm along with the pointer, but they serve significantly different roles. 
 The key data is a common secret added to every signature, whereas the 
discriminator is a signing-specific value that can be derived from the 
circumstances of how a pointer is signed.  However, unlike a password salt, 
it's important that discriminators be *independently* derived from the 
circumstances of the signing; they should never simply be stored alongside a 
pointer.
+
+The intrinsic interface in ``<ptrauth.h>`` allows an arbitrary discriminator 
value to be provided, but can only be used when running normal code.  The 
discriminators used by language ABIs must be restricted to make it feasible for 
the loader to sign pointers stored in global memory without needing excessive 
amounts of metadata.  Under these restrictions, a discriminator may consist of 
either or both of the following:
+
+- The address at which the pointer is stored in memory.  A pointer signed with 
a discriminator which incorporates its storage address is said to have 
**address diversity**.  In general, using address diversity means that a 
pointer cannot be reliably replaced by an attacker or used to reliably replace 
a different pointer.  However, an attacker may still be able to attack a larger 
call sequence if they can alter the address through which the pointer is 
accessed.  Furthermore, some situations cannot use address diversity because of 
language or other restrictions.
+
+- A constant integer, called a **constant discriminator**. A pointer signed 
with a non-zero constant discriminator is said to have **constant diversity**.  
If the discriminator is specific to a single declaration, it is said to have 
**declaration diversity**; if the discriminator is specific to a type of value, 
it is said to have **type diversity**.  For example, C++ v-tables on arm64e 
sign their component functions using a hash of their method names and 
signatures, which provides declaration diversity; similarly, C++ member 
function pointers sign their invocation functions using a hash of the member 
pointer type, which provides type diversity.
+
+The implementation may need to restrict constant discriminators to be 
significantly smaller than the full size of a discriminator.  For example, on 
arm64e, constant discriminators are only 16-bit values.  This is believed to 
not significantly weaken the mitigation, since collisions remain uncommon.
+
+The algorithm for blending a constant discriminator with a storage address is 
implementation-defined.
+
+.. _Signing schemas:
+
+Signing schemas
+~~~~~~~~~~~~~~~
+
+Correct use of pointer authentication requires the signing code and the 
authenticating code to agree about the **signing schema** for the pointer:
+
+- the abstract signing key with which the pointer should be signed and
+- an algorithm for computing the discriminator.
+
+As described in the section above on `Discriminators`_, in most situations, 
the discriminator is produced by taking a constant discriminator and optionally 
blending it with the storage address of the pointer.  In these situations, the 
signing schema breaks down even more simply:
+
+- the abstract signing key,
+- a constant discriminator, and
+- whether to use address diversity.
+
+It is important that the signing schema be independently derived at all 
signing and authentication sites.  Preferably, the schema should be hard-coded 
everywhere it is needed, but at the very least, it must not be derived by 
inspecting information stored along with the pointer.  See the section on 
`Attacks on pointer authentication`_ for more information.
+
+Language Features
+-----------------
+
+There is currently one main pointer authentication language feature:
+
+- The language provides the ``<ptrauth.h>`` intrinsic interface for manually 
signing and authenticating pointers in code.  These can be used in 
circumstances where very specific behavior is required.
+
+
+Language extensions
+~~~~~~~~~~~~~~~~~~~
+
+Feature testing
+^^^^^^^^^^^^^^^
+
+Whether the current target uses pointer authentication can be tested for with 
a number of different tests.
+
+- ``__has_feature(ptrauth_intrinsics)`` is true if ``<ptrauth.h>`` provides 
its normal interface.  This may be true even on targets where pointer 
authentication is not enabled by default.
+
+``<ptrauth.h>``
+~~~~~~~~~~~~~~~
+
+This header defines the following types and operations:
+
+``ptrauth_key``
+^^^^^^^^^^^^^^^
+
+This ``enum`` is the type of abstract signing keys.  In addition to defining 
the set of implementation-specific signing keys (for example, ARMv8.3 defines 
``ptrauth_key_asia``), it also defines some portable aliases for those keys.  
For example, ``ptrauth_key_function_pointer`` is the key generally used for C 
function pointers, which will generally be suitable for other function-signing 
schemas.
+
+In all the operation descriptions below, key values must be constant values 
corresponding to one of the implementation-specific abstract signing keys from 
this ``enum``.
+
+``ptrauth_extra_data_t``
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+This is a ``typedef`` of a standard integer type of the correct size to hold a 
discriminator value.
+
+In the signing and authentication operation descriptions below, discriminator 
values must have either pointer type or integer type. If the discriminator is 
an integer, it will be coerced to ``ptrauth_extra_data_t``.
+
+``ptrauth_blend_discriminator``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c
+
+  ptrauth_blend_discriminator(pointer, integer)
+
+Produce a discriminator value which blends information from the given pointer 
and the given integer.
+
+Implementations may ignore some bits from each value, which is to say, the 
blending algorithm may be chosen for speed and convenience over theoretical 
strength as a hash-combining algorithm.  For example, arm64e simply overwrites 
the high 16 bits of the pointer with the low 16 bits of the integer, which can 
be done in a single instruction with an immediate integer.
+
+``pointer`` must have pointer type, and ``integer`` must have integer type. 
The result has type ``ptrauth_extra_data_t``.
+
+``ptrauth_strip``
+^^^^^^^^^^^^^^^^^
+
+.. code-block:: c
+
+  ptrauth_strip(signedPointer, key)
+
+Given that ``signedPointer`` matches the layout for signed pointers signed 
with the given key, extract the raw pointer from it.  This operation does not 
trap and cannot fail, even if the pointer is not validly signed.
+
+``ptrauth_sign_unauthenticated``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c
+
+  ptrauth_sign_unauthenticated(pointer, key, discriminator)
+
+Produce a signed pointer for the given raw pointer without applying any 
authentication or extra treatment.  This operation is not required to have the 
same behavior on a null pointer that the language implementation would.
+
+This is a treacherous operation that can easily result in `signing oracles`_.  
Programs should use it seldom and carefully.
+
+``ptrauth_auth_and_resign``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c
+
+  ptrauth_auth_and_resign(pointer, oldKey, oldDiscriminator, newKey, newDiscriminator)
+
+Authenticate that ``pointer`` is signed with ``oldKey`` and 
``oldDiscriminator`` and then resign the raw-pointer result of that 
authentication with ``newKey`` and ``newDiscriminator``.
+
+``pointer`` must have pointer type.  The result will have the same type as 
``pointer``.  This operation is not required to have the same behavior on a 
null pointer that the language implementation would.
+
+The code sequence produced for this operation must not be directly attackable. 
 However, if the discriminator values are not constant integers, their 
computations may still be attackable.  In the future, Clang should be enhanced 
to guarantee non-attackability if these expressions are 
:ref:`safely-derived<Safe derivation>`.
+
+``ptrauth_auth_data``
+^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c
+
+  ptrauth_auth_data(pointer, key, discriminator)
+
+Authenticate that ``pointer`` is signed with ``key`` and ``discriminator`` and 
remove the signature.
+
+``pointer`` must have object pointer type.  The result will have the same type 
as ``pointer``.  This operation is not required to have the same behavior on a 
null pointer that the language implementation would.
+
+In the future when Clang makes `safe derivation`_ guarantees, the result of 
this operation should be considered safely-derived.
+
+``ptrauth_sign_generic_data``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c
+
+  ptrauth_sign_generic_data(value1, value2)
+
+Computes a signature for the given pair of values, incorporating a secret 
signing key.
+
+This operation can be used to verify that arbitrary data has not been tampered 
with by computing a signature for the data, storing that signature, and then 
repeating this process and verifying that it yields the same result.  This can 
be reasonably done in any number of ways; for example, a library could compute 
an ordinary checksum of the data and just sign the result in order to get the 
tamper-resistance advantages of the secret signing key (since otherwise an 
attacker could reliably overwrite both the data and the checksum).
+
+``value1`` and ``value2`` must be either pointers or integers.  If the 
integers are larger than ``uintptr_t`` then data not representa...
<truncated>
</pre>
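
Illustrative note (not part of the patch): the documentation in the diff says that arm64e blends a discriminator by overwriting the high 16 bits of the pointer with the low 16 bits of the integer. A minimal sketch of that arithmetic on plain 64-bit integers might look like the following. The name `model_blend` is hypothetical; it does not use `<ptrauth.h>`, and it only models the described bit layout, not what `ptrauth_blend_discriminator` is guaranteed to do on any target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): models the arm64e blend that
   the documentation describes, on plain integers rather than pointers. */
static uint64_t model_blend(uint64_t address, uint64_t constant) {
  /* Keep the low 48 bits of the storage address. */
  const uint64_t low48 = UINT64_C(0x0000FFFFFFFFFFFF);
  /* Place the low 16 bits of the constant into bits 48..63. */
  return (address & low48) | ((constant & UINT64_C(0xFFFF)) << 48);
}

/* Small self-check of the model, callable from any test harness. */
static void model_blend_selfcheck(void) {
  /* The constant lands in the top 16 bits of the result. */
  assert(model_blend(UINT64_C(0x0000000100004000), 0x1234) ==
         UINT64_C(0x1234000100004000));
  /* Existing high address bits and high constant bits are both ignored. */
  assert(model_blend(UINT64_C(0xFFFF000100004000), UINT64_C(0xABCD1234)) ==
         UINT64_C(0x1234000100004000));
}
```

Because bits are dropped from both inputs, two different (address, constant) pairs can blend to the same discriminator. This matches the documentation's point that the blend is chosen for speed rather than strength as a hash combiner.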

</details>
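
Illustrative note (not part of the patch): the `sign`, `auth`, and `strip` primitives described in the documentation above can be modelled in portable C to make the contract concrete. Everything in this sketch is an assumption made for illustration: the `toy_*` names, the 48-bit address and 16-bit signature split, the fixed key values, and the mixing function, which is an ordinary integer mixer and not a cryptographic hash. Unlike the real `auth`, which is expected to halt the program on failure, `toy_auth` returns `false` so that failure can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: low 48 bits hold the address, high 16 bits the signature. */
#define TOY_ADDR_MASK UINT64_C(0x0000FFFFFFFFFFFF)

/* Stand-ins for secret per-process key values (real keys are not constants). */
static const uint64_t toy_keys[2] = {UINT64_C(0x9E3779B97F4A7C15),
                                     UINT64_C(0xC2B2AE3D27D4EB4F)};

/* Non-cryptographic 64-bit mixer, used only to keep the sketch short. */
static uint64_t toy_mix(uint64_t x) {
  x ^= x >> 30; x *= UINT64_C(0xBF58476D1CE4E5B9);
  x ^= x >> 27; x *= UINT64_C(0x94D049BB133111EB);
  x ^= x >> 31;
  return x;
}

/* sign: hash pointer, key, and discriminator; store 16 bits up high.
   key must be 0 or 1 in this model. */
static uint64_t toy_sign(uint64_t raw, unsigned key, uint64_t disc) {
  raw &= TOY_ADDR_MASK;
  uint64_t sig = toy_mix(toy_mix(raw ^ toy_keys[key]) ^ disc) >> 48;
  return raw | (sig << 48);
}

/* strip: drop the signature without checking it. */
static uint64_t toy_strip(uint64_t signed_ptr, unsigned key) {
  (void)key; /* the layout does not depend on the key in this model */
  return signed_ptr & TOY_ADDR_MASK;
}

/* auth: recompute the signature and compare; false models a trap. */
static bool toy_auth(uint64_t signed_ptr, unsigned key, uint64_t disc,
                     uint64_t *raw_out) {
  uint64_t raw = toy_strip(signed_ptr, key);
  if (toy_sign(raw, key, disc) != signed_ptr)
    return false;
  *raw_out = raw;
  return true;
}

/* Checks auth(sign(p, k, d), k, d) == p, the one property that must hold. */
static void toy_roundtrip_check(uint64_t raw, unsigned key, uint64_t disc) {
  uint64_t out = 0;
  assert(toy_auth(toy_sign(raw, key, disc), key, disc, &out));
  assert(out == (raw & TOY_ADDR_MASK));
}
```

In this model a signed pointer whose signature bits have been altered is always rejected, while a pointer presented with the wrong key or discriminator is rejected only with high probability, since a 16-bit signature can collide. That is the sense in which the documentation calls the mitigation probabilistic.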

https://github.com/llvm/llvm-project/pull/65996
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
