================
@@ -0,0 +1,844 @@
+==================================================
+``-fbounds-safety``: Enforcing bounds safety for C
+==================================================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent
+out-of-bounds (OOB) memory accesses, which remain a major source of security
+vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs
+by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers can
+use to attach bounds to pointers. For example, programmers can add the
+``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the pointer
+has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every
+pointer dereference, ensuring that the program does not access memory outside
+the specified bounds. The compiler requires programmers to provide enough bounds
+information so that the accesses can be checked at either run time or compile
+time, and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the
+programmer’s annotation burden by reconciling bounds annotations at ABI
+boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that
+carry bounds information on local variables without the need for annotations. We
+designed this model so that it preserves ABI compatibility with C while
+minimizing adoption effort.
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of
+production C code and proven to work in a consumer operating system setting. The
+extension was designed to enable incremental adoption, a key requirement in
+real-world settings where modifying an entire project and its dependencies all
+at once is often not possible. It also addresses a number of other practical
+challenges that have made existing approaches to safer C dialects difficult to
+adopt, offering these properties that make it widely adoptable in practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety
+  benefits.
+* It is a conforming extension to C.
+* Consequently, source code that adopts the extension can continue to be
+  compiled by toolchains that do not support the extension (CAVEAT: this still
+  requires inclusion of a header file that defines the bounds annotations as
+  empty macros).
+* It has a relatively low adoption cost.
+
+This document discusses the key designs of ``-fbounds-safety``. The document
+will be actively updated with a more detailed specification. The
+implementation plan can be found in Implementation plans for -fbounds-safety.
+
+.. Cross reference doesn't currently work
+   `Implementation plans for -fbounds-safety <BoundsSafetyImplPlans.rst>`_.
+
+Programming Model
+=================
+
+Overview
+--------
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond
+their bounds by performing bounds checking. If a bounds check fails, the program
+will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that
+determines its bounds and ensures guaranteed bounds checking. Consider the
+example below where the ``__counted_by(count)`` annotation indicates that
+parameter ``p`` points to a buffer of integers containing ``count`` elements. An
+off-by-one error is present in the loop condition, leading to ``p[i]`` being
+an out-of-bounds access during the loop’s final iteration. The compiler inserts
+a bounds check before ``p`` is dereferenced to ensure that the access remains
+within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+     // off-by-one error (should be i < count)
+     for (unsigned i = 0; i <= count; ++i) {
+       // bounds check inserted:
+       //   if (i >= count) trap();
+       p[i] = i;
+     }
+   }
+
+A bounds annotation defines an invariant for the pointer type, and the model
+ensures that this invariant remains true. In the example below, pointer ``p``
+annotated with ``__counted_by(count)`` must always point to a memory buffer
+containing at least ``count`` elements of the pointee type. Changing the value
+of ``count``, like in the example below, may violate this invariant and permit
+out-of-bounds access to the pointer. To avoid this, the compiler employs
+compile-time restrictions and emits run-time checks as necessary to ensure the
+new count value doesn't exceed the actual length of the buffer. Section
+`Maintaining correctness of bounds annotations`_ provides more details about
+this programming model.
+
+.. code-block:: c
+
+   int g;
+
+   void foo(int *__counted_by(count) p, size_t count) {
+     count++; // may violate the invariant of __counted_by
+     count--; // may violate the invariant of __counted_by if count was 0.
+     count = g; // may violate the invariant of __counted_by
+                // depending on the value of `g`.
+   }
+
+The requirement to annotate all pointers with explicit bounds information could
+present a significant adoption burden. To tackle this issue, the model
+incorporates the concept of a “wide pointer” (a.k.a. fat pointer) – a larger
+pointer that carries bounds information alongside the pointer value. Utilizing
+wide pointers can potentially reduce the adoption burden, as they contain bounds
+information internally and eliminate the need for explicit bounds annotations.
+However, wide pointers differ from standard C pointers in their data layout,
+which may result in incompatibilities with the application binary interface
+(ABI). Breaking the ABI complicates interoperability with external code that has
+not adopted the same programming model.
+
+``-fbounds-safety`` harmonizes the wide pointer and the bounds annotation
+approaches to reduce the adoption burden while maintaining the ABI. In this
+model, local variables of pointer type are implicitly treated as wide pointers,
+allowing them to carry bounds information without requiring explicit bounds
+annotations. This approach does not impact the ABI, as local variables are
+hidden from the ABI. Pointers associated with any other variables are treated as
+single object pointers (i.e., ``__single``), ensuring that they always have the
+tightest bounds by default and offering a strong bounds safety guarantee.
+
+By implementing default bounds annotations based on ABI visibility, a
+considerable portion of C code can operate without modifications within this
+programming model, reducing the adoption burden.
+
+The rest of the section will discuss individual bounds annotations and the
+programming model in more detail.
+
+Bounds annotations
+------------------
+
+Annotation for pointers to a single object
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The C language allows pointer arithmetic on arbitrary pointers and this has been
+a source of many bounds safety issues. In practice, many pointers are merely
+pointing to a single object and incrementing or decrementing such a pointer
+immediately makes the pointer go out-of-bounds. To prevent this unsafety,
+``-fbounds-safety`` provides the annotation ``__single`` that causes pointer
+arithmetic on annotated pointers to be a compile-time error.
+
+* ``__single`` : indicates that the pointer is either pointing to a single
+  object or null. Hence, pointers with ``__single`` permit neither pointer
+  arithmetic nor subscripting with a non-zero index. Dereferencing a
+  ``__single`` pointer is allowed but it requires a null check. Upper and lower
+  bounds checks are not required because the ``__single`` pointer should point
+  to a valid object unless it’s null.
+
+We use ``__single`` as the default annotation for ABI-visible pointers. This
----------------
AaronBallman wrote:

```suggestion
``__single`` is the default annotation for ABI-visible pointers. This
```

https://github.com/llvm/llvm-project/pull/70749