On 5/18/26 11:08 PM, Marco Elver wrote:
On Fri, 15 May 2026 at 16:28, Pedro Falcato <[email protected]> wrote:

On Mon, May 11, 2026 at 10:00:48PM +0200, Marco Elver wrote:
Rework the general infrastructure around RANDOM_KMALLOC_CACHES into more
flexible KMALLOC_PARTITION_CACHES, with the former being a partitioning
mode of the latter.

Introduce a new mode, KMALLOC_PARTITION_TYPED, which leverages a feature
available in Clang 22 and later, called "allocation tokens" via
__builtin_infer_alloc_token() [1]. Unlike KMALLOC_PARTITION_RANDOM
(formerly RANDOM_KMALLOC_CACHES), this mode deterministically assigns a
slab cache to an allocation of type T, regardless of allocation site.

The builtin __builtin_infer_alloc_token(<malloc-args>, ...) instructs
the compiler to infer an allocation type from arguments commonly passed
to memory-allocating functions and returns a type-derived token ID. The
implementation passes kmalloc-args to the builtin: the compiler performs
best-effort type inference, and then recognizes common patterns such as
`kmalloc(sizeof(T), ...)`, `kmalloc(sizeof(T) * n, ...)`, but also
`(T *)kmalloc(...)`. Where the compiler fails to infer a type, the
fallback token (default: 0) is chosen.

Note: kmalloc_obj(..) APIs fix the pattern in which size and result type
are expressed, and therefore ensure there's not much drift in which
patterns the compiler needs to recognize. Specifically, kmalloc_obj()
and friends expand to `(TYPE *)KMALLOC(__obj_size, GFP)`, which the
compiler recognizes via the cast to TYPE*.
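
As a rough userspace illustration of that cast pattern (not the kernel's
actual macro), the sketch below wraps an allocator call in a cast to
TYPE*. The names sketch_kmalloc(), sketch_alloc_obj() and struct
sketch_item are hypothetical, the GFP argument is dropped, and calloc()
stands in for the kernel allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example type; it contains a pointer member. */
struct sketch_item {
	int id;
	struct sketch_item *next;
};

/* Stand-in for the kernel allocator (assumption: zeroed userspace memory). */
static void *sketch_kmalloc(size_t size)
{
	return calloc(1, size);
}

/*
 * Hypothetical analogue of the expansion described above: the size comes
 * from sizeof(TYPE) and the result is cast to TYPE *, so both the size
 * expression and the cast name the allocated type.
 */
#define sketch_alloc_obj(TYPE) ((TYPE *)sketch_kmalloc(sizeof(TYPE)))
```

A caller would write `struct sketch_item *p = sketch_alloc_obj(struct sketch_item);`
and release it with free(p); the type never has to be spelled a second
time at the call site.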

Clang's default token ID calculation is described as [1]:

    typehashpointersplit: This mode assigns a token ID based on the hash
    of the allocated type's name, where the top half ID-space is reserved
    for types that contain pointers and the bottom half for types that do
    not contain pointers.
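
A minimal sketch of that split, under loudly stated assumptions: the ID
space is taken to be 16 tokens, FNV-1a stands in for whatever hash Clang
actually uses, and the caller says whether the type contains pointers
(the compiler derives this itself). SKETCH_NUM_TOKENS, sketch_hash_name()
and sketch_token() are hypothetical names, not Clang or kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed size of the token ID space; purely illustrative. */
#define SKETCH_NUM_TOKENS 16u

/* FNV-1a over the type name: a stand-in, not the hash Clang uses. */
static uint64_t sketch_hash_name(const char *s)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= 0x100000001b3ULL;
	}
	return h;
}

/*
 * Map a type name to a token ID: pointerless types land in the bottom
 * half of the ID space, pointer-containing types in the top half.
 */
static unsigned int sketch_token(const char *type_name, bool has_pointers)
{
	unsigned int half = SKETCH_NUM_TOKENS / 2;
	unsigned int id = (unsigned int)(sketch_hash_name(type_name) % half);

	return has_pointers ? half + id : id;
}
```

The same type name always yields the same token regardless of call site,
and two types can only share a token if they agree on whether they
contain pointers.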

Separating pointer-containing objects from pointerless objects and data
allocations can help mitigate certain classes of memory corruption
exploits [2]: an attacker who gains a buffer overflow on a primitive
buffer cannot use it to directly corrupt pointers or other critical
metadata in an object residing in a different, isolated heap region.

It is important to note that heap isolation strategies offer a
best-effort approach and do not provide a 100% security guarantee,
although they come at relatively low performance cost. Note that this
also does not prevent cross-cache attacks: while waiting for future
features like SLAB_VIRTUAL [3] to provide physical page isolation, this
feature should be deployed alongside SHUFFLE_PAGE_ALLOCATOR and
init_on_free=1 to mitigate cross-cache attacks and page-reuse attacks as
much as possible today.

With all that, my kernel (x86 defconfig) shows me a histogram of slab
cache object distribution per /proc/slabinfo (after boot):

   <slab cache>      <objs> <hist>
   kmalloc-part-15    1465  ++++++++++++++
   kmalloc-part-14    2988  +++++++++++++++++++++++++++++
   kmalloc-part-13    1656  ++++++++++++++++
   kmalloc-part-12    1045  ++++++++++
   kmalloc-part-11    1697  ++++++++++++++++
   kmalloc-part-10    1489  ++++++++++++++
   kmalloc-part-09     965  +++++++++
   kmalloc-part-08     710  +++++++
   kmalloc-part-07     100  +
   kmalloc-part-06     217  ++
   kmalloc-part-05     105  +
   kmalloc-part-04    4047  ++++++++++++++++++++++++++++++++++++++++
   kmalloc-part-03     183  +
   kmalloc-part-02     283  ++
   kmalloc-part-01     316  +++
   kmalloc            1422  ++++++++++++++

Hi,

A couple of questions (I apologise if this was asked before, I wasn't involved
in this thread):

1) What's the object behind kmalloc-part-04? I imagine it's a single type
getting allocated a lot?

That's from __kmemdup_nul().

__kmemdup_nul() is probably a good fit for SLAB_BUCKETS?

--
Cheers,
Harry / Hyeonggon