On 5/11/26 22:00, Marco Elver wrote:
> When using CONFIG_KMALLOC_PARTITION_RANDOM, _RET_IP_ was previously used
> to identify the allocation site. _RET_IP_, however, evaluates to the
> caller's parent's instruction pointer rather than the actual allocation
> site; this would lead to collisions where a function performs multiple
> allocations.
>
> With the generalization to kmalloc_token_t, we now generate the token at
> the outermost macro, and using _THIS_IP_ would fix this for all cases.
Hm, but this means that in patch 1 we make things even worse and then fix them
again, while also improving what was suboptimal prior to the series.
Would it instead be possible to reorder patches 1 and 2, so that we improve the
current state first and then introduce typed partitioning without any
changes to the randomized one (aside from changing the previously correctly
used cases of _RET_IP_ to _CODE_LOCATION_)?
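To make the collision discussed here concrete, below is a minimal userspace
sketch, not kernel code. RET_IP and CODE_LOCATION are hypothetical stand-ins
for _RET_IP_ and the static-marker fallback of _CODE_LOCATION_ from this patch,
and two_allocation_sites() and check_sites() are made-up names for
illustration. It assumes GNU C (statement expressions and
__builtin_return_address).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins for the kernel macros (GNU C only). */
#define RET_IP        ((unsigned long)__builtin_return_address(0))
#define CODE_LOCATION ({ static const char __here; (unsigned long)&__here; })

static unsigned long ret_ip_ids[2];
static unsigned long code_loc_ids[2];

/* One function containing two "allocation sites". */
__attribute__((noinline)) static void two_allocation_sites(void)
{
	/* Both sites see the same return address of this function. */
	ret_ip_ids[0] = RET_IP;
	ret_ip_ids[1] = RET_IP;

	/* Each expansion defines its own static object with its own address. */
	code_loc_ids[0] = CODE_LOCATION;
	code_loc_ids[1] = CODE_LOCATION;
}

/* Record the identifiers and check the relationship described above. */
static void check_sites(void)
{
	two_allocation_sites();
	printf("RET_IP:        %#lx %#lx\n", ret_ip_ids[0], ret_ip_ids[1]);
	printf("CODE_LOCATION: %#lx %#lx\n", code_loc_ids[0], code_loc_ids[1]);
	assert(ret_ip_ids[0] == ret_ip_ids[1]);     /* collision */
	assert(code_loc_ids[0] != code_loc_ids[1]); /* unique per site */
}
```

Calling check_sites() from main() should print two identical RET_IP values and
two different CODE_LOCATION values; the exact addresses depend on the compiler
and the build.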
> Unfortunately, the generic implementation of _THIS_IP_ relies on taking
> the address of a local label, which is considered broken by both GCC [1]
> and Clang [2] because label addresses are only expected to be used with
> computed gotos. While the generic version more or less works today, it
> is known to be brittle. For example, Clang -O2 always returns 1 when
> this function is inlined:
>
> static inline unsigned long get_ip(void)
> { return ({ __label__ __here; __here: (unsigned long)&&__here; }); }
>
> To provide a reliable unique identifier without breaking architectures
> relying on the generic _THIS_IP_, introduce _CODE_LOCATION_: it resolves
> to _THIS_IP_ where architectures provide a safe implementation, and
> falls back to a zero-cost static marker where _THIS_IP_ is broken.
>
> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071 [1]
> Link: https://github.com/llvm/llvm-project/issues/138272 [2]
> Signed-off-by: Marco Elver <[email protected]>
> ---
> v4:
> * New patch.
> ---
> include/linux/instruction_pointer.h | 24 ++++++++++++++++++++++++
> include/linux/slab.h | 2 +-
> 2 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/instruction_pointer.h b/include/linux/instruction_pointer.h
> index aa0b3ffea935..ea5bc756bd99 100644
> --- a/include/linux/instruction_pointer.h
> +++ b/include/linux/instruction_pointer.h
> @@ -8,6 +8,30 @@
>
> #ifndef _THIS_IP_
> #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
> +/*
> + * The current generic definition of _THIS_IP_ is considered broken by GCC [1]
> + * and Clang [2]. In particular, the address of a label is only expected to be
> + * used with a computed goto.
> + *
> + * [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071
> + * [2] https://github.com/llvm/llvm-project/issues/138272
> + *
> + * Mark it as broken, so that appropriate fallback options can be implemented
> + * for architectures that do not define their own _THIS_IP_.
> + */
> +#define HAS_BROKEN_THIS_IP
> +#endif
> +
> +/*
> + * _CODE_LOCATION_ provides a unique identifier for the current code location.
> + * When _THIS_IP_ is broken (generic version), we fall back to a static marker
> + * which guarantees uniqueness and resolves to a constant address at link time,
> + * avoiding runtime overhead and compiler optimizations breaking it.
> + */
> +#ifdef HAS_BROKEN_THIS_IP
> +#define _CODE_LOCATION_ ({ static const char __here; (unsigned long)&__here; })
> +#else
> +#define _CODE_LOCATION_ _THIS_IP_
> #endif
>
> #endif /* _LINUX_INSTRUCTION_POINTER_H */
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index c232f8a10af6..efab6b2ccf21 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -503,7 +503,7 @@ int kmem_cache_shrink(struct kmem_cache *s);
> typedef struct { unsigned long v; } kmalloc_token_t;
> #ifdef CONFIG_KMALLOC_PARTITION_RANDOM
> extern unsigned long random_kmalloc_seed;
> -#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _RET_IP_ })
> +#define __kmalloc_token(...) ((kmalloc_token_t){ .v = _CODE_LOCATION_ })
> #elif defined(CONFIG_KMALLOC_PARTITION_TYPED)
> #define __kmalloc_token(...) ((kmalloc_token_t){ .v = __builtin_infer_alloc_token(__VA_ARGS__) })
> #endif