On Thu, Jun 04, 2026 at 02:30:34PM +0900, Harry Yoo wrote:
> 
> 
> On 6/3/26 3:31 AM, Pedro Falcato wrote:
> > SKB data area allocations (as done from alloc_skb()) use kmalloc().
> > These allocations can be variably sized and their contents can be more
> > or less controlled from userspace, which makes them useful for attackers
> > that want to overwrite a use-after-free'd object from the same kmalloc slab
> > (which often just requires the sizes to roughly match into the same kmalloc
> > bucket). [0] is an easy example of an exploit that uses netlink skb
> > allocation to target another similarly-sized accidentally freed object.
> > 
> > While other mitigations like CONFIG_RANDOM_KMALLOC_CACHES exist, these are
> > probabilistic. Use the existing kmem buckets API to further isolate these
> > allocations in a guaranteed fashion, when CONFIG_SLAB_BUCKETS=y.
> > 
> > Link: https://github.com/google/security-research/blob/master/pocs/linux/kernelctf/CVE-2023-4207_lts_cos_mitigation_2/docs/exploit.md [0]
> > Signed-off-by: Pedro Falcato <[email protected]>
> > ---
> >  net/core/skbuff.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 44a7f8401468..1f6c6b531ece 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -594,6 +594,8 @@ static void *kmalloc_pfmemalloc(size_t obj_size, gfp_t flags, int node)
> >     return kmalloc_node_track_caller(obj_size, flags, node);
> >  }
> >  
> > +static kmem_buckets *skb_data_buckets __ro_after_init;
> > +
> >  /*
> >   * kmalloc_reserve is a wrapper around kmalloc_node_track_caller that tells
> >   * the caller if emergency pfmemalloc reserves are being used. If it is and
> > @@ -632,7 +634,7 @@ static void *kmalloc_reserve(unsigned int *size, gfp_t flags, int node,
> >      * Try a regular allocation, when that fails and we're not entitled
> >      * to the reserves, fail.
> >      */
> > -   obj = kmalloc_node_track_caller(obj_size,
> > +   obj = kmem_buckets_alloc_node_track_caller(skb_data_buckets, obj_size,
> >                                     flags | __GFP_NOMEMALLOC | __GFP_NOWARN,
> >                                     node);
> >     if (likely(obj))
> 
> What about kmalloc_pfmemalloc()?

Good point, converting that one looks free as well.

Sidenote: isolating kmem_cache_alloc for possibly-aliasing caches could also
be useful. skb allocation has net_hotdata.skb_small_head_cache. It doesn't merge
with anything for $raisins (odd size, plus I don't think usercopy caches are
getting merged?) but it feels too... accidental?

Maybe passing something like SLAB_NO_MERGE and making the size
standard-looking would be nice. I have a size of 704 bytes per object, and
this probably causes some weird wastage for each slab.
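
For a rough sense of that wastage, here is a back-of-the-envelope sketch in
userspace C (not kernel code). The helper names objs_per_slab() and
slab_waste() are made up for illustration. It assumes 4 KiB pages and ignores
per-object metadata, debug padding, alignment and whatever slab order SLUB
actually picks for this cache, so treat the numbers as an approximation only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: whole objects that fit in a slab of slab_bytes. */
static inline size_t objs_per_slab(size_t slab_bytes, size_t obj_size)
{
	assert(obj_size > 0);
	return slab_bytes / obj_size;
}

/* Hypothetical helper: bytes left unused after packing whole objects. */
static inline size_t slab_waste(size_t slab_bytes, size_t obj_size)
{
	assert(obj_size > 0);
	return slab_bytes % obj_size;
}
```

With 704-byte objects, an order-0 slab (4096 bytes) holds 5 objects and leaves
576 bytes unused (about 14%), an order-1 slab (8192 bytes) holds 11 and leaves
448, and an order-3 slab (32768 bytes) holds 46 and leaves 384.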


-- 
Pedro
