From: Dmitry Vyukov <dvyu...@google.com>
Date: Thu, 14 Jan 2021 13:51:44 +0100

> On Thu, Jan 14, 2021 at 1:50 PM Dmitry Vyukov <dvyu...@google.com> wrote:
>>
>> On Thu, Jan 14, 2021 at 1:44 PM Alexander Lobakin <aloba...@pm.me> wrote:
>>>
>>> From: Dmitry Vyukov <dvyu...@google.com>
>>> Date: Thu, 14 Jan 2021 12:47:31 +0100
>>>
>>>> On Thu, Jan 14, 2021 at 12:41 PM Alexander Lobakin <aloba...@pm.me> wrote:
>>>>>
>>>>> From: Eric Dumazet <eduma...@google.com>
>>>>> Date: Wed, 13 Jan 2021 15:36:05 +0100
>>>>>
>>>>>> On Wed, Jan 13, 2021 at 2:37 PM Alexander Lobakin <aloba...@pm.me> wrote:
>>>>>>>
>>>>>>> Instead of calling kmem_cache_alloc() every time when building a NAPI
>>>>>>> skb, (re)use skbuff_heads from napi_alloc_cache.skb_cache. Previously
>>>>>>> this cache was only used for bulk-freeing skbuff_heads consumed via
>>>>>>> napi_consume_skb() or __kfree_skb_defer().
>>>>>>>
>>>>>>> Typical path is:
>>>>>>> - skb is queued for freeing from driver or stack, its skbuff_head
>>>>>>> goes into the cache instead of immediate freeing;
>>>>>>> - driver or stack requests NAPI skb allocation, an skbuff_head is
>>>>>>> taken from the cache instead of allocation.
>>>>>>>
>>>>>>> Corner cases:
>>>>>>> - if it's empty on skb allocation, bulk-allocate the first half;
>>>>>>> - if it's full on skb consuming, bulk-wipe the second half.
>>>>>>>
>>>>>>> Also try to balance its size after completing network softirqs
>>>>>>> (__kfree_skb_flush()).
>>>>>>
>>>>>> I do not see the point of doing this rebalance (especially if we do not
>>>>>> change
>>>>>> its name describing its purpose more accurately).
>>>>>>
>>>>>> For moderate load, we will have a reduced bulk size (typically one or
>>>>>> two).
>>>>>> Number of skbs in the cache is in [0, 64[ , there is really no risk of
>>>>>> letting skbs there for a long period of time.
>>>>>> (32 * sizeof(sk_buff) = 8192)
>>>>>> I would personally get rid of this function completely.
>>>>>
>>>>> When I had a cache of 128 entries, I had worse results without this
>>>>> function. But seems like I forgot to retest when I switched to the
>>>>> original size of 64.
>>>>> I also thought about removing this function entirely, will test.
>>>>>
>>>>>> Also it seems you missed my KASAN support request ?
>>>>> I guess this is a matter of using kasan_unpoison_range(), we can ask for
>>>>> help.
>>>>>
>>>>> I saw your request, but don't see a reason for doing this.
>>>>> We are not caching already freed skbuff_heads. They don't get
>>>>> kmem_cache_freed before getting into local cache. KASAN poisons
>>>>> them no earlier than at kmem_cache_free() (or did I miss something?).
>>>>> heads being cached just get rid of all references and at the moment
>>>>> of dropping to the cache they are pretty the same as if they were
>>>>> allocated.
>>>>
>>>> KASAN should not report false positives in this case.
>>>> But I think Eric meant preventing false negatives. If we kmalloc 17
>>>> bytes, KASAN will detect out-of-bounds accesses beyond these 17 bytes.
>>>> But we put that data into 128-byte blocks, KASAN will miss
>>>> out-of-bounds accesses beyond 17 bytes up to 128 bytes.
>>>> The same holds for "logical" use-after-frees when object is free, but
>>>> not freed into slab.
>>>>
>>>> An important custom cache should use annotations like
>>>> kasan_poison_object_data/kasan_unpoison_range.
>>>
>>> As I understand, I should
>>> kasan_poison_object_data(skbuff_head_cache, skb) and then
>>> kasan_unpoison_range(skb, sizeof(*skb)) when putting it into the
>>> cache?
>>
>> I think it's the other way around. It should be _un_poisoned when used.
>> If it's fixed size, then unpoison_object_data should be a better fit:
>> https://elixir.bootlin.com/linux/v5.11-rc3/source/mm/kasan/common.c#L253
>
> Variable-size poisoning/unpoisoning would be needed for the skb data itself:
> https://bugzilla.kernel.org/show_bug.cgi?id=199055

This cache is for skbuff_heads only, not for the entire skbs. All linear
data and frags get freed before head hits the cache.
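
To make the scheme discussed in this thread concrete, here is a rough
userspace sketch of the cache behaviour from the quoted patch description:
refill the first half when the cache is empty, flush the second half when
it fills up. It is not the kernel code. struct head, struct head_cache,
cache_get() and cache_put() are hypothetical names, malloc()/free() stand
in for the slab allocator, and the in_cache flag only models the
"poison when put into the cache, unpoison when taken out" ordering that
Dmitry describes; it performs no real KASAN poisoning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SIZE 64                 /* cache size mentioned in the thread */
#define CACHE_HALF (CACHE_SIZE / 2)

/* Hypothetical stand-in for an skbuff_head. */
struct head {
	int in_cache;                 /* models "poisoned while cached" */
	char pad[252];
};

/* Hypothetical cache; zero-initialise before use. */
struct head_cache {
	size_t count;
	struct head *slots[CACHE_SIZE];
	size_t backend_allocs;        /* objects obtained from malloc() */
	size_t backend_frees;         /* objects returned via free() */
};

/* Take a head from the cache, refilling the first half if it is empty. */
static struct head *cache_get(struct head_cache *c)
{
	struct head *h;

	if (c->count == 0) {
		while (c->count < CACHE_HALF) {
			h = malloc(sizeof(*h));
			if (!h)
				break;
			h->in_cache = 1;
			c->slots[c->count++] = h;
			c->backend_allocs++;
		}
		if (c->count == 0)
			return NULL;  /* backend allocation failed */
	}

	h = c->slots[--c->count];
	h->in_cache = 0;              /* "unpoison" when handed out for use */
	return h;
}

/* Put a head into the cache, flushing the second half once it is full. */
static void cache_put(struct head_cache *c, struct head *h)
{
	h->in_cache = 1;              /* "poison" while it sits in the cache */
	c->slots[c->count++] = h;

	if (c->count == CACHE_SIZE) {
		while (c->count > CACHE_HALF) {
			free(c->slots[--c->count]);
			c->backend_frees++;
		}
	}
}
```

With this ordering the number of cached heads after any call stays in
[0, 64[, which matches the range Eric mentions above.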
The cache will store skbuff_heads as if they were freshly allocated by kmem_cache_alloc(). Al