On Fri, May 30, 2025 at 04:59:35PM +0200, Alexander Lobakin wrote:
> Unfortunately this can hurt performance on my setup.
> In XDP, we usually place &xdp_buff on the stack. It's 56 bytes. We
> initialize it only during the packet processing, not in advance.
> 
> In libeth_xdp, see [1], I provide a small extension for this struct.
> This extension is 64 byte, plus I added a definition (see
> `__LIBETH_XDP_ONSTACK_BUFF()`) to be able to define a bigger one in case
> a driver might need more fields there.
> The same as with &xdp_buff, it shouldn't be initialized in advance, only
> during the packet processing. Otherwise, it can really decrease
> performance, you might've seen recent Mellanox report that even
> CONFIG_INIT_STACK_ZERO provokes this.

FYI, you can use __uninitialized to force CONFIG_INIT_STACK_ZERO to
leave an automatic variable uninitialized.

> What would be the best option to resolve this? This flex XDP buff on the
> stack is fully safe, there are a couple checks that its size does not
> exceed the maximum (defined by xdp_buff_xsk) etc. And we really need to
> initialize it field-by-field in a loop on a per-packet basis -- we are
> sure that there will be no garbage.

But yes, this is suddenly not available for _DEFINE_FLEX after the
referenced patch.

> It's even worse that most drivers will most likely not want to add any
> additional fields, i.e. this flex array at the end will be empty, IOW
> they just want a plain libeth_xdp_buff, but I made a unified definition,
> with which you can declare them on the stack both with and without
> additional fields. So, even if the drivers doesn't want any additional
> fields and the flex array is empty, the struct will be zero-initialized
> and the same perf hit will apply.
> [...]
> [1] 
> https://lore.kernel.org/netdev/[email protected]

Okay, so it sounds like you need the old behavior of _DEFINE_FLEX, *and*
a way to apply attributes (like __uninitialized).

How about replacing _DEFINE_FLEX with:


#define __DEFINE_FLEX(type, name, member, count, trailer...)            \
        _Static_assert(__builtin_constant_p(count),                     \
                       "onstack flex array members require compile-time const 
count"); \
        union {                                                         \
                u8 bytes[struct_size_t(type, member, count)];           \
                type obj;                                               \
        } name##_u trailer;                                             \
        type *name = (type *)&name##_u

#define _DEFINE_FLEX(type, name, member, count, initializer...)         \
        __DEFINE_FLEX(type, name, member, count, = { .obj = initializer })


Which would yield this for ___LIBETH_XDP_ONSTACK_BUFF:


#define ___LIBETH_XDP_ONSTACK_BUFF(name, ...)                           \
        __DEFINE_FLEX(struct libeth_xdp_buff, name, priv,               \
                      LIBETH_XDP_PRIV_SZ(__VA_ARGS__ + 0),              \
                      __uninitialized);                                 \
        LIBETH_XDP_ASSERT_PRIV_SZ(__VA_ARGS__ + 0)


Does that look like what you'd want? (Note I didn't actually build this;
I want to make sure the concept is workable...)

-- 
Kees Cook

Reply via email to