On Thu, Feb 05, 2026 at 05:54:08PM -0800, Jakub Kicinski wrote:
> On Thu, 5 Feb 2026 15:40:46 +0200 Vladimir Oltean wrote:
> > > > I mean, it should "work" given the caveat that calling
> > > > bpf_xdp_adjust_tail() on a first-half page buffer with a large offset
> > > > risks leaking into the second half, which may also be in use, and this
> > > > will go undetected, right?
> > > > Although the practical chances of that happening are low, the requested
> > > > offset needs to be in the order of hundreds still.
> > > 
> > > Oh, I did get carried away there...
> > > Well, one thing is shared page memory model in enetc and i40e, another
> > > thing is xsk_buff_pool, where chunk size can be between 2K and PAGE_SIZE.
> > > What about
> > > 
> > > tailroom = rxq->frag_size - skb_frag_size(frag) -
> > >            (skb_frag_off(frag) % rxq->frag_size);
> > > 
> > > When frag_size is set to 2K, headroom is let's say 256, so aligned DMA
> > > write size is 1420.
> > > last frag at the start of the page: offset=256, size<=1420
> > >     tailroom >= 2K - 1420 - 256 = 372
> > > last frag in the middle of the page: offset=256+2K, size<=1420
> > >     tailroom >= 2K - 1420 - ((256 + 2K) % 2K) = 372
> > > 
> > > And for drivers that do not fragment pages for multi-buffer packets,
> > > nothing changes, since offset is always less than rxq->frag_size.
> > > 
> > > This brings us back to rxq->frag_size being half of a page for enetc and
> > > i40e, and seems like in ZC mode it should be pool->chunk_size to work
> > > properly.
> > >  
> > 
> > With skb_frag_off() taken into account modulo 2K for the tailroom
> > calculation, I can confirm bpf_xdp_frags_increase_tail() works well for
> > ENETC. I haven't fully considered the side effects, though.
> 
> +1, also seems to me like it would work tho I haven't thought thru all 
> the cases. We do need to document and name things well, tho, 'cause
> subtleties are piling up ;) Maybe it's time for an ASCII art
> for xdp layout?
>

Yeah, for AF_XDP mbuf in i40e we actually recently discovered another 
buffer-size-calculation-related issue, so some visual aid would be useful. I 
will think about how it should look.
 
> FWIW my feeling is that instead of nickel and diming leftover space 
> in the frags if someone actually cared about growing mbufs we should
> have the helper allocate a new page from the PP and append it to the
> shinfo. Much simpler, "infinite space", and works regardless of the
> driver. I don't mean that to suggest you implement it, purely to point
> out that I think nobody really uses positive offsets.. So we can as
> well switch more complicated drivers back to xdp_rxq_info_reg().
> 

As Vladimir has mentioned, if the driver does not use header split, frags will
have tailroom the size of skb_shared_info, so tail growing does work in
practice.

Allocating a page_pool buffer (given the XDP queue has one attached) is
certainly an option, although I am not sure if anyone needs it. Furthermore,
growing the tail would still fail for the single-buf case.
