On Thu, 5 Feb 2026 15:40:46 +0200 Vladimir Oltean wrote:
> > > I mean, it should "work" given the caveat that calling bpf_xdp_adjust_tail()
> > > on a first-half page buffer with a large offset risks leaking into the
> > > second half, which may also be in use, and this will go undetected, right?
> > > Although the practical chances of that happening are low, the requested
> > > offset needs to be in the order of hundreds still.  
> > 
> > Oh, I did get carried away there...
> > Well, one thing is shared page memory model in enetc and i40e, another thing is
> > xsk_buff_pool, where chunk size can be between 2K and PAGE_SIZE. What about
> > 
> > tailroom = rxq->frag_size - skb_frag_size(frag) -
> >            (skb_frag_off(frag) % rxq->frag_size);
> > 
> > When frag_size is set to 2K, headroom is let's say 256, so aligned DMA write
> > size is 1420.
> > last frag at the start of the page: offset=256, size<=1420
> >     tailroom >= 2K - 1420 - 256 = 372
> > last frag in the middle of the page: offset=256+2K, size<=1420
> >     tailroom >= 2K - 1420 - ((256 + 2K) % 2K) = 372
> > 
> > And for drivers that do not fragment pages for multi-buffer packets, nothing
> > changes, since offset is always less than rxq->frag_size.
> > 
> > This brings us back to rxq->frag_size being half of a page for enetc and i40e,
> > and seems like in ZC mode it should be pool->chunk_size to work properly.  
> 
> With skb_frag_off() taken into account modulo 2K for the tailroom
> calculation, I can confirm bpf_xdp_frags_increase_tail() works well for
> ENETC. I haven't fully considered the side effects, though.

+1, also seems to me like it would work tho I haven't thought thru all 
the cases. We do need to document and name things well, tho, 'cause
subtleties are piling up ;) Maybe it's time for an ASCII art
for xdp layout?
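
To pin down the formula from the quoted proposal, here is a minimal
userspace sketch in C. The names frag_sketch and frag_tailroom are
hypothetical; they only stand in for the skb_frag_off()/skb_frag_size()
values and rxq->frag_size, and the sketch assumes frag_size is non-zero.

```c
#include <stdint.h>

/* Hypothetical stand-in for the last frag; not the kernel's skb_frag_t. */
struct frag_sketch {
	uint32_t offset; /* what skb_frag_off() would return */
	uint32_t size;   /* what skb_frag_size() would return */
};

/*
 * Tailroom left in the frag_size-sized chunk that holds the last frag.
 * Taking the offset modulo frag_size makes the result independent of
 * which chunk of the page the frag lives in. Assumes frag_size != 0.
 */
static inline int64_t frag_tailroom(uint32_t frag_size,
				    const struct frag_sketch *frag)
{
	return (int64_t)frag_size - frag->size - (frag->offset % frag_size);
}
```

With frag_size = 2048 and size = 1420, an offset of 256 and an offset of
256 + 2048 both give 372, matching the numbers quoted above. Without the
modulo the second case would go negative.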

FWIW my feeling is that instead of nickel and diming leftover space
in the frags, if someone actually cared about growing mbufs, we should
have the helper allocate a new page from the PP and append it to the
shinfo. Much simpler, "infinite space", and works regardless of the
driver. I don't mean that to suggest you implement it, purely to point
out that I think nobody really uses positive offsets. So we may as
well switch more complicated drivers back to xdp_rxq_info_reg().
