On Wed, Jun 10, 2026 at 9:21 AM Menglong Dong <[email protected]> wrote:
>
> On 2026/6/10 08:06 Emil Tsalapatis <[email protected]> write:
> > On Tue Jun 9, 2026 at 7:06 AM EDT, Menglong Dong wrote:
> > > On 2026/6/9 18:02 Sun Jian <[email protected]> write:
> > >> dev_map_redirect_clone() uses skb_clone() when redirecting a generic XDP
> > >> skb to multiple devmap destinations. The cloned skb can share packet data
> > >> with other clones.
> > >>
> > >> If the destination devmap entry has an egress XDP program, that program
> > >> can modify packet data. Such modifications can then be observed by other
> > >> clones sharing the same packet data.
> > >>
> > >> This can be reproduced by strengthening xdp_veth_egress to configure a
> > >> different source MAC for each egress device and checking that 
> > >> store_mac_1/2
> > >> observe the MAC configured for their own egress devices. Without the fix,
> > >> the SKB_MODE subtest observes store_mac_1 receiving the MAC configured 
> > >> for
> > >> the next egress device.
> > >>
> > >> Fix this by unsharing the cloned skb before running the devmap egress XDP
> > >> program. Limit the extra copy to destinations with an attached egress
> > >> program.
> > >
> > > Hi, Jian.
> > >
> > > This sounds like a good idea in this case. When I have a look at 
> > > bpf_clone_redirect(),
> > > I found that it use skb_clone() too, which means it has the same problem. 
> > > The
> > > data can be modified by other xdp prog in the destination NIC if we use
> > > bpf_clone_redirect().
> > >
> > > So maybe this is the default logic, and I'm not sure if this patch can 
> > > break the
> > > existing users :/
> >
> > I think for use cases where we are using bpf_clone_redirect() to use
> > one clone for inspection this would add an unnecessary copy. Maybe
> > adding *_copy() variants instead of changing the *_clone() would be
> > better? That way we wouldn't be changing the behavior for existing
> > consumers and the naming would be consistent with the skb_* methods.
>
> Agree. It's not a good idea to change the logic of the existing API. Or
> maybe we can add a BPF_F_CLONE flag for the existing API.
>
> >
> > But more importantly, is there an actual use case for the kind of API
> > that the modified selftest requires? Nobody until now has considered
> > the existing behavior to be a problem.
>
> Agree too. Obviously, this is not a bug. For the use case in the commit
> log, it's something that can be fixed by the user themself. If we need
> modify the MAC, we'd better attach a BPF program for all the egress
> device in the devmap.
>
> >
> > >
> > > Thanks!
> > > Menglong Dong
> > >
> > >>
> > >> Tested with:
> > >>   ./test_progs -t xdp_veth_egress
> > >>   ./test_progs -t xdp_veth
> > >>   ./test_progs -t xdp
> > > [...]
> > >>
> > >>  destroy_xdp_redirect_map:
> > >> --
> > >> 2.43.0
> > >>
> > >>
> > >>
> >
>
>
>
>

Thanks for all the comments.

I agree this should not be treated as a generic skb_clone() issue, and I
understand the concern about changing existing clone/shared-data semantics.

The case I was trying to address is narrower: generic XDP devmap
broadcast/multi redirect with per-devmap-entry egress programs. The use case is
not newly introduced by this patch. The existing xdp_veth_egress test already
attaches xdp_devmap_prog to all egress devmap entries, and that program rewrites
eth->h_source based on ctx->egress_ifindex. My change only strengthens the test
from checking that the observed MAC is not equal to a single magic MAC to
checking that each destination observes the MAC selected for its own egress
ifindex.

So attaching an egress program to all devmap entries is already what this test
does. The SKB_MODE failure happens because the cloned skbs can still share the
packet data, and a later per-egress rewrite can be observed by another
destination.

That said, I see the concern that this may need a clearer semantic boundary
between clone and copy behavior. I will also check the last_dst path pointed out
by Sashiko before deciding whether this should be handled as a bug fix, or
whether it needs a separate explicit copy/isolated redirect semantic instead.

Reply via email to