On Wed, Apr 19, 2017 at 09:38:44PM -0700, John Fastabend wrote:
> On 17-04-19 07:56 PM, Alexei Starovoitov wrote:
> > On Thu, Apr 20, 2017 at 12:51:31AM +0200, Daniel Borkmann wrote:
> >>
> >> Is there a concrete reason that all the proposed future cases like sockets
> >> have to be handled within the very same XDP_REDIRECT return code? F.e. why
> >> not XDP_TX_NIC that only assumes ifindex as proposed in the patch, and
> >> future ones would get a different return code, f.e. XDP_TX_SK, only
> >> handling sockets when we get there implementation-wise?
> > 
> > Yeah, let's keep redirect to sockets, tunnels, crypto and other exotic
> > things out of this discussion. XDP_REDIRECT should assume a raw L2 packet
> > is being redirected to another L2 netdev. If we make it too generic it
> > will lose performance.
> > 
> > For cls_bpf the ifindex concept is symmetric. The program can access it as
> > skb->ifindex on receive and can redirect to another ifindex via the
> > bpf_redirect() helper. Since the netdev is not locked, that's actually a
> > big plus: the container management control plane can simply delete the
> > netns+veth and it goes away. The program can be left with a dangling
> > ifindex (if the control plane is buggy and didn't update the BPF side),
> > but it's harmless. Packets redirected to a non-existent ifindex are
> > dropped. This approach is already understood and works well, so for XDP I
> > suggest using the same approach initially rather than reinventing the
> > wheel. struct xdp_md needs an ifindex field and an xdp_redirect() helper
> > that redirects to an L2 netdev only. That's it. Simple and easy.
> > 
> > I think the main use case in John's and Jesper's minds is something like
> > an xdpswitch, where packets come from VMs and from physical eths and get
> > redirected to either a physical eth or to a VM via the upcoming vhost+xdp
> > support. This xdp_md->ifindex + xdp_redirect(to_ifindex) will solve it
> > just fine.
> 
> Hmm, I must be missing something; the bpf_redirect() helper should be used
> as a return statement, e.g.
> 
>       return bpf_redirect(ifindex, flags);
> 
> It's a bit awkward to use in any other way. You would have to ensure the
> program returns TC_ACT_REDIRECT in all cases, I presume. It seems
> incredibly fragile and gains nothing as far as I can tell. bpf_redirect()
> uses per-cpu data and expects the core to call skb_do_redirect() to push
> the packet into the correct netdev. Looking at the code now, bpf_redirect()
> does not modify the skb->ifindex value.
> 
> Are you suggesting using xdp_md to store the ifindex value instead of a
> per-cpu variable in the redirect helper?

No, I'm suggesting storing it in a per-cpu scratch area, just like cls_bpf does.
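
For reference, the cls_bpf side in net/core/filter.c looks roughly like
this (a simplified sketch from memory, not verbatim kernel source):

    /* per-cpu scratch area: the helper only records the target;
     * the core acts on it after the program returns
     */
    struct redirect_info {
            u32 ifindex;
            u32 flags;
    };
    static DEFINE_PER_CPU(struct redirect_info, redirect_info);

    BPF_CALL_2(bpf_redirect, u32, ifindex, u64, flags)
    {
            struct redirect_info *ri = this_cpu_ptr(&redirect_info);

            if (unlikely(flags & ~(BPF_F_INGRESS)))
                    return TC_ACT_SHOT;

            ri->ifindex = ifindex;
            ri->flags = flags;
            return TC_ACT_REDIRECT;
    }

When the program returns TC_ACT_REDIRECT, the core calls skb_do_redirect(),
which looks up ri->ifindex and transmits the skb there (or drops it if the
netdev is gone). xdp_redirect() can mirror that scheme.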

> Do you really mean the xdp_md struct in
> the UAPI headers? 

Yes, since 'ifindex' needs to be part of the xdp_md struct in a read-only
way, just like cls_bpf does it. Otherwise, if we attach the same program to
multiple taps, it won't know which tap the traffic arrived on and won't be
able to redirect properly.
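
Concretely, that's just one more read-only field in the UAPI struct
(a sketch of the proposed extension; today's struct only has data/data_end):

    /* include/uapi/linux/bpf.h */
    struct xdp_md {
            __u32 data;
            __u32 data_end;
            __u32 ifindex;  /* proposed: netdev the packet arrived on */
    };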

> I don't see why it needs to be in the UAPI at all. If we don't like per-cpu
> variables it could be pushed as part of xdp_buff, I guess.

It's not about liking or disliking the per-cpu scratch area.
My main point: it works just fine for cls_bpf, and I'm suggesting we do the
same for xdp_redirect, since no one has ever complained about that bit of
cls_bpf.
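
E.g. a cls_bpf program can already do both halves of this today, with no
extra plumbing (a sketch; the ifindex values are made up for illustration):

    #include <linux/bpf.h>
    #include <linux/pkt_cls.h>
    #include "bpf_helpers.h"        /* SEC(), bpf_redirect() */

    SEC("classifier")
    int tc_redir(struct __sk_buff *skb)
    {
            /* skb->ifindex says which netdev we arrived on */
            if (skb->ifindex == 2)          /* made up: eth0 */
                    return bpf_redirect(3, 0);  /* made up: veth0 */
            return TC_ACT_OK;
    }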

> My suggestion is we could add an ifindex to the xdp_md structure and have
> the receiving NIC driver populate the value, assuming it is useful to
> programs. But if we use cls_bpf as a model, then the xdp_md ifindex is more
> or less independent of the redirect helper. 

Exactly. The arriving ifindex is independent of the xdp_redirect helper.
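
So the xdpswitch-style program attached to several taps would look along
these lines (again only a sketch: xdp_redirect() and ctx->ifindex are the
proposed additions, not merged code, and the ifindex values are made up):

    SEC("xdp")
    int xdp_switch(struct xdp_md *ctx)
    {
            /* arriving ifindex picks the redirect target */
            if (ctx->ifindex == 2)                  /* made up: tap0 */
                    return xdp_redirect(3, 0);      /* forward to tap1 */
            if (ctx->ifindex == 3)
                    return xdp_redirect(2, 0);      /* and back */
            return XDP_PASS;
    }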

> In my opinion we should avoid diverging cls_bpf and XDP BPF
> in subtle ways like the handling of ifindex and redirect.

Exactly. I'm saying the same thing.
I'm not sure which part of my proposal was so confusing.
Sorry about that.
