On Fri, 8 Jul 2016 09:45:25 -0700, John Fastabend wrote:
> The only distinction between VFs and queue groupings on my side is VFs
> provide RSS where as queue groupings have to be selected explicitly.
> In a programmable NIC world the distinction might be lost if a "RSS"
> program can be loaded into the NIC to select queues but for existing
> hardware the distinction is there.

To do BPF RSS we need a way to select the queue, which I think is all
Jasper wanted. So we will have to tackle queue selection at some
point. The main obstacle for me is defining what queue selection means
when the program is not offloaded to HW... Implementing queue selection
on the HW side is trivial.

> If you demux using a eBPF program or via a filter model like
> flow_director or cls_{u32|flower} I think we can support both. And this
> just depends on the programmability of the hardware. Note flow_director
> and cls_{u32|flower} steering to VFs is already in place.

Yes, for steering to VFs we could potentially reuse a lot of existing
infrastructure.

> The question I have is should the "filter" part of the eBPF program
> be a separate program from the XDP program and loaded using specific
> semantics (e.g. "load_hardware_demux" ndo op) at the risk of building
> a ever growing set of "ndo" ops. If you are running multiple XDP
> programs on the same NIC hardware then I think this actually makes
> sense otherwise how would the hardware and even software find the
> "demux" logic. In this model there is a "demux" program that selects
> a queue/VF and a program that runs on the netdev queues.

I don't think we should enforce the separation here. What we may want
to do before forwarding to the VF can be much more complicated than
pure demux/filtering (a simple example: popping a VLAN or tunnel
header). The VF representative model works well here as a fallback: if
the program cannot be offloaded, it runs on the host and "trombones"
packets via the VFR into the VF.

If we have a chain of BPF programs, we can order them by increasing
complexity/features required, and then HW could transparently offload
the first, easier parts, leaving the more complex processing on the
host.

This should probably be paired with some sort of "skip-sw" flag to let
user space enforce HW offload of the fast path part.

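To make the chain idea a bit more concrete, here is a rough sketch of
how the split could be computed. Everything in it is hypothetical and
made up for illustration (the FEAT_* bits, struct stage, split_chain());
none of it is an existing kernel or driver interface. It assumes each
program in the chain declares the features it needs, the NIC advertises
a capability mask, and only a leading run of programs can go to HW
because packets hit the NIC first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature bits a program in the chain may need from the NIC. */
enum {
    FEAT_MATCH    = 1u << 0, /* header match / demux */
    FEAT_POP_VLAN = 1u << 1, /* strip a VLAN tag */
    FEAT_DECAP    = 1u << 2, /* strip a tunnel header */
    FEAT_MAPS     = 1u << 3, /* stateful map access */
};

/* One program in the chain (hypothetical representation). */
struct stage {
    const char *name;
    unsigned int needs; /* FEAT_* bits this program requires */
    bool skip_sw;       /* user demands HW offload, no host fallback */
};

/*
 * Decide how many leading stages go to HW. The chain is assumed to be
 * ordered by increasing complexity, so offload stops at the first stage
 * the NIC cannot handle and the remainder runs on the host.
 *
 * Returns the number of offloaded stages, or -1 if a stage marked
 * skip_sw would end up on the host.
 */
static int split_chain(const struct stage *chain, size_t n,
                       unsigned int hw_caps)
{
    size_t hw = 0;

    assert(chain != NULL || n == 0);

    /* Longest prefix whose requirements are all covered by hw_caps. */
    while (hw < n && (chain[hw].needs & ~hw_caps) == 0)
        hw++;

    /* Everything past the prefix falls back to the host. */
    for (size_t i = hw; i < n; i++)
        if (chain[i].skip_sw)
            return -1;

    return (int)hw;
}
```

With a chain of { demux (skip_sw), pop-vlan, stateful }, a NIC that can
only match and pop VLANs would take the first two programs and leave the
stateful one on the host, while a NIC that cannot even match would make
the load fail instead of silently running the fast path in software.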
_______________________________________________
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev