2018-02-05 14:42 GMT+01:00 Jesper Dangaard Brouer <bro...@redhat.com>:
> On Wed, 31 Jan 2018 14:53:37 +0100 Björn Töpel <bjorn.to...@gmail.com> wrote:
>
>> The bpf_xdpsk_redirect call redirects the XDP context to the XDP
>> socket bound to the receiving queue (if any).
>
> As I explained in person at FOSDEM, my suggestion is to use the
> bpf-map infrastructure for AF_XDP redirect, but people on this
> upstream mailing list also need a chance to validate my idea ;-)
>
> The important thing to keep in mind is how we can still maintain an
> SPSC (Single Producer Single Consumer) relationship between an
> RX-queue and a userspace consumer-process.
>
> This AF_XDP "FOSDEM" patchset stores the "xsk" (xdp_sock) pointer
> directly in the net_device (_rx[].netdev_rx_queue.xs) structure.  This
> limits each RX-queue to servicing a single xdp_sock.  That sounds good
> from an SPSC point of view, but it is not very flexible.  With an
> "xdp_sock_map" we get the flexibility to select among multiple
> xdp_socks (via an XDP pre-filter selecting a different map), and still
> maintain an SPSC relationship.  The RX-queue will just have several
> SPSC relationships with the individual xdp_socks.
>
> This is true for the AF_XDP-copy mode, and it requires fewer driver
> changes.  For the AF_XDP-zero-copy (ZC) mode, drivers need significant
> changes anyhow, and in the ZC case we will have to disallow multiple
> xdp_socks.  That is simply achieved by checking that the xdp_sock
> pointer returned from the map lookup matches the one for which
> userspace asked the driver to register its memory for the RX-rings.
>
> The "xdp_sock_map" is an array, where the index corresponds to the
> queue_index.  bpf_redirect_map() ignores the specified index and
> instead uses xdp_rxq_info->queue_index in the lookup.
>
> Notice that a bpf-map has no pinned relationship with the device or
> the XDP prog loaded.  Thus, userspace needs to bind() this map to the
> device before traffic can flow, like the proposed bind() on the
> xdp_sock.  This is to establish the SPSC binding.  My proposal is that
> userspace inserts the xdp_sock file-descriptor(s) in the map at the
> queue-index, and the map (which is also just a file-descriptor) is
> bound, maybe via bind(), to a specific device (via the ifindex).  The
> kernel side will walk the map and perform the required actions on the
> xdp_socks it finds in the map.
>

As we discussed at FOSDEM, I like the idea of using a map. This also
opens the door to configuring the AF_XDP sockets via bpf code, like
sockmap does.

I'll have a stab at adding an "xdp_sock_map/xskmap" or similar, and
also extending the cgroup sock_ops to support AF_XDP sockets, so that
the xskmap can be configured from bpf-land.
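
A rough userspace sketch of the lookup described in the quoted text,
not kernel code: the map is an array indexed by RX queue, the index
handed in by the bpf program is ignored, and in the ZC case a hit only
counts if it is the socket that registered memory for that queue. All
names ending in _sketch, the XSKMAP_MAX_QUEUES size and both helper
functions are made up for illustration; none of them exist in the
patchset.

```c
#include <stddef.h>
#include <stdint.h>

#define XSKMAP_MAX_QUEUES 64            /* hypothetical map size */

struct xdp_sock_sketch { int id; };     /* stand-in for struct xdp_sock */

/* Stand-in for the proposed map: one slot per RX queue index. */
struct xskmap_sketch {
	struct xdp_sock_sketch *slots[XSKMAP_MAX_QUEUES];
};

/* Stand-in for xdp_rxq_info; only the queue index matters here. */
struct rxq_info_sketch { uint32_t queue_index; };

/*
 * Copy mode: the index supplied by the bpf program is ignored and the
 * receiving queue's index selects the slot instead.
 */
static struct xdp_sock_sketch *
xskmap_lookup(const struct xskmap_sketch *map,
	      const struct rxq_info_sketch *rxq, uint32_t ignored_index)
{
	(void)ignored_index;
	if (rxq->queue_index >= XSKMAP_MAX_QUEUES)
		return NULL;
	return map->slots[rxq->queue_index];
}

/*
 * Zero-copy mode: accept the socket found in the map only if it is the
 * one whose memory the driver registered for this queue's RX ring.
 */
static struct xdp_sock_sketch *
xskmap_lookup_zc(const struct xskmap_sketch *map,
		 const struct rxq_info_sketch *rxq, uint32_t ignored_index,
		 const struct xdp_sock_sketch *zc_owner)
{
	struct xdp_sock_sketch *xs = xskmap_lookup(map, rxq, ignored_index);

	return (xs != NULL && xs == zc_owner) ? xs : NULL;
}
```

With a socket stored in slot 3, a lookup for a packet received on queue
3 returns that socket whatever index the program passed in, and the ZC
variant returns NULL when a different socket owns the queue's memory.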


Björn

> TX-side is harder, as now multiple xdp_socks can have the same
> queue-pair ID with the same net_device. But Magnus proposes that this
> can be solved with hardware: newer NICs have many TX-queues, and the
> queue-pair ID is just an externally visible number, while the kernel
> internal structure can point to a dedicated TX-queue per xdp_sock.
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
