Hi Zhang Qi, I'm not familiar with the DPDK code, but I'm curious about the benefits of using the AF_XDP PMD. Specifically, I have a couple of questions:
1) With zero-copy driver support, is the AF_XDP PMD expected to have
performance similar to other PMDs? Since AF_XDP still uses the native
device driver, isn't the interrupt still there, so it is no longer
"poll-mode"?
2) Does the patch expect users to customize the ebpf/xdp code, so that
this becomes another way to extend the DPDK datapath?

Thank you
William

On Thu, Aug 16, 2018 at 7:42 AM Qi Zhang <qi.z.zh...@intel.com> wrote:
>
> Overview
> ========
>
> The patch set adds a new PMD driver for AF_XDP, which is a proposed
> faster version of the AF_PACKET interface in Linux. See the links
> below for an introduction to AF_XDP:
> https://lwn.net/Articles/750845/
> https://fosdem.org/2018/schedule/event/af_xdp/
>
> AF_XDP roadmap
> ==============
> - Kernel 4.18 is out and af_xdp is included.
>   https://kernelnewbies.org/Linux_4.18
> - So far no driver with zero-copy support has been merged, but some
>   are on the way.
>
> Change logs
> ===========
>
> v3:
> - Re-work based on AF_XDP's interface changes.
> - Support multi-queue: each dpdk queue has its own xdp socket.
>   An xdp socket is always bound to a netdev queue.
>   We assume all xdp sockets from the same ethdev are bound to the
>   same netdev queue, though a netdev queue can still be bound by
>   xdp sockets from different ethdev instances.
>   Below is an example of the mapping.
>   ------------------------------------------------------
>   | dpdk q0 | dpdk q1 | dpdk q0 | dpdk q0 | dpdk q1 |
>   ------------------------------------------------------
>   | xsk A   | xsk B   | xsk C   | xsk D   | xsk E   |<---|
>   ------------------------------------------------------   |
>   |     ETHDEV 0      |  ETHDEV 1   |    ETHDEV 2   |      |  DPDK
>   ------------------------------------------------------------------
>   |    netdev queue 0           |   netdev queue 1  |      |  KERNEL
>   ------------------------------------------------------   |
>   |                NETDEV eth0                      |      |
>   ------------------------------------------------------   |
>   |            key      xsk                         |      |
>   |  ----------      --------------                 |      |
>   |  |        |      | 0 | xsk A |                  |      |
>   |  |        |      --------------                 |      |
>   |  |        |      | 2 | xsk B |                  |      |
>   |  |  ebpf  |      ---------------------------------------
>   |  |        |      | 3 | xsk C |                  |
>   |  | redirect ->|  --------------                 |
>   |  |        |      | 4 | xsk D |                  |
>   |  |        |      --------------                 |
>   |  |--------|      | 5 | xsk E |                  |
>   |                  --------------                 |
>   |-----------------------------------------------------
>
> - It is an open question how to load ebpf into the kernel and link it
>   to a specific netdev in DPDK: should it be part of the PMD, or should
>   it be handled by an independent tool? This patchset takes the second
>   option: there is a "bind" stage before we start the AF_XDP PMD, which
>   includes the steps below:
>   a) load the ebpf program into the kernel (the ebpf program must
>      contain the logic to redirect packets to an xdp socket based on a
>      redirect map).
>   b) link the ebpf program to a specific network interface.
>   c) expose the xdp socket redirect map id and number of entries to the
>      user, so this can be passed to the PMD, and the PMD will create an
>      xdp socket for each queue and update the redirect map correctly.
>      (example:
>      --vdev eth_af_xdp,iface=eth0,xsk_map_id=53,xsk_map_key_base=0,xsk_map_key_count=4)
>
> v2:
> - fix license header
> - clean up bpf dependency; the bpf program is embedded, no
>   "xdpsock_kern.o" required
> - clean up the make file, only linux_header is required
> - fix all the compile warnings.
> - fix the packet number returned in Tx.
>
> How to try
> ==========
>
> 1. Use kernel v4.18.
>    Make sure you turn on XDP sockets when compiling:
>      Networking support -->
>        Networking options -->
>          [ * ] XDP sockets
> 2. In the kernel source tree, apply the patch below and compile the
>    bpf sample code:
>      #make samples/bpf/
>    so the sample xdpsock can be used as a bind/unbind tool for the
>    AF_XDP PMD. Sorry for this ugliness; in the future there could be a
>    dedicated tool in DPDK, if we agree with the idea that bpf
>    configuration in the kernel should be separated from the PMD.
>
> ~~~~~~~~~~~~~~~~~~~~~~~PATCH START~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> index d69c8d78d3fd..44a6318043e7 100644
> --- a/samples/bpf/xdpsock_user.c
> +++ b/samples/bpf/xdpsock_user.c
> @@ -76,6 +76,8 @@ static int opt_poll;
>  static int opt_shared_packet_buffer;
>  static int opt_interval = 1;
>  static u32 opt_xdp_bind_flags;
> +static int opt_bind;
> +static int opt_unbind;
>
>  struct xdp_umem_uqueue {
>  	u32 cached_prod;
> @@ -662,6 +664,8 @@ static void usage(const char *prog)
>  		"  -S, --xdp-skb=n	Use XDP skb-mod\n"
>  		"  -N, --xdp-native=n	Enfore XDP native mode\n"
>  		"  -n, --interval=n	Specify statistics update interval (default 1 sec).\n"
> +		"  -b, --bind		Bind only.\n"
> +		"  -u, --unbind		Unbind only.\n"
>  		"\n";
>  	fprintf(stderr, str, prog);
>  	exit(EXIT_FAILURE);
> @@ -674,7 +678,7 @@ static void parse_command_line(int argc, char **argv)
>  	opterr = 0;
>
>  	for (;;) {
> -		c = getopt_long(argc, argv, "rtli:q:psSNn:", long_options,
> +		c = getopt_long(argc, argv, "rtli:q:psSNn:bu", long_options,
>  				&option_index);
>  		if (c == -1)
>  			break;
> @@ -711,6 +715,12 @@ static void parse_command_line(int argc, char **argv)
>  		case 'n':
>  			opt_interval = atoi(optarg);
>  			break;
> +		case 'b':
> +			opt_bind = 1;
> +			break;
> +		case 'u':
> +			opt_unbind = 1;
> +			break;
>  		default:
>  			usage(basename(argv[0]));
>  		}
> @@ -898,6 +908,12 @@ int main(int argc, char **argv)
>  		exit(EXIT_FAILURE);
>
>  	}
>
> +	if (opt_unbind) {
> +		bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags);
>
> ~~~~~~~~~~~~~~~~~~~~~~~PATCH END~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 3. Bind:
>      #./samples/bpf/xdpsock -i eth0 -b
>
>    In this step, an ebpf binary xdpsock_kern.o is loaded into the
>    kernel and linked to eth0. The ebpf source code is
>    samples/bpf/xdpsock_kern.c; you can modify it and re-compile for a
>    different test.
>
> 4. Dump the xdp socket map information:
>      #./tools/bpf/bpftool/bpftool map -p
>
>    You will see something like below:
>
>    },{
>        "id": 56,
>        "type": "xskmap",
>        "name": "xsks_map",
>        "flags": 0,
>        "bytes_key": 4,
>        "bytes_value": 4,
>        "max_entries": 4,
>        "bytes_memlock": 4096
>    }
>
>    In this case 56 is the map id and it has 4 entries.
>
> 5. Start testpmd:
>
>    ./build/app/testpmd -c 0xc -n 4 --vdev \
>    eth_af_xdp,iface=enp59s0f0,xsk_map_id=56,xsk_map_key_start=2,xsk_map_key_count=2 \
>    -- -i --rxq=2 --txq=2
>
>    In this case, we reserved 2 entries (2,3) in the map, and they will
>    be mapped to queue 0 and queue 1.
>
> 6. Unbind after the test:
>      ./samples/bpf/xdpsock -i eth0 -u
>
> Performance
> ===========
> No zero-copy driver is ready yet.
> So far this has only been tested in DRV and SKB mode on i40e 25G;
> the results are identical to the kernel sample "xdpsock".
>
> Qi Zhang (6):
>   net/af_xdp: new PMD driver
>   lib/mbuf: enable parse flags when create mempool
>   lib/mempool: allow page size aligned mempool
>   net/af_xdp: use mbuf mempool for buffer management
>   net/af_xdp: enable zero copy
>   app/testpmd: add mempool flags parameter
>
>  app/test-pmd/parameters.c                     |   12 +
>  app/test-pmd/testpmd.c                        |   15 +-
>  app/test-pmd/testpmd.h                        |    1 +
>  config/common_base                            |    5 +
>  config/common_linuxapp                        |    1 +
>  drivers/net/Makefile                          |    1 +
>  drivers/net/af_xdp/Makefile                   |   30 +
>  drivers/net/af_xdp/meson.build                |    7 +
>  drivers/net/af_xdp/rte_eth_af_xdp.c           | 1345 +++++++++++++++++++++++++
>  drivers/net/af_xdp/rte_pmd_af_xdp_version.map |    4 +
>  lib/librte_mbuf/rte_mbuf.c                    |   15 +-
>  lib/librte_mbuf/rte_mbuf.h                    |    8 +-
>  lib/librte_mempool/rte_mempool.c              |    3 +
>  lib/librte_mempool/rte_mempool.h              |    1 +
>  mk/rte.app.mk                                 |    1 +
>  15 files changed, 1439 insertions(+), 10 deletions(-)
>  create mode 100644 drivers/net/af_xdp/Makefile
>  create mode 100644 drivers/net/af_xdp/meson.build
>  create mode 100644 drivers/net/af_xdp/rte_eth_af_xdp.c
>  create mode 100644 drivers/net/af_xdp/rte_pmd_af_xdp_version.map
>
> --
> 2.13.6
>