Thanks for the feedback.

On Tue, Dec 18, 2018 at 6:14 AM Björn Töpel <bjorn.to...@gmail.com> wrote:
>
> Den mån 17 dec. 2018 kl 20:40 skrev William Tu <u9012...@gmail.com>:
> >
> > The patch series adds AF_XDP async xmit support for veth device.
> > First patch add a new API for supporting non-physical NIC device to get
> > packet's virtual address. The second patch implements the async xmit,
> > and last patch adds example use cases.
> >
>
> The first virtual device with AF_XDP support! Yay!
>
> This is only the zero-copy on the Tx side -- it's still allocations
> plus copy on the ingress side? That's a bit different from the
> i40e/ixgbe implementation, where zero-copy means both Tx and Rx.

Right, it's a little different from i40e/ixgbe, which are physical NICs.
For veth, xmit is just placing the packet into the peer device's rx
queue. Here, the veth AF_XDP implementation does an extra copy from the
umem to build the packet, and then triggers the receive code on the
peer device.
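To make that concrete, the Tx path conceptually looks something like the
sketch below. This is not the patch code, just the idea: I'm assuming
that the new xsk_umem_consume_tx_virtual() from patch 1 hands back the
frame's virtual address and length, analogous to xsk_umem_consume_tx()
but without the DMA mapping. Error handling, NAPI scheduling and locking
are all left out.

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <net/xdp_sock.h>

/* Sketch only: copy each Tx descriptor out of the umem into an skb
 * and hand it to the peer device's receive path.
 */
static void veth_xsk_xmit_sketch(struct net_device *peer,
                                 struct xdp_umem *umem)
{
        void *vaddr;
        u32 len;

        /* Assumed signature, analogous to xsk_umem_consume_tx(). */
        while (xsk_umem_consume_tx_virtual(umem, &vaddr, &len)) {
                struct sk_buff *skb;

                /* The extra copy: build an skb from the umem frame. */
                skb = netdev_alloc_skb(peer, len);
                if (!skb)
                        break;
                skb_put_data(skb, vaddr, len);

                /* Trigger the receive code on the peer device. */
                dev_forward_skb(peer, skb);
        }

        /* Tell the socket the Tx descriptors have been consumed. */
        xsk_umem_consume_tx_done(umem);
}

The netdev_alloc_skb() plus skb_put_data() pair is exactly the extra
copy mentioned above.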
> For veth I don't see that we need to support Rx right away, especially for
> Tx only sockets. Still, when the netdev has accepted the umem via
> ndo_bpf, the zero-copy for both Tx and Rx is assumed. We might want to
> change the ndo_bpf at some point to support zero-copy for Tx, Rx, Tx
> *and* Rx.
>
> Are you planning to add zero-copy to the ingress side, i.e. pulling
> frames from the fill ring, instead of allocating via dev_alloc_page?
> (The term *zero-copy* for veth is a bit weird, since we're still doing
> copies, but eliding the page allocation. :-))

Yes, I'm trying to remove the dev_alloc_page() and pull buffers from the
fill ring instead, but I haven't been successful yet (a rough sketch of
the direction is at the bottom of this mail). Do you think we should go
directly to the zero-copy version for the next patch?

> It would be interesting to hear a bit about what use-case veth/AF_XDP
> has, if you can share that.
>

Yes, we've been working on OVS + AF_XDP netdev support. See the OVS
conference talk "Fast Userspace OVS with AF_XDP":
http://www.openvswitch.org/support/ovscon2018/

From OVS's perspective, AF_XDP is just a netdev doing packet I/O, that
is, a faster way to send and receive packets. With i40e/ixgbe AF_XDP
support, OVS can forward packets at a very high packet rate. However,
users also attach virtual ports to the OVS bridge, for example a tap
device connected to a VM, or a veth peer device connected to a
container. So packets flow from:

  Physical NIC (with AF_XDP) --> OVS --> virtual port (no AF_XDP) --> VM/container

Since there is no AF_XDP support for virtual devices yet, the
performance drops significantly. That's the motivation for this patch
series: adding virtual device support for AF_XDP. Ultimately, I hope
that with AF_XDP support a packet can arrive at a physical NIC, be
DMA'd directly into the umem, be processed by OVS or other software, be
zero-copied to the tap/veth peer device, and then be received by the
VM/container application.

Thanks.
William

> Cheers,
> Björn
>
> > I tested with 2 namespaces, one as sender, the other as receiver.
> > The packet rate is measure at the receiver side.
> > ip netns add at_ns0
> > ip link add p0 type veth peer name p1
> > ip link set p0 netns at_ns0
> > ip link set dev p1 up
> > ip netns exec at_ns0 ip link set dev p0 up
> >
> > # receiver
> > ip netns exec at_ns0 xdp_rxq_info --dev p0 --action XDP_DROP
> >
> > # sender with AF_XDP
> > xdpsock -i p1 -t -N -z
> >
> > # or sender without AF_XDP
> > xdpsock -i p1 -t -S
> >
> > Without AF_XDP: 724 Kpps
> > RXQ stats       RXQ:CPU    pps        issue-pps
> > rx_queue_index    0:1      724339     0
> > rx_queue_index    0:sum    724339
> >
> > With AF_XDP: 1.1 Mpps (with ksoftirqd 100% cpu)
> > RXQ stats       RXQ:CPU    pps        issue-pps
> > rx_queue_index    0:3      1188181    0
> > rx_queue_index    0:sum    1188181
> >
> > William Tu (3):
> >   xsk: add xsk_umem_consume_tx_virtual.
> >   veth: support AF_XDP.
> >   samples: bpf: add veth AF_XDP example.
> >
> >  drivers/net/veth.c             | 247 ++++++++++++++++++++++++++++++++++++++++-
> >  include/net/xdp_sock.h         |   7 ++
> >  net/xdp/xsk.c                  |  24 ++++
> >  samples/bpf/test_veth_afxdp.sh |  67 +++++++++++
> >  4 files changed, 343 insertions(+), 2 deletions(-)
> >  create mode 100755 samples/bpf/test_veth_afxdp.sh
> >
> > --
> > 2.7.4
> >
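P.S. Below is the rough sketch I mentioned above for the fill-ring idea.
It only shows how a receive buffer could be taken from the socket's fill
ring instead of calling dev_alloc_page(); headroom handling, producing
the descriptor on the Rx ring, and where exactly this hooks into veth
are the parts I'm still experimenting with, so please read it as a
direction rather than working code. The helpers are the existing ones
from include/net/xdp_sock.h; the function itself is made up.

#include <linux/types.h>
#include <net/xdp_sock.h>

/* Sketch: grab the next umem frame that user space posted on the fill
 * ring and return its virtual address, so the incoming packet can be
 * copied straight into the umem instead of into a freshly allocated page.
 */
static void *veth_get_umem_rx_buffer_sketch(struct xdp_umem *umem)
{
        u64 addr;

        /* Peek the next address from the fill ring; NULL means the
         * ring is empty and we would have to drop the packet.
         */
        if (!xsk_umem_peek_addr(umem, &addr))
                return NULL;

        /* Consume the entry we just peeked. */
        xsk_umem_discard_addr(umem);

        /* Virtual address of that umem frame. */
        return xdp_umem_get_data(umem, addr);
}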