On Mon, Feb 8, 2021 at 4:58 AM Ilya Maximets <i.maxim...@ovn.org> wrote:
>
> On 2/7/21 5:05 PM, Toshiaki Makita wrote:
> > On 2021/02/07 2:00, William Tu wrote:
> >> On Fri, Feb 5, 2021 at 1:08 PM Gregory Rose <gvrose8...@gmail.com> wrote:
> >>> On 2/4/2021 7:08 PM, William Tu wrote:
> >>>> On Thu, Feb 4, 2021 at 3:17 PM Gregory Rose <gvrose8...@gmail.com> wrote:
> >>>>> On 2/3/2021 1:21 PM, William Tu wrote:
> >>>>>> Mellanox cards have a different XSK design. They require users to
> >>>>>> create dedicated queues for XSK. Unlike Intel NICs, which load the
> >>>>>> XDP program on all queues, Mellanox loads the XDP program only on
> >>>>>> a subset of its queues.
> >>>>>>
> >>>>>> When OVS uses AF_XDP with mlx5, it doesn't replace the existing RX
> >>>>>> and TX queues in the channel with XSK RX and XSK TX queues; it
> >>>>>> creates an additional pair of queues for XSK in that channel. To
> >>>>>> distinguish regular and XSK queues, mlx5 uses a different range of
> >>>>>> qids. That means that if the card has 24 queues, queues 0..11
> >>>>>> correspond to regular queues and queues 12..23 are XSK queues.
> >>>>>> In this case, we should attach the netdev-afxdp with 'start-qid=12'.
> >>>>>>
> >>>>>> I tested using a Mellanox ConnectX-6 Dx, by setting 'start-qid=1', and:
> >>>>>> $ ethtool -L enp2s0f0np0 combined 1
> >>>>>> # queue 0 is for non-XDP traffic, queue 1 is for XSK
> >>>>>> $ ethtool -N enp2s0f0np0 flow-type udp4 action 1
> >>>>>> note: we additionally need to add a flow-redirect rule to queue 1
> >>>>>
> >>>>> Seems awfully hardware dependent. Is this just for Mellanox or does
> >>>>> it have general usefulness?
> >>>>>
> >>>> It is just Mellanox's design, which requires pre-configuring the
> >>>> flow director. I only have cards from Intel and Mellanox, so I don't
> >>>> know about other vendors.
> >>>>
> >>>> Thanks,
> >>>> William
> >>>>
> >>> I think we need to abstract the HW layer a little bit. This start-qid
> >>> option is specific to a single piece of HW, at least at this point.
> >>> We should expect that further HW-specific requirements for different
> >>> NIC vendors will come up in the future. I suggest adding a
> >>> hw_options:mellanox:start-qid type hierarchy so that as new HW
> >>> requirements come up we can easily scale. It will also make adding
> >>> new vendors easier in the future.
> >>>
> >>> Even with NIC vendors you can't always count on each new generation
> >>> design to always keep old requirements and methods for feature
> >>> enablement.
> >>>
> >>> What do you think?
> >>>
> >> Thanks for the feedback.
> >> So far I don't know whether other vendors will need this option or not.
> >
> > FWIU, this API ("The lower half of the available amount of RX queues
> > are regular queues, and the upper half are XSK RX queues.") is the
> > result of a long discussion about supporting dedicated/isolated XSK
> > rings, and is not meant to be a Mellanox-specific feature.
> >
> > https://patchwork.ozlabs.org/project/netdev/cover/20190524093431.20887-1-maxi...@mellanox.com/
> > https://patchwork.ozlabs.org/project/netdev/cover/20190612155605.22450-1-maxi...@mellanox.com/
> >
> > Toshiaki Makita
>
> Thanks for the links. Very helpful.
>
> From what I understand, the lower half of the queues should still work,
> i.e. it should still be possible to attach an AF_XDP socket to them,
> but they will not work in zero-copy mode ("generic" only?).
> William, could you check that? Does it work, and which mode does
> "best-effort" end up with? And what kinds of errors does libbpf return
> if we try to enable zero-copy?
Thanks for your feedback. Yes, only zero-copy mode needs to be aware of
this, meaning zero-copy has to use the upper half of the queues (hence
the start-qid option here). Native mode and SKB mode work OK on both
the upper and the lower queues. When attaching a zero-copy XSK to a
lower-half queue, libbpf returns EINVAL at xsk_socket__create().
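For reference, here is a minimal sketch of the call in question, with
the kind of copy-mode fallback a "best-effort" probe could make. This
is not the actual netdev-afxdp code: the helper, interface name, and
queue ids are illustrative, umem setup is omitted, and error handling
is trimmed.

/* Sketch only: try a zero-copy AF_XDP socket on one queue, falling
 * back to copy mode if the kernel rejects zero-copy with EINVAL. */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <linux/if_link.h>   /* XDP_FLAGS_DRV_MODE */
#include <linux/if_xdp.h>    /* XDP_ZEROCOPY, XDP_COPY */
#include <bpf/xsk.h>

static struct xsk_socket *
open_xsk(const char *ifname, uint32_t queue_id, struct xsk_umem *umem,
         struct xsk_ring_cons *rx, struct xsk_ring_prod *tx)
{
    struct xsk_socket_config cfg = {
        .rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
        .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
        .xdp_flags = XDP_FLAGS_DRV_MODE, /* native XDP */
        .bind_flags = XDP_ZEROCOPY,      /* ask for zero-copy first */
    };
    struct xsk_socket *xsk;
    int err;

    /* On mlx5 this succeeds only for queue ids in the XSK range,
     * e.g. 12..23 on a 24-queue card attached with start-qid=12;
     * binding zero-copy to a lower-half (regular) queue fails with
     * EINVAL as described above. */
    err = xsk_socket__create(&xsk, ifname, queue_id, umem, rx, tx, &cfg);
    if (err == -EINVAL) {
        cfg.bind_flags = XDP_COPY;       /* fall back to copy mode */
        err = xsk_socket__create(&xsk, ifname, queue_id, umem, rx, tx,
                                 &cfg);
    }
    if (err) {
        fprintf(stderr, "xsk_socket__create(%s qid %u): %s\n",
                ifname, queue_id, strerror(-err));
        return NULL;
    }
    return xsk;
}

William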