On Mon, Feb 8, 2021 at 6:36 AM Ilya Maximets <i.maxim...@ovn.org> wrote: > > On 2/8/21 1:56 AM, Yi Yang (杨燚)-云服务集团 wrote: > > Yes, GSO is ok, but GRO may be have that issue, I didn't see that issue in > > my openstack environment, so maybe it will be great if we can have a test > > case to trigger that issue. > > > > -----邮件原件----- > > 发件人: William Tu [mailto:u9012...@gmail.com] > > 发送时间: 2021年2月7日 23:46 > > 收件人: Yi Yang (杨燚)-云服务集团 <yangy...@inspur.com> > > 抄送: i.maxim...@ovn.org; yang_y...@163.com; ovs-dev@openvswitch.org; > > f...@sysclose.org > > 主题: Re: [ovs-dev] 答复: [PATCH V3 2/4] Add GSO support for DPDK data path > > > > On Tue, Oct 27, 2020 at 6:02 PM Yi Yang (杨燚)-云服务集团 <yangy...@inspur.com> > > wrote: > >> > >> -----邮件原件----- > >> 发件人: dev [mailto:ovs-dev-boun...@openvswitch.org] 代表 Ilya Maximets > >> 发送时间: 2020年10月27日 21:03 > >> 收件人: yang_y...@163.com; ovs-dev@openvswitch.org > >> 抄送: f...@sysclose.org; i.maxim...@ovn.org > >> 主题: Re: [ovs-dev] [PATCH V3 2/4] Add GSO support for DPDK data path > >> > >> On 8/7/20 12:56 PM, yang_y...@163.com wrote: > >>> From: Yi Yang <yangy...@inspur.com> > >>> > >>> GSO(Generic Segment Offload) can segment large UDP and TCP packet > >>> to small packets per MTU of destination , especially for the case > >>> that physical NIC can't do hardware offload VXLAN TSO and VXLAN UFO, > >>> GSO can make sure userspace TSO can still work but not drop. > >>> > >>> In addition, GSO can help improve UDP performane when UFO is enabled > >>> in VM. > >>> > >>> GSO can support TCP, UDP, VXLAN TCP, VXLAN UDP, it is done in Tx > >>> function of physical NIC. > >>> > >>> Signed-off-by: Yi Yang <yangy...@inspur.com> > >>> --- > >>> lib/dp-packet.h | 21 +++- > >>> lib/netdev-dpdk.c | 358 > >>> +++++++++++++++++++++++++++++++++++++++++++++++++---- > >>> lib/netdev-linux.c | 17 ++- > >>> lib/netdev.c | 67 +++++++--- > >>> 4 files changed, 417 insertions(+), 46 deletions(-) > > > > snip > > > >>> > >>> @@ -2339,24 +2428,19 @@ netdev_dpdk_prep_hwol_batch(struct netdev_dpdk > >>> *dev, struct rte_mbuf **pkts, > >>> return cnt; > >>> } > >>> > >>> -/* Tries to transmit 'pkts' to txq 'qid' of device 'dev'. Takes > >>> ownership of > >>> - * 'pkts', even in case of failure. > >>> - * > >>> - * Returns the number of packets that weren't transmitted. */ > >>> static inline int -netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int > >>> qid, > >>> - struct rte_mbuf **pkts, int cnt) > >>> +__netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid, > >>> + struct rte_mbuf **pkts, int cnt) > >>> { > >>> uint32_t nb_tx = 0; > >>> - uint16_t nb_tx_prep = cnt; > >>> + uint32_t nb_tx_prep; > >>> > >>> - if (userspace_tso_enabled()) { > >>> - nb_tx_prep = rte_eth_tx_prepare(dev->port_id, qid, pkts, cnt); > >>> - if (nb_tx_prep != cnt) { > >>> - VLOG_WARN_RL(&rl, "%s: Output batch contains invalid > >>> packets. " > >>> - "Only %u/%u are valid: %s", dev->up.name, > >>> nb_tx_prep, > >>> - cnt, rte_strerror(rte_errno)); > >>> - } > >>> + nb_tx_prep = rte_eth_tx_prepare(dev->port_id, qid, pkts, cnt); > >>> + if (nb_tx_prep != cnt) { > >>> + VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. " > >>> + "Only %u/%u are valid: %s", > >>> + dev->up.name, nb_tx_prep, > >>> + cnt, rte_strerror(rte_errno)); > >>> } > >>> > >>> while (nb_tx != nb_tx_prep) { > >>> @@ -2384,6 +2468,200 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, > >>> int qid, > >>> return cnt - nb_tx; > >>> } > >>> > >>> +static inline void > >>> +set_multiseg_udptcp_cksum(struct rte_mbuf *mbuf) > >> > >> I didn't review the patch, only had a quick glance, but this part bothers > >> me. OVS doesn't support multi-segment mbufs, so it should not be possible > >> for such mbufs being transmitted by OVS. So, I do not understand why this > >> function needs to work with such mbufs. > >> > >> [Yi Yang] Only DPDK driver/Tx function will use it, not OVS, > >> set_multiseg_udptcp_cksum is called in GSO part, it is last step before Tx > >> function, it is a big external mbuf before rte_gso_segment, that isn't a > >> multi-segmented mbuf. > >> > > > > Hi Ilya, > > > > Now I understand Yi Yang's point better and I agree with him. > > Looks like the patch does the GSO at the DPDK TX function. > > It creates multi-seg mbuf after rte_gso_segment(), but will immediately > > send out the multi-seg mbuf to DPDK port, without traversing inside other > > part of OVS code. I guess this case it should work OK? > > Hi. GSO itself should be possible to implement, but we should > not enable support for multi-segment mbufs in the NIC, since this > might end up in multi-segment packets being received by OVS > causing lots of trouble (for example dp_packet_clone() uses simple > memcpy() that will result in invalid memory reads.)
I see your point, thanks! I guess you're saying that we have to do s.t like: nic->tx_offloads & DEV_TX_OFFLOAD_MULTI_SEGS to enable NIC's multi-segment support for GSO. And once enabling it, OVS might receive multi-segment mbuf from the NIC, and OVS doesn't support it. > > GSO should be implemented in a same way TSO implemented. Right now > TSO implementation has no software fallback to segment large packets > if netdev doesn't support - it it simply drops them. However, > software fallback should be implemented someday and it should be > implemented in a generic way for all the netdev types instead of > doing it only for one (netdev-dpdk). Brief look at the patch tells > me that the generic check from netdev_send_prepare_packet() was > removed. This means that huge packets could reach e.g. netdev-afxdp > that will not behave correctly. > Generic software fallback should produce sequence of 'dp_packet's > instead of multi-segment mbufs, so there should be no need to enable > support for multi-segment mbufs in DPDK PMD. > I agree, thanks! So which one do we need, for dp_packet: 1) GSO 2) Tunneled GSO, ex: vxlan and geneve, or 3) Both? I feel we can assume all HW nic today support TSO, but not tunnel packet's TSO. So we only need to implement (2). William _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev