Re: [ovs-dev] Re: [PATCH V3 3/4] Add VXLAN TCP and UDP GRO support for DPDK data path

2021-02-06 Thread William Tu
On Tue, Oct 27, 2020 at 5:50 PM Yi Yang (杨燚) - Cloud Services Group  wrote:
>
> -----Original Message-----
> From: dev [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ilya Maximets
> Sent: October 27, 2020 21:12
> To: yang_y...@163.com; ovs-dev@openvswitch.org
> Cc: f...@sysclose.org; i.maxim...@ovn.org
> Subject: Re: [ovs-dev] [PATCH V3 3/4] Add VXLAN TCP and UDP GRO support for DPDK data path
>
> On 8/7/20 12:56 PM, yang_y...@163.com wrote:
> > From: Yi Yang 
> >
> > GRO (Generic Receive Offload) can help improve performance when TSO
> > (TCP Segmentation Offload) or VXLAN TSO is enabled on the transmit side.
> > It reduces the overhead of the OVS DPDK data path and of enqueueing to
> > vhost for the VM by merging many small packets into large packets
> > (65535 bytes at most) as soon as they are received from the physical NIC.
>
> IIUC, this patch allows multi-segment mbufs to float across different parts
> of OVS.  This will definitely crash it somewhere.  Many more changes all over
> OVS are required to make it work safely with such mbufs.  There were a few
> attempts to introduce this support, but all of them ended up being rejected.
> As it is, this patch is not acceptable, as it covers almost nothing beside
> the simple cases inside the netdev implementation.
>
> Here is the latest attempt with multi-segment mbufs:
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=130193&state=*

Thanks, that's very helpful. It looks like a huge amount of work to
introduce multi-seg mbufs.
>
> Best regards, Ilya Maximets.
>
> [Yi Yang] We have to support this because we already support TSO for TCP,
> which can't handle big UDP packets; that is why we must introduce GSO. The
> prerequisite for GSO is that multi-segment mbufs must be enabled, because
> GSOed mbufs are multi-segmented, but that is only the last step before DPDK
> Tx, so I don't think it is an issue. In my tests in our OpenStack environment
> I didn't encounter any crash; this just enables the DPDK PMD driver to handle
> GSOed mbufs. For GRO, reassembly also uses chained multi-segment mbufs to
> avoid copies, and long-running tests didn't lead to any crash either. We can
> fix some corner cases if they aren't covered.
>
I just started to understand the problem. Sorry if I missed something.
So currently, what do we do to prevent DPDK from handing OVS a multi-seg mbuf?
Do we check for it and linearize the mbuf?
Can we make GSO/GRO work using linearized mbufs?
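
For reference, DPDK can collapse a chained mbuf into one contiguous segment
when the first segment has enough tailroom. A minimal sketch of such a check
follows; check_and_linearize() is a hypothetical helper, and only
rte_pktmbuf_is_contiguous() and rte_pktmbuf_linearize() are actual DPDK calls.

    #include <rte_mbuf.h>

    /* Hypothetical helper: ensure 'm' is a single-segment mbuf before handing
     * it to code that cannot deal with chained mbufs.  Returns 0 on success,
     * or a negative value if the data does not fit in the first segment. */
    static int
    check_and_linearize(struct rte_mbuf *m)
    {
        if (rte_pktmbuf_is_contiguous(m)) {
            return 0;                 /* Already one segment, nothing to do. */
        }
        /* Copies all following segments into the first one and frees them;
         * fails if the first segment's tailroom is too small. */
        return rte_pktmbuf_linearize(m);
    }

Note that a 64 KB GROed packet only fits in a single segment if the mbuf
pool's data room is that large, so linearizing can simply fail with a
default-sized pool.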

Regards,
William


[ovs-dev] Re: [PATCH V3 3/4] Add VXLAN TCP and UDP GRO support for DPDK data path

2020-10-27 Thread 杨燚 (Yi Yang)
-----Original Message-----
From: dev [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ilya Maximets
Sent: October 27, 2020 21:12
To: yang_y...@163.com; ovs-dev@openvswitch.org
Cc: f...@sysclose.org; i.maxim...@ovn.org
Subject: Re: [ovs-dev] [PATCH V3 3/4] Add VXLAN TCP and UDP GRO support for DPDK data path

On 8/7/20 12:56 PM, yang_y...@163.com wrote:
> From: Yi Yang 
> 
> GRO (Generic Receive Offload) can help improve performance when TSO
> (TCP Segmentation Offload) or VXLAN TSO is enabled on the transmit side.
> It reduces the overhead of the OVS DPDK data path and of enqueueing to
> vhost for the VM by merging many small packets into large packets
> (65535 bytes at most) as soon as they are received from the physical NIC.

IIUC, this patch allows multi-segment mbufs to float across different parts of
OVS.  This will definitely crash it somewhere.  Many more changes all over OVS
are required to make it work safely with such mbufs.  There were a few attempts
to introduce this support, but all of them ended up being rejected.  As it is,
this patch is not acceptable, as it covers almost nothing beside the simple
cases inside the netdev implementation.

Here is the latest attempt with multi-segment mbufs:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=130193&state=*

Best regards, Ilya Maximets.

[Yi Yang] We have to support this because we already support TSO for TCP,
which can't handle big UDP packets; that is why we must introduce GSO. The
prerequisite for GSO is that multi-segment mbufs must be enabled, because
GSOed mbufs are multi-segmented, but that is only the last step before DPDK
Tx, so I don't think it is an issue. In my tests in our OpenStack environment
I didn't encounter any crash; this just enables the DPDK PMD driver to handle
GSOed mbufs. For GRO, reassembly also uses chained multi-segment mbufs to
avoid copies, and long-running tests didn't lead to any crash either. We can
fix some corner cases if they aren't covered.
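
For reference, the reassembly described above maps onto DPDK's lightweight GRO
mode, which merges a received burst in place and hands back chained mbufs
(rte_gso_segment() is the Tx-side counterpart that splits them again just
before transmit). A minimal sketch, assuming a DPDK release whose rte_gro
library supports the VXLAN UDP GRO type; the function and parameter values are
illustrative and not taken from the patch:

    #include <rte_ethdev.h>
    #include <rte_gro.h>

    /* Illustrative only: merge a received burst with lightweight-mode GRO.
     * GROed packets come back as chained (multi-segment) mbufs, which is why
     * the rest of the data path must be able to handle nb_segs > 1. */
    static uint16_t
    rx_and_gro(uint16_t port_id, uint16_t queue_id,
               struct rte_mbuf **pkts, uint16_t max_pkts)
    {
        struct rte_gro_param gro_param = {
            .gro_types = RTE_GRO_TCP_IPV4 | RTE_GRO_UDP_IPV4 |
                         RTE_GRO_IPV4_VXLAN_TCP_IPV4 |
                         RTE_GRO_IPV4_VXLAN_UDP_IPV4,
            .max_flow_num = 1024,
            .max_item_per_flow = 32,
        };
        uint16_t nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, max_pkts);

        if (nb_rx == 0) {
            return 0;
        }
        /* Returns the (smaller) number of packets left after merging. */
        return rte_gro_reassemble_burst(pkts, nb_rx, &gro_param);
    }

Everything downstream of such a call sees mbufs with nb_segs > 1, which is
exactly the multi-segment concern raised above.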


> 
> It can work for both the VXLAN and the VLAN case.
> 
> Signed-off-by: Yi Yang 
> ---
>  lib/dp-packet.h    |  37 +-
>  lib/netdev-dpdk.c  | 227 +-
>  lib/netdev-linux.c | 112 +-
>  3 files changed, 365 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> index c33868d..18307c0 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -580,7 +580,16 @@ dp_packet_set_size(struct dp_packet *b, uint32_t v)
>       * (and thus 'v') will always be <= UINT16_MAX; this means that there is no
>       * loss of accuracy in assigning 'v' to 'data_len'.
>       */
> -    b->mbuf.data_len = (uint16_t)v;  /* Current seg length. */
> +    if (b->mbuf.nb_segs <= 1) {
> +        b->mbuf.data_len = (uint16_t)v;  /* Current seg length. */
> +    } else {
> +        /* For a multi-segment packet, if it is resized, data_len should be
> +         * adjusted by the offset; this will happen in case of a push or pop.
> +         */
> +        if (b->mbuf.pkt_len != 0) {
> +            b->mbuf.data_len += v - b->mbuf.pkt_len;
> +        }
> +    }
>      b->mbuf.pkt_len = v; /* Total length of all segments linked to
>                            * this segment. */
>  }
> @@ -1092,6 +1101,20 @@ dp_packet_hwol_set_l4_len(struct dp_packet *b, int l4_len)
>  {
>      b->mbuf.l4_len = l4_len;
>  }
> +
> +/* Set outer_l2_len for the packet 'b' */
> +static inline void
> +dp_packet_hwol_set_outer_l2_len(struct dp_packet *b, int outer_l2_len)
> +{
> +    b->mbuf.outer_l2_len = outer_l2_len;
> +}
> +
> +/* Set outer_l3_len for the packet 'b' */
> +static inline void
> +dp_packet_hwol_set_outer_l3_len(struct dp_packet *b, int outer_l3_len)
> +{
> +    b->mbuf.outer_l3_len = outer_l3_len;
> +}
>  #else
>  /* Mark packet 'b' for VXLAN TCP segmentation offloading. */
>  static inline void
> @@ -1125,6 +1148,18 @@ dp_packet_hwol_set_l4_len(struct dp_packet *b OVS_UNUSED,
>                            int l4_len OVS_UNUSED)
>  {
>  }
> +
> +/* Set outer_l2_len for the packet 'b' */
> +static inline void
> +dp_packet_hwol_set_outer_l2_len(struct dp_packet *b, int outer_l2_len)
> +{
> +}
> +
> +/* Set outer_l3_len for the packet 'b' */
> +static inline void
> +dp_packet_hwol_set_outer_l3_len(struct dp_packet *b, int outer_l3_len)
> +{
> +}
>  #endif /* DPDK_NETDEV */
>  
>  static inline bool
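
For a sense of what the new outer_l2_len/outer_l3_len setters are for: when a
VXLAN-encapsulated packet is handed to the NIC for checksum/TSO offload, the
outer header lengths are filled in alongside the usual offload flags, and the
UDP_VXLAN_ETH_HDR_SIZE of 30 defined further down appears to be outer UDP (8)
+ VXLAN (8) + inner Ethernet (14). A minimal sketch using raw DPDK mbuf fields
and pre-21.11 flag names rather than the patch's helpers; the sizes assume an
untagged IPv4 outer header and an inner IPv4/TCP packet:

    #include <rte_mbuf.h>

    /* Illustrative only: fill in the header lengths and flags that VXLAN
     * tunnel TSO/checksum offload typically needs.  With a tunnel flag set,
     * l2_len conventionally covers outer UDP + VXLAN + inner Ethernet. */
    static void
    prepare_vxlan_tx_offload(struct rte_mbuf *m)
    {
        m->outer_l2_len = 14;            /* Outer Ethernet header. */
        m->outer_l3_len = 20;            /* Outer IPv4 header. */
        m->l2_len = 8 + 8 + 14;          /* Outer UDP + VXLAN + inner Ethernet. */
        m->l3_len = 20;                  /* Inner IPv4 header. */
        m->l4_len = 20;                  /* Inner TCP header. */
        m->ol_flags |= PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IPV4 |
                       PKT_TX_OUTER_IP_CKSUM | PKT_TX_IPV4 | PKT_TX_IP_CKSUM |
                       PKT_TX_TCP_CKSUM | PKT_TX_TCP_SEG;
        m->tso_segsz = 1450;             /* Example inner MSS. */
    }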
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 888a45e..b6c57a6 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  /* Include rte_compat.h first to allow experimental API's needed for the
>   * rte_meter.h rfc4115 functions. Once they are no longer marked as
> @@ -47,6 +48,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "cmap.h"
>  #include "coverage.h"
> @@ -2184,6 +2186,8 @@ get_udptcp_checksum(void *l3_hdr, void *l4_hdr, uint16_t ethertype)
>  }
>  }
>  
> +#define UDP_VXLAN_ETH_HDR_SIZE 30
> +
>  /* Prepare the packet for HWOL.
>   * Return True