On Thu, Feb 06, 2020 at 07:34:35AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, Flavio
>
> I tried current ovs master plus your two patches (this one and the
> one for tap). Tap-to-tap and veth-to-veth performance is very bad
> when TSO is on, so I think the merged TSO patch series is wrong for
> such use cases. BTW, the performance data is normal if I turn TSO
> off ("sudo ip netns exec nsXXX ethtool -K vethX tso off"). For tap
> (I used your tap patch for this), the same is true: it is ok when I
> turn TSO off.
Can you try out the code from here instead:
https://github.com/fleitner/ovs/tree/tso-tap-enable-tx-v1

> here is the performance data:
>
> Connecting to host 10.15.1.3, port 5201
> [  4] local 10.15.1.2 port 36024 connected to 10.15.1.3 port 5201
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-10.00  sec  7.59 MBytes  6.37 Mbits/sec  2410   7.07 KBytes
> [  4]  10.00-20.00  sec  7.76 MBytes  6.51 Mbits/sec  2496   5.66 KBytes
> [  4]  20.00-30.00  sec  7.99 MBytes  6.70 Mbits/sec  2536   5.66 KBytes
> [  4]  30.00-40.00  sec  7.85 MBytes  6.58 Mbits/sec  2506   5.66 KBytes
> [  4]  40.00-50.00  sec  7.93 MBytes  6.65 Mbits/sec  2556   5.66 KBytes
> [  4]  50.00-60.00  sec  8.25 MBytes  6.92 Mbits/sec  2696   7.07 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-60.00  sec  47.4 MBytes  6.62 Mbits/sec  15200        sender
> [  4]   0.00-60.00  sec  47.3 MBytes  6.61 Mbits/sec               receiver
>
> Server output:
> Accepted connection from 10.15.1.2, port 36022
> [  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 36024
> [ ID] Interval           Transfer     Bandwidth
> [  5]   0.00-10.00  sec  7.44 MBytes  6.24 Mbits/sec
> [  5]  10.00-20.00  sec  7.80 MBytes  6.54 Mbits/sec
> [  5]  20.00-30.00  sec  7.95 MBytes  6.67 Mbits/sec
> [  5]  30.00-40.00  sec  7.83 MBytes  6.57 Mbits/sec
> [  5]  40.00-50.00  sec  8.00 MBytes  6.71 Mbits/sec
> [  5]  50.00-60.00  sec  8.23 MBytes  6.91 Mbits/sec

Yeah, that looks like the performance of TCP retransmits only.

Here is your script running with the code I provided above:

[root@wsfd-netdev93 yiyang]# ./yiyang-test.sh
PING 10.15.1.3 (10.15.1.3) 56(84) bytes of data.
64 bytes from 10.15.1.3: icmp_seq=1 ttl=64 time=0.425 ms
64 bytes from 10.15.1.3: icmp_seq=2 ttl=64 time=0.253 ms
64 bytes from 10.15.1.3: icmp_seq=3 ttl=64 time=0.356 ms

--- 10.15.1.3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 44ms
rtt min/avg/max/mdev = 0.253/0.344/0.425/0.073 ms
PING 10.15.1.2 (10.15.1.2) 56(84) bytes of data.
64 bytes from 10.15.1.2: icmp_seq=1 ttl=64 time=0.302 ms
64 bytes from 10.15.1.2: icmp_seq=2 ttl=64 time=0.475 ms
64 bytes from 10.15.1.2: icmp_seq=3 ttl=64 time=0.383 ms

--- 10.15.1.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 43ms
rtt min/avg/max/mdev = 0.302/0.386/0.475/0.074 ms
Connecting to host 10.15.1.3, port 5201
[  5] local 10.15.1.2 port 59660 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  7.31 GBytes  6.28 Gbits/sec  49510    187 KBytes
[  5]  10.00-20.00  sec  7.37 GBytes  6.33 Gbits/sec  51415    215 KBytes
[  5]  20.00-30.00  sec  7.30 GBytes  6.27 Gbits/sec  46197    182 KBytes
[  5]  30.00-40.00  sec  7.30 GBytes  6.27 Gbits/sec  46344    202 KBytes
[  5]  40.00-50.00  sec  7.16 GBytes  6.15 Gbits/sec  49123    287 KBytes
[  5]  50.00-60.00  sec  7.33 GBytes  6.30 Gbits/sec  48734    214 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.00  sec  43.8 GBytes  6.27 Gbits/sec  291323        sender
[  5]   0.00-60.00  sec  43.8 GBytes  6.27 Gbits/sec               receiver

Server output:
Accepted connection from 10.15.1.2, port 59658
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 59660
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  7.30 GBytes  6.27 Gbits/sec
[  5]  10.00-20.00  sec  7.37 GBytes  6.33 Gbits/sec
[  5]  20.00-30.00  sec  7.30 GBytes  6.27 Gbits/sec
[  5]  30.00-40.00  sec  7.30 GBytes  6.27 Gbits/sec
[  5]  40.00-50.00  sec  7.16 GBytes  6.15 Gbits/sec
[  5]  50.00-60.00  sec  7.33 GBytes  6.30 Gbits/sec
[  5]  60.00-60.00  sec   127 KBytes  4.09 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-60.00  sec  43.8 GBytes  6.27 Gbits/sec               receiver

iperf Done.

HTH,
fbl

>
> iperf Done.
>
> You can use the script below to check performance (it needs to run as
> root, e.g. sudo ./run-iperf3.sh):
>
> $ cat run-iperf3.sh
> #!/bin/bash
>
> ovs-vsctl add-br br-int -- set bridge br-int datapath_type=netdev protocols=OpenFlow10,OpenFlow12,OpenFlow13 stp_enable=false
> ip link add veth1 type veth peer name vethbr1
> ip link add veth2 type veth peer name vethbr2
> ip netns add ns01
> ip netns add ns02
>
> ip link set veth1 netns ns01
> ip link set veth2 netns ns02
>
> ip netns exec ns01 ifconfig lo 127.0.0.1 up
> ip netns exec ns01 ifconfig veth1 10.15.1.2/24 up
>
> ip netns exec ns02 ifconfig lo 127.0.0.1 up
> ip netns exec ns02 ifconfig veth2 10.15.1.3/24 up
>
> ifconfig vethbr1 0 up
> ifconfig vethbr2 0 up
>
> ovs-vsctl add-port br-int vethbr1
> ovs-vsctl add-port br-int vethbr2
>
> ip netns exec ns01 ping 10.15.1.3 -c 3
> ip netns exec ns02 ping 10.15.1.2 -c 3
>
> ip netns exec ns02 iperf3 -s -i 10 -D
> ip netns exec ns01 iperf3 -t 60 -i 10 -c 10.15.1.3 --get-server-output
>
> -----Original Message-----
> From: Flavio Leitner [mailto:f...@sysclose.org]
> Sent: February 5, 2020 19:46
> To: Yi Yang (杨燚)-云服务集团 <yangy...@inspur.com>
> Cc: d...@openvswitch.org; i.maxim...@ovn.org; txfh2...@aliyun.com
> Subject: Re: Re: [ovs-dev] Re: [PATCH v2] netdev-linux: Prepend the std
> packet in the TSO packet
>
> On Wed, Feb 05, 2020 at 12:05:23AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> > Thanks Flavio, which kernel version supports TSO for the tap device?
>
> That ioctl was introduced in 2.6.27.
>
> fbl
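For reference, the ioctl in question is TUNSETOFFLOAD (named in the tap
patch below). A minimal sketch of advertising TSO support on an
already-open tap file descriptor might look like this; it is only an
illustration, not the exact OVS code:

    /* Illustration only: tap_fd is assumed to have been opened via
     * /dev/net/tun beforehand. */
    #include <linux/if_tun.h>
    #include <sys/ioctl.h>

    static int
    tap_enable_tso(int tap_fd)
    {
        /* Offloads that userspace is willing to handle on reads from
         * the device: checksum offload plus TCPv4/TCPv6 segmentation. */
        unsigned int offload = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6;

        /* Fails (e.g. with EINVAL) on kernels that predate the ioctl
         * or do not support the requested offload set. */
        return ioctl(tap_fd, TUNSETOFFLOAD, offload);
    }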
>
> > -----Original Message-----
> > From: Flavio Leitner [mailto:f...@sysclose.org]
> > Sent: February 5, 2020 1:12
> > To: Yi Yang (杨燚)-云服务集团 <yangy...@inspur.com>
> > Cc: d...@openvswitch.org; i.maxim...@ovn.org; txfh2...@aliyun.com
> > Subject: Re: [ovs-dev] Re: [PATCH v2] netdev-linux: Prepend the std
> > packet in the TSO packet
> >
> > On Tue, Feb 04, 2020 at 12:00:19PM -0300, Flavio Leitner wrote:
> > > On Tue, Feb 04, 2020 at 12:51:24AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> > > > Hi, Flavio
> > > >
> > > > With this one patch and the several previously merged TSO-related
> > > > patches, can veth work with "ethtool -K vethX tx on"? I never could
> > > > figure out why veth can work in the DPDK datapath when tx offload
> > > > features are on; it looks like you're fixing this big issue, right?
> > >
> > > If you have TSO enabled with DPDK, then veth works with vethX tx on
> > > (which is the default setting for veth). Otherwise TSO is not
> > > enabled and then you need to turn tx offloading off.
> > >
> > > You said "can work in dpdk data path when tx ... are on", so I think
> > > it's okay? Not sure though.
> > >
> > > > The tap interface can't support TSO; do you Red Hat folks have a
> > > > plan to enable it on the kernel side?
> > >
> > > Yeah, currently it only works in one direction (OvS -> kernel). I am
> > > looking into this now.
> >
> > With TSO enabled on the TAP device:
> >
> >   Traffic Direction    TSO disabled     TSO enabled
> >   VM->tap              2.98 Gbits/sec   22.9 Gbits/sec
> >   tap->VM              2.29 Gbits/sec   18.0 Gbits/sec
> >
> > The code is in my github branch:
> > https://github.com/fleitner/ovs/tree/tso-tap-enable-tx-v1
> >
> > commit 884371df3bf3df836d4c2ab2d62b420339691fe8
> > Author: Flavio Leitner <f...@sysclose.org>
> > Date:   Tue Feb 4 11:18:49 2020 -0300
> >
> >     netdev-linux: Enable TSO in the TAP device.
> >
> >     Use the TUNSETOFFLOAD ioctl, if the kernel supports it, to enable
> >     TSO offloading in the tap device.
> >
> >     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> >     Signed-off-by: Flavio Leitner <f...@sysclose.org>
> >
> > Please try it out and let me know if it works for you as well.
> > Thanks
> > fbl
> >
> > > See below my iperf3 results as a reference:
> > >
> > >   Traffic Direction    TSO disabled     TSO enabled
> > >   VM->tap              2.98 Gbits/sec   22.7 Gbits/sec
> > >   VM->veth             2.96 Gbits/sec   22.6 Gbits/sec
> > >   veth->VM             2.30 Gbits/sec   9.58 Gbits/sec
> > >   tap->VM              2.29 Gbits/sec   2.19 Gbits/sec
> > >
> > > fbl
> > >
> > > > -----Original Message-----
> > > > From: Flavio Leitner [mailto:f...@sysclose.org]
> > > > Sent: February 4, 2020 5:46
> > > > To: d...@openvswitch.org
> > > > Cc: Stokes Ian <ian.sto...@intel.com>; Loftus Ciara
> > > > <ciara.lof...@intel.com>; Ilya Maximets <i.maxim...@ovn.org>;
> > > > Yi Yang (杨燚)-云服务集团 <yangy...@inspur.com>; txfh2007
> > > > <txfh2...@aliyun.com>; Ben Pfaff <b...@ovn.org>; Flavio Leitner
> > > > <f...@sysclose.org>
> > > > Subject: [PATCH v2] netdev-linux: Prepend the std packet in the
> > > > TSO packet
> > > >
> > > > Usually TSO packets are close to 50k or 60k bytes long, so, to copy
> > > > fewer bytes when receiving a packet from the kernel, change the
> > > > approach. Instead of extending the MTU-sized packet received and
> > > > appending the remaining TSO data from the TSO buffer, allocate a
> > > > TSO packet with enough headroom to prepend the std packet data.
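To make the idea above concrete, here is a toy sketch of the prepend
trick in plain C. It is not the dp_packet API and all names are made up
for illustration: the large buffer is allocated with the std packet's
size as spare headroom, so linearizing costs one small memcpy of the std
part instead of reallocating and copying the ~64kB TSO payload.

    #include <stdlib.h>
    #include <string.h>

    struct buf {
        char *base;    /* start of the allocation */
        char *data;    /* start of the packet data */
        size_t len;    /* bytes of packet data */
    };

    /* Allocate 'size' bytes of room preceded by 'headroom' spare bytes. */
    static struct buf *
    buf_new_with_headroom(size_t size, size_t headroom)
    {
        struct buf *b = malloc(sizeof *b);
        b->base = malloc(headroom + size);
        b->data = b->base + headroom;   /* data begins after the headroom */
        b->len = 0;
        return b;
    }

    /* Prepend 'n' bytes in front of the existing data, using headroom. */
    static void
    buf_push(struct buf *b, const void *src, size_t n)
    {
        b->data -= n;
        memcpy(b->data, src, n);
        b->len += n;
    }

    int main(void)
    {
        char std_part[16] = "hdr+first-bytes";    /* small MTU-sized piece */
        struct buf *tso = buf_new_with_headroom(64, sizeof std_part);

        /* ...the TSO tail would be received directly into tso->data... */
        tso->len = 64;
        /* One small copy links the two pieces into a linear packet. */
        buf_push(tso, std_part, sizeof std_part);

        free(tso->base);
        free(tso);
        return 0;
    }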
> > > >
> > > > Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> > > > Suggested-by: Ben Pfaff <b...@ovn.org>
> > > > Signed-off-by: Flavio Leitner <f...@sysclose.org>
> > > > ---
> > > >  lib/dp-packet.c            |   8 +--
> > > >  lib/dp-packet.h            |   2 +
> > > >  lib/netdev-linux-private.h |   3 +-
> > > >  lib/netdev-linux.c         | 117 ++++++++++++++++++++++---------------
> > > >  4 files changed, 78 insertions(+), 52 deletions(-)
> > > >
> > > > V2:
> > > >   - tso packets tailroom depends on headroom in netdev_linux_rxq_recv()
> > > >   - iov_len uses packet's tailroom.
> > > >
> > > > This patch depends on a previously posted patch to work:
> > > >   Subject: netdev-linux-private: fix max length to be 16 bits
> > > >   https://mail.openvswitch.org/pipermail/ovs-dev/2020-February/367469.html
> > > >
> > > > With both patches applied, I can run iperf3 and scp in both directions
> > > > with good performance and no issues.
> > > >
> > > > diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> > > > index 8dfedcb7c..cd2623500 100644
> > > > --- a/lib/dp-packet.c
> > > > +++ b/lib/dp-packet.c
> > > > @@ -243,8 +243,8 @@ dp_packet_copy__(struct dp_packet *b, uint8_t *new_base,
> > > >
> > > >  /* Reallocates 'b' so that it has exactly 'new_headroom' and 'new_tailroom'
> > > >   * bytes of headroom and tailroom, respectively. */
> > > > -static void
> > > > -dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
> > > > +void
> > > > +dp_packet_resize(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
> > > >  {
> > > >      void *new_base, *new_data;
> > > >      size_t new_allocated;
> > > > @@ -297,7 +297,7 @@ void
> > > >  dp_packet_prealloc_tailroom(struct dp_packet *b, size_t size)
> > > >  {
> > > >      if (size > dp_packet_tailroom(b)) {
> > > > -        dp_packet_resize__(b, dp_packet_headroom(b), MAX(size, 64));
> > > > +        dp_packet_resize(b, dp_packet_headroom(b), MAX(size, 64));
> > > >      }
> > > >  }
> > > >
> > > > @@ -308,7 +308,7 @@ void
> > > >  dp_packet_prealloc_headroom(struct dp_packet *b, size_t size)
> > > >  {
> > > >      if (size > dp_packet_headroom(b)) {
> > > > -        dp_packet_resize__(b, MAX(size, 64), dp_packet_tailroom(b));
> > > > +        dp_packet_resize(b, MAX(size, 64), dp_packet_tailroom(b));
> > > >      }
> > > >  }
> > > >
> > > > diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> > > > index 69ae5dfac..9a9d35183 100644
> > > > --- a/lib/dp-packet.h
> > > > +++ b/lib/dp-packet.h
> > > > @@ -152,6 +152,8 @@ struct dp_packet *dp_packet_clone_with_headroom(const struct dp_packet *,
> > > >  struct dp_packet *dp_packet_clone_data(const void *, size_t);
> > > >  struct dp_packet *dp_packet_clone_data_with_headroom(const void *, size_t,
> > > >                                                       size_t headroom);
> > > > +void dp_packet_resize(struct dp_packet *b, size_t new_headroom,
> > > > +                      size_t new_tailroom);
> > > >  static inline void dp_packet_delete(struct dp_packet *);
> > > >
> > > > diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h
> > > > index be2d7b10b..c7c515f70 100644
> > > > --- a/lib/netdev-linux-private.h
> > > > +++ b/lib/netdev-linux-private.h
> > > > @@ -45,7 +45,8 @@ struct netdev_rxq_linux {
> > > >      struct netdev_rxq up;
> > > >      bool is_tap;
> > > >      int fd;
> > > > -    char *aux_bufs[NETDEV_MAX_BURST]; /* Batch of preallocated TSO buffers. */
> > > > +    struct dp_packet *aux_bufs[NETDEV_MAX_BURST]; /* Preallocated TSO
> > > > +                                                     packets. */
> > > >  };
> > > >
> > > >  int netdev_linux_construct(struct netdev *);
> > > > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > > > index 6add3e2fc..c6f3d2740 100644
> > > > --- a/lib/netdev-linux.c
> > > > +++ b/lib/netdev-linux.c
> > > > @@ -1052,15 +1052,6 @@ static struct netdev_rxq *
> > > >  netdev_linux_rxq_alloc(void)
> > > >  {
> > > >      struct netdev_rxq_linux *rx = xzalloc(sizeof *rx);
> > > > -    if (userspace_tso_enabled()) {
> > > > -        int i;
> > > > -
> > > > -        /* Allocate auxiliay buffers to receive TSO packets. */
> > > > -        for (i = 0; i < NETDEV_MAX_BURST; i++) {
> > > > -            rx->aux_bufs[i] = xmalloc(LINUX_RXQ_TSO_MAX_LEN);
> > > > -        }
> > > > -    }
> > > > -
> > > >      return &rx->up;
> > > >  }
> > > >
> > > > @@ -1172,7 +1163,7 @@ netdev_linux_rxq_destruct(struct netdev_rxq *rxq_)
> > > >      }
> > > >
> > > >      for (i = 0; i < NETDEV_MAX_BURST; i++) {
> > > > -        free(rx->aux_bufs[i]);
> > > > +        dp_packet_delete(rx->aux_bufs[i]);
> > > >      }
> > > >  }
> > > >
> > > > @@ -1238,13 +1229,18 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
> > > >          virtio_net_hdr_size = 0;
> > > >      }
> > > >
> > > > -    std_len = VLAN_ETH_HEADER_LEN + mtu + virtio_net_hdr_size;
> > > > +    /* The length here needs to be accounted in the same way when the
> > > > +     * aux_buf is allocated so that it can be prepended to TSO buffer. */
> > > > +    std_len = virtio_net_hdr_size + VLAN_ETH_HEADER_LEN + mtu;
> > > >      for (i = 0; i < NETDEV_MAX_BURST; i++) {
> > > >          buffers[i] = dp_packet_new_with_headroom(std_len, DP_NETDEV_HEADROOM);
> > > >          iovs[i][IOV_PACKET].iov_base = dp_packet_data(buffers[i]);
> > > >          iovs[i][IOV_PACKET].iov_len = std_len;
> > > > -        iovs[i][IOV_AUXBUF].iov_base = rx->aux_bufs[i];
> > > > -        iovs[i][IOV_AUXBUF].iov_len = LINUX_RXQ_TSO_MAX_LEN;
> > > > +        if (iovlen == IOV_TSO_SIZE) {
> > > > +            iovs[i][IOV_AUXBUF].iov_base = dp_packet_data(rx->aux_bufs[i]);
> > > > +            iovs[i][IOV_AUXBUF].iov_len = dp_packet_tailroom(rx->aux_bufs[i]);
> > > > +        }
> > > > +
> > > >          mmsgs[i].msg_hdr.msg_name = NULL;
> > > >          mmsgs[i].msg_hdr.msg_namelen = 0;
> > > >          mmsgs[i].msg_hdr.msg_iov = iovs[i];
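The hunk above sets up the scatter/gather receive that the rest of the
patch builds on: the kernel fills the small std iovec first and spills
any TSO tail into the aux iovec. A standalone illustration of the same
pattern with readv(2) and tiny made-up sizes (the sock path does the
equivalent with recvmmsg()); this is a demo, not OVS code:

    #include <stdio.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main(void)
    {
        char std_buf[8];    /* stands in for the MTU-sized buffer */
        char aux_buf[32];   /* stands in for the large TSO buffer */
        int fds[2];

        if (pipe(fds)) {
            return 1;
        }
        /* Pretend a 16-byte "frame" arrives; only 8 fit in std_buf. */
        if (write(fds[1], "0123456789abcdef", 16) != 16) {
            return 1;
        }

        struct iovec iov[2] = {
            { .iov_base = std_buf, .iov_len = sizeof std_buf },
            { .iov_base = aux_buf, .iov_len = sizeof aux_buf },
        };
        ssize_t n = readv(fds[0], iov, 2);

        /* n > std_len means the tail landed in aux_buf, i.e. the frame
         * is oversized and must be linearized, as in the patch. */
        if (n > (ssize_t) sizeof std_buf) {
            printf("oversized frame: %zd bytes total, %zd in aux\n",
                   n, n - (ssize_t) sizeof std_buf);
        }
        return 0;
    }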
> > > > @@ -1268,6 +1264,8 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
> > > >      }
> > > >
> > > >      for (i = 0; i < retval; i++) {
> > > > +        struct dp_packet *pkt;
> > > > +
> > > >          if (mmsgs[i].msg_len < ETH_HEADER_LEN) {
> > > >              struct netdev *netdev_ = netdev_rxq_get_netdev(&rx->up);
> > > >              struct netdev_linux *netdev = netdev_linux_cast(netdev_);
> > > > @@ -1280,29 +1278,29 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
> > > >          }
> > > >
> > > >          if (mmsgs[i].msg_len > std_len) {
> > > > -            /* Build a single linear TSO packet by expanding the current packet
> > > > -             * to append the data received in the aux_buf. */
> > > > -            size_t extra_len = mmsgs[i].msg_len - std_len;
> > > > -
> > > > -            dp_packet_set_size(buffers[i], dp_packet_size(buffers[i])
> > > > -                               + std_len);
> > > > -            dp_packet_prealloc_tailroom(buffers[i], extra_len);
> > > > -            memcpy(dp_packet_tail(buffers[i]), rx->aux_bufs[i], extra_len);
> > > > -            dp_packet_set_size(buffers[i], dp_packet_size(buffers[i])
> > > > -                               + extra_len);
> > > > -        } else {
> > > > -            dp_packet_set_size(buffers[i], dp_packet_size(buffers[i])
> > > > -                               + mmsgs[i].msg_len);
> > > > -        }
> > > > +            /* Build a single linear TSO packet by prepending the data from
> > > > +             * std_len buffer to the aux_buf. */
> > > > +            pkt = rx->aux_bufs[i];
> > > > +            dp_packet_set_size(pkt, mmsgs[i].msg_len - std_len);
> > > > +            dp_packet_push(pkt, dp_packet_data(buffers[i]), std_len);
> > > > +            /* The headroom should be the same in buffers[i], pkt and
> > > > +             * DP_NETDEV_HEADROOM. */
> > > > +            dp_packet_resize(pkt, DP_NETDEV_HEADROOM, 0);
> > > > +            dp_packet_delete(buffers[i]);
> > > > +            rx->aux_bufs[i] = NULL;
> > > > +        } else {
> > > > +            dp_packet_set_size(buffers[i], mmsgs[i].msg_len);
> > > > +            pkt = buffers[i];
> > > > +        }
> > > >
> > > > -        if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(buffers[i])) {
> > > > +        if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(pkt)) {
> > > >              struct netdev *netdev_ = netdev_rxq_get_netdev(&rx->up);
> > > >              struct netdev_linux *netdev = netdev_linux_cast(netdev_);
> > > >
> > > >              /* Unexpected error situation: the virtio header is not present
> > > >               * or corrupted. Drop the packet but continue in case next ones
> > > >               * are correct. */
> > > > -            dp_packet_delete(buffers[i]);
> > > > +            dp_packet_delete(pkt);
> > > >              netdev->rx_dropped += 1;
> > > >              VLOG_WARN_RL(&rl, "%s: Dropped packet: Invalid virtio net header",
> > > >                           netdev_get_name(netdev_));
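For context on what that check guards against: the vnet header is the
struct virtio_net_hdr (from <linux/virtio_net.h>) that precedes every
frame on a vnet-hdr-enabled tap or packet socket, describing its
checksum and GSO state. A rough sketch of the kind of sanity check
involved; this is not the actual netdev_linux_parse_vnet_hdr() logic:

    #include <linux/virtio_net.h>
    #include <stdbool.h>

    static bool
    vnet_hdr_is_usable(const struct virtio_net_hdr *vnet)
    {
        /* An unknown GSO type means the header cannot be trusted and
         * the packet should be dropped, as in the hunk above. */
        switch (vnet->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
        case VIRTIO_NET_HDR_GSO_NONE:
        case VIRTIO_NET_HDR_GSO_TCPV4:
        case VIRTIO_NET_HDR_GSO_TCPV6:
        case VIRTIO_NET_HDR_GSO_UDP:
            return true;
        default:
            return false;
        }
    }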
> > > > @@ -1325,16 +1323,16 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
> > > >              struct eth_header *eth;
> > > >              bool double_tagged;
> > > >
> > > > -            eth = dp_packet_data(buffers[i]);
> > > > +            eth = dp_packet_data(pkt);
> > > >              double_tagged = eth->eth_type == htons(ETH_TYPE_VLAN_8021Q);
> > > >
> > > > -            eth_push_vlan(buffers[i],
> > > > +            eth_push_vlan(pkt,
> > > >                            auxdata_to_vlan_tpid(aux, double_tagged),
> > > >                            htons(aux->tp_vlan_tci));
> > > >                  break;
> > > >             }
> > > >         }
> > > > -        dp_packet_batch_add(batch, buffers[i]);
> > > > +        dp_packet_batch_add(batch, pkt);
> > > >     }
> > > >
> > > >     /* Delete unused buffers. */
> > > > @@ -1354,7 +1352,6 @@ static int
> > > >  netdev_linux_batch_rxq_recv_tap(struct netdev_rxq_linux *rx, int mtu,
> > > >                                  struct dp_packet_batch *batch)
> > > >  {
> > > > -    struct dp_packet *buffer;
> > > >      int virtio_net_hdr_size;
> > > >      ssize_t retval;
> > > >      size_t std_len;
> > > > @@ -1372,16 +1369,22 @@ netdev_linux_batch_rxq_recv_tap(struct netdev_rxq_linux *rx, int mtu,
> > > >          virtio_net_hdr_size = 0;
> > > >      }
> > > >
> > > > -    std_len = VLAN_ETH_HEADER_LEN + mtu + virtio_net_hdr_size;
> > > > +    /* The length here needs to be accounted in the same way when the
> > > > +     * aux_buf is allocated so that it can be prepended to TSO buffer. */
> > > > +    std_len = virtio_net_hdr_size + VLAN_ETH_HEADER_LEN + mtu;
> > > >      for (i = 0; i < NETDEV_MAX_BURST; i++) {
> > > > +        struct dp_packet *buffer;
> > > > +        struct dp_packet *pkt;
> > > >          struct iovec iov[IOV_TSO_SIZE];
> > > >
> > > >          /* Assume Ethernet port. No need to set packet_type. */
> > > >          buffer = dp_packet_new_with_headroom(std_len, DP_NETDEV_HEADROOM);
> > > >          iov[IOV_PACKET].iov_base = dp_packet_data(buffer);
> > > >          iov[IOV_PACKET].iov_len = std_len;
> > > > -        iov[IOV_AUXBUF].iov_base = rx->aux_bufs[i];
> > > > -        iov[IOV_AUXBUF].iov_len = LINUX_RXQ_TSO_MAX_LEN;
> > > > +        if (iovlen == IOV_TSO_SIZE) {
> > > > +            iov[IOV_AUXBUF].iov_base = dp_packet_data(rx->aux_bufs[i]);
> > > > +            iov[IOV_AUXBUF].iov_len = dp_packet_tailroom(rx->aux_bufs[i]);
> > > > +        }
> > > >
> > > >          do {
> > > >              retval = readv(rx->fd, iov, iovlen);
> > > > @@ -1393,33 +1396,36 @@ netdev_linux_batch_rxq_recv_tap(struct netdev_rxq_linux *rx, int mtu,
> > > >          }
> > > >
> > > >          if (retval > std_len) {
> > > > -            /* Build a single linear TSO packet by expanding the current packet
> > > > -             * to append the data received in the aux_buf. */
> > > > -            size_t extra_len = retval - std_len;
> > > > -
> > > > -            dp_packet_set_size(buffer, dp_packet_size(buffer) + std_len);
> > > > -            dp_packet_prealloc_tailroom(buffer, extra_len);
> > > > -            memcpy(dp_packet_tail(buffer), rx->aux_bufs[i], extra_len);
> > > > -            dp_packet_set_size(buffer, dp_packet_size(buffer) + extra_len);
> > > > +            /* Build a single linear TSO packet by prepending the data from
> > > > +             * std_len buffer to the aux_buf. */
> > > > +            pkt = rx->aux_bufs[i];
> > > > +            dp_packet_set_size(pkt, retval - std_len);
> > > > +            dp_packet_push(pkt, dp_packet_data(buffer), std_len);
> > > > +            /* The headroom should be the same in buffers[i], pkt and
> > > > +             * DP_NETDEV_HEADROOM. */
> > > > +            dp_packet_resize(pkt, DP_NETDEV_HEADROOM, 0);
> > > > +            dp_packet_delete(buffer);
> > > > +            rx->aux_bufs[i] = NULL;
> > > >          } else {
> > > >              dp_packet_set_size(buffer, dp_packet_size(buffer) + retval);
> > > > +            pkt = buffer;
> > > >          }
> > > >
> > > > -        if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(buffer)) {
> > > > +        if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(pkt)) {
> > > >              struct netdev *netdev_ = netdev_rxq_get_netdev(&rx->up);
> > > >              struct netdev_linux *netdev = netdev_linux_cast(netdev_);
> > > >
> > > >              /* Unexpected error situation: the virtio header is not present
> > > >               * or corrupted. Drop the packet but continue in case next ones
> > > >               * are correct. */
> > > > -            dp_packet_delete(buffer);
> > > > +            dp_packet_delete(pkt);
> > > >              netdev->rx_dropped += 1;
> > > >              VLOG_WARN_RL(&rl, "%s: Dropped packet: Invalid virtio net header",
> > > >                           netdev_get_name(netdev_));
> > > >              continue;
> > > >          }
> > > >
> > > > -        dp_packet_batch_add(batch, buffer);
> > > > +        dp_packet_batch_add(batch, pkt);
> > > >      }
> > > >
> > > >      if ((i == 0) && (retval < 0)) {
> > > > @@ -1442,6 +1448,23 @@ netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
> > > >          mtu = ETH_PAYLOAD_MAX;
> > > >      }
> > > >
> > > > +    if (userspace_tso_enabled()) {
> > > > +        /* Allocate TSO packets. The packet has enough headroom to store
> > > > +         * a full non-TSO packet. When a TSO packet is received, the data
> > > > +         * from non-TSO buffer (std_len) is prepended to the TSO packet
> > > > +         * (aux_buf). */
> > > > +        size_t std_len = sizeof(struct virtio_net_hdr) + VLAN_ETH_HEADER_LEN
> > > > +                         + DP_NETDEV_HEADROOM + mtu;
> > > > +        size_t data_len = LINUX_RXQ_TSO_MAX_LEN - std_len;
> > > > +        for (int i = 0; i < NETDEV_MAX_BURST; i++) {
> > > > +            if (rx->aux_bufs[i]) {
> > > > +                continue;
> > > > +            }
> > > > +
> > > > +            rx->aux_bufs[i] = dp_packet_new_with_headroom(data_len,
> > > > +                                                          std_len);
> > > > +        }
> > > > +    }
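A quick sanity check of those sizes, assuming the usual values of the
constants involved (sizeof(struct virtio_net_hdr) == 10,
VLAN_ETH_HEADER_LEN == 18, DP_NETDEV_HEADROOM == 128 and
LINUX_RXQ_TSO_MAX_LEN == 65536) and an MTU of 1500:

    std_len  = 10 + 18 + 128 + 1500 = 1656
    data_len = 65536 - 1656         = 63880

so each preallocated aux_buf can hold the tail of a full 64kB TSO frame,
and its headroom is exactly the std packet (10 + 18 + 1500 = 1528) plus
the DP_NETDEV_HEADROOM of 128 that dp_packet_resize() restores after the
prepend.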
> > > > +
> > > >      dp_packet_batch_init(batch);
> > > >      retval = (rx->is_tap
> > > >                ? netdev_linux_batch_rxq_recv_tap(rx, mtu, batch)
> > > > --
> > > > 2.24.1
> > >
> > > --
> > > fbl
> >
> > --
> > fbl
>
> --
> fbl

--
fbl

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev