Re: [ovs-dev] [PATCH] Use batch process recv for tap and raw socket in netdev datapath

2019-12-20 Thread William Tu
On Wed, Dec 18, 2019 at 10:44:21AM +0800, yang_y_yi wrote:
> Hi, William
> 
> 
> I used OVS DPDK to test it. You shouldn't add a tap interface to an OVS DPDK
> bridge if you use vdev to add the tap; virtio_user is meant for that, but it
> won't use this receive function to receive packets.

Right.
I mean that if you already use OVS-DPDK, you can create the tap device using something like

ovs-vsctl -- set interface dpdk-p0 type=dpdk \
 options:dpdk-devargs=vdev:net_af_packet0,iface=dpdk-p0

Then you can get better veth performance, around 2.3 Gbps, without your patch.
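
For a tap device, DPDK's net_tap PMD should work the same way. This is only a
sketch; the bridge name, port name, and iface argument below are assumptions,
not something taken from this thread:

ovs-vsctl add-port br0 tap-p0 -- set interface tap-p0 type=dpdk \
 options:dpdk-devargs=vdev:net_tap0,iface=tap0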

William

> 
> At 2019-12-17 02:55:50, "William Tu"  wrote:
> >On Fri, Dec 06, 2019 at 02:09:24AM -0500, yang_y...@163.com wrote:
> >> From: Yi Yang 
> >> 
> >> Currently netdev_linux_rxq_recv_tap and netdev_linux_rxq_recv_sock
> >> receive only a single packet per call, which is very inefficient. My test
> >> case adds two tap ports or veth ports to an OVS bridge
> >> (datapath_type=netdev) and uses iperf3 to run a performance test
> >> between the two ports (which are placed in different network namespaces).
> >> 
> >> The result is as below:
> >> 
> >>   tap:  295 Mbits/sec
> >>   veth: 207 Mbits/sec
> >> 
> >> After I changed netdev_linux_rxq_recv_tap and
> >> netdev_linux_rxq_recv_sock to use batch processing, the performance
> >> was boosted by about 7 times; here is the result:
> >> 
> >>   tap:  1.96 Gbits/sec
> >>   veth: 1.47 Gbits/sec
> >> 
> >> Undoubtedly this is a huge improvement, although it can't match
> >> the OVS kernel datapath yet.
> >>
> >> FYI: here is the result for the OVS kernel datapath:
> >> 
> >>   tap:  37.2 Gbits/sec
> >>   veth: 36.3 Gbits/sec
> >> 
> >> Note: performance results depend heavily on the test machine;
> >> you shouldn't expect the same numbers on yours.
> >> 
> >> Signed-off-by: Yi Yang 
> >
> >Hi Yi Yang,
> >
> >Are you testing this using OVS-DPDK?
> >If you're using OVS-DPDK, then you should use DPDK's vdev to
> >open and attach a tap/veth device to OVS. I think you'll see much
> >better performance.
> >
> >The performance issue you pointed out only happens when using
> >the userspace datapath without the DPDK library, where afxdp is used.
> >I'm still looking for better solutions for faster interfaces
> >for veth (af_packet) and tap.
> >
> >Thanks
> >William


Re: [ovs-dev] [PATCH] Use batch process recv for tap and raw socket in netdev datapath

2019-12-17 Thread yang_y_yi
Hi, William


I used OVS DPDK to test it. You shouldn't add a tap interface to an OVS DPDK
bridge if you use vdev to add the tap; virtio_user is meant for that, but it
won't use this receive function to receive packets.

At 2019-12-17 02:55:50, "William Tu"  wrote:
>On Fri, Dec 06, 2019 at 02:09:24AM -0500, yang_y...@163.com wrote:
>> From: Yi Yang 
>> 
>> Currently netdev_linux_rxq_recv_tap and netdev_linux_rxq_recv_sock
>> receive only a single packet per call, which is very inefficient. My test
>> case adds two tap ports or veth ports to an OVS bridge
>> (datapath_type=netdev) and uses iperf3 to run a performance test
>> between the two ports (which are placed in different network namespaces).
>> 
>> The result is as below:
>> 
>>   tap:  295 Mbits/sec
>>   veth: 207 Mbits/sec
>> 
>> After I changed netdev_linux_rxq_recv_tap and
>> netdev_linux_rxq_recv_sock to use batch processing, the performance
>> was boosted by about 7 times; here is the result:
>> 
>>   tap:  1.96 Gbits/sec
>>   veth: 1.47 Gbits/sec
>> 
>> Undoubtedly this is a huge improvement, although it can't match
>> the OVS kernel datapath yet.
>>
>> FYI: here is the result for the OVS kernel datapath:
>> 
>>   tap:  37.2 Gbits/sec
>>   veth: 36.3 Gbits/sec
>> 
>> Note: performance results depend heavily on the test machine;
>> you shouldn't expect the same numbers on yours.
>> 
>> Signed-off-by: Yi Yang 
>
>Hi Yi Yang,
>
>Are you testing this using OVS-DPDK?
>If you're using OVS-DPDK, then you should use DPDK's vdev to
>open and attach a tap/veth device to OVS. I think you'll see much
>better performance.
>
>The performance issue you pointed out only happens when using
>the userspace datapath without the DPDK library, where afxdp is used.
>I'm still looking for better solutions for faster interfaces
>for veth (af_packet) and tap.
>
>Thanks
>William


Re: [ovs-dev] [PATCH] Use batch process recv for tap and raw socket in netdev datapath

2019-12-17 Thread Ben Pfaff
On Fri, Dec 06, 2019 at 02:09:24AM -0500, yang_y...@163.com wrote:
> From: Yi Yang 
> 
> Currently netdev_linux_rxq_recv_tap and netdev_linux_rxq_recv_sock
> receive only a single packet per call, which is very inefficient. My test
> case adds two tap ports or veth ports to an OVS bridge
> (datapath_type=netdev) and uses iperf3 to run a performance test
> between the two ports (which are placed in different network namespaces).

Thanks for the patch!  This is an impressive performance improvement!

Each call to netdev_linux_batch_rxq_recv_sock() now calls malloc() 32
times.  This is expensive if only a few packets (or none) are received.
Maybe it doesn't matter, but I wonder whether it affects performance.
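
Just to illustrate the concern, here is a standalone sketch with plain sockets
(not OVS code and not a concrete proposal): reuse a static buffer pool across
recvmmsg() calls so that an idle queue costs no allocations, and allocate only
for the packets that actually arrived.

#define _GNU_SOURCE                  /* for recvmmsg() on glibc */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

#define BURST 32
#define BUF_SIZE 2048

static char pool[BURST][BUF_SIZE];   /* reused on every call */

/* Returns the number of packets received (0 if none), or a negative errno.
 * pkts[i] is a malloc'd copy of packet i and lens[i] its length. */
static int
batch_recv(int fd, char *pkts[BURST], size_t lens[BURST])
{
    struct mmsghdr msgs[BURST];
    struct iovec iovs[BURST];
    int i, n;

    memset(msgs, 0, sizeof msgs);
    for (i = 0; i < BURST; i++) {
        iovs[i].iov_base = pool[i];
        iovs[i].iov_len = BUF_SIZE;
        msgs[i].msg_hdr.msg_iov = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    do {
        n = recvmmsg(fd, msgs, BURST, MSG_DONTWAIT, NULL);
    } while (n < 0 && errno == EINTR);
    if (n < 0) {
        return errno == EAGAIN ? 0 : -errno;
    }

    for (i = 0; i < n; i++) {            /* allocate only what arrived */
        pkts[i] = malloc(msgs[i].msg_len);
        if (!pkts[i]) {
            n = i;                       /* out of memory: keep what we have */
            break;
        }
        memcpy(pkts[i], pool[i], msgs[i].msg_len);
        lens[i] = msgs[i].msg_len;
    }
    return n;
}

The copy obviously has its own cost, so this only shows the allocation pattern;
it is not a claim that it would be faster overall.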

I think that no packets are freed on error.  Fix:

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 9cb45d5c7d29..3414a6495ced 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1198,6 +1198,7 @@ netdev_linux_batch_rxq_recv_sock(int fd, int mtu,
 if (retval < 0) {
 /* Save -errno to retval temporarily */
 retval = -errno;
+i = 0;
 goto free_buffers;
 }
 

To get sparse to work, one must fold in the following:

diff --git a/include/sparse/sys/socket.h b/include/sparse/sys/socket.h
index 4178f57e2bda..e954ade714b5 100644
--- a/include/sparse/sys/socket.h
+++ b/include/sparse/sys/socket.h
@@ -27,6 +27,7 @@
 
 typedef unsigned short int sa_family_t;
 typedef __socklen_t socklen_t;
+struct timespec;
 
 struct sockaddr {
 sa_family_t sa_family;
@@ -171,4 +172,7 @@ int sockatmark(int);
 int socket(int, int, int);
 int socketpair(int, int, int, int[2]);
 
+int sendmmsg(int, struct mmsghdr *, unsigned int, int);
+int recvmmsg(int, struct mmsghdr *, unsigned int, int, struct timespec *);
+
 #endif /*  for sparse */


Re: [ovs-dev] [PATCH] Use batch process recv for tap and raw socket in netdev datapath

2019-12-16 Thread William Tu
On Fri, Dec 06, 2019 at 02:09:24AM -0500, yang_y...@163.com wrote:
> From: Yi Yang 
> 
> Currently netdev_linux_rxq_recv_tap and netdev_linux_rxq_recv_sock
> receive only a single packet per call, which is very inefficient. My test
> case adds two tap ports or veth ports to an OVS bridge
> (datapath_type=netdev) and uses iperf3 to run a performance test
> between the two ports (which are placed in different network namespaces).
> 
> The result is as below:
> 
>   tap:  295 Mbits/sec
>   veth: 207 Mbits/sec
> 
> After I changed netdev_linux_rxq_recv_tap and
> netdev_linux_rxq_recv_sock to use batch processing, the performance
> was boosted by about 7 times; here is the result:
> 
>   tap:  1.96 Gbits/sec
>   veth: 1.47 Gbits/sec
> 
> Undoubtedly this is a huge improvement, although it can't match
> the OVS kernel datapath yet.
>
> FYI: here is the result for the OVS kernel datapath:
> 
>   tap:  37.2 Gbits/sec
>   veth: 36.3 Gbits/sec
> 
> Note: performance results depend heavily on the test machine;
> you shouldn't expect the same numbers on yours.
> 
> Signed-off-by: Yi Yang 

Hi Yi Yang,

Are you testing this using OVS-DPDK?
If you're using OVS-DPDK, then you should use DPDK's vdev to
open and attach a tap/veth device to OVS. I think you'll see much
better performance.

The performance issue you pointed out only happens when using
the userspace datapath without the DPDK library, where afxdp is used.
I'm still looking for better solutions for faster interfaces
for veth (af_packet) and tap.
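
For context, this is the kind of setup I mean (a sketch; the bridge and port
names are assumptions): with the userspace datapath and no DPDK, a veth peer
can be attached as an AF_XDP port, while a veth added as an ordinary system
port is received through the raw-socket path that this patch batches (and the
bridge's internal port is a tap handled by netdev_linux_rxq_recv_tap).

ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 veth0 -- set interface veth0 type=afxdp
ovs-vsctl add-port br0 veth1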

Thanks
William



Re: [ovs-dev] [PATCH] Use batch process recv for tap and raw socket in netdev datapath

2019-12-06 Thread William Tu
On Thu, Dec 5, 2019 at 11:09 PM  wrote:
>
> From: Yi Yang 
>
> Currently netdev_linux_rxq_recv_tap and netdev_linux_rxq_recv_sock
> receive only a single packet per call, which is very inefficient. My test
> case adds two tap ports or veth ports to an OVS bridge
> (datapath_type=netdev) and uses iperf3 to run a performance test
> between the two ports (which are placed in different network namespaces).
>
> The result is as below:
>
>   tap:  295 Mbits/sec
>   veth: 207 Mbits/sec
>
> After I changed netdev_linux_rxq_recv_tap and
> netdev_linux_rxq_recv_sock to use batch processing, the performance
> was boosted by about 7 times; here is the result:
>
>   tap:  1.96 Gbits/sec
>   veth: 1.47 Gbits/sec
>
> Undoubtedly this is a huge improvement, although it can't match
> the OVS kernel datapath yet.
>
> FYI: here is the result for the OVS kernel datapath:
>
>   tap:  37.2 Gbits/sec
>   veth: 36.3 Gbits/sec
>
> Note: performance results depend heavily on the test machine;
> you shouldn't expect the same numbers on yours.

Hi Yi Yang,

Thanks for the patch; it's amazing to see so much performance improvement.
I haven't reviewed the code yet, but Yifeng and I applied and tested this patch.
Using netdev-afxdp + a tap port, we do see performance improve from
300 Mbps to 2 Gbps in our testbed!

Will add more feedback next week.
William


[ovs-dev] [PATCH] Use batch process recv for tap and raw socket in netdev datapath

2019-12-05 Thread yang_y_yi
From: Yi Yang 

Currently netdev_linux_rxq_recv_tap and netdev_linux_rxq_recv_sock
receive only a single packet per call, which is very inefficient. My test
case adds two tap ports or veth ports to an OVS bridge
(datapath_type=netdev) and uses iperf3 to run a performance test
between the two ports (which are placed in different network namespaces).

The result is as below:

  tap:  295 Mbits/sec
  veth: 207 Mbits/sec

After I changed netdev_linux_rxq_recv_tap and
netdev_linux_rxq_recv_sock to use batch processing, the performance
was boosted by about 7 times; here is the result:

  tap:  1.96 Gbits/sec
  veth: 1.47 Gbits/sec

Undoubtedly this is a huge improvement, although it can't match
the OVS kernel datapath yet.

FYI: here is the result for the OVS kernel datapath:

  tap:  37.2 Gbits/sec
  veth: 36.3 Gbits/sec

Note: performance results depend heavily on the test machine;
you shouldn't expect the same numbers on yours.

Signed-off-by: Yi Yang 
---
 lib/netdev-linux.c | 166 ++---
 1 file changed, 108 insertions(+), 58 deletions(-)
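
The veth/iperf3 test described above can be reproduced with something like the
following (a sketch; namespace, interface, and address names are illustrative):

ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ip netns add ns0
ip netns add ns1
ip link add veth0 type veth peer name ovs-veth0
ip link add veth1 type veth peer name ovs-veth1
ip link set veth0 netns ns0
ip link set veth1 netns ns1
ip link set ovs-veth0 up
ip link set ovs-veth1 up
ovs-vsctl add-port br0 ovs-veth0 -- add-port br0 ovs-veth1
ip netns exec ns0 ip addr add 10.0.0.1/24 dev veth0
ip netns exec ns1 ip addr add 10.0.0.2/24 dev veth1
ip netns exec ns0 ip link set veth0 up
ip netns exec ns1 ip link set veth1 up
ip netns exec ns1 iperf3 -s -D
ip netns exec ns0 iperf3 -c 10.0.0.2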

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index f8e59ba..9cb45d5 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1151,90 +1151,146 @@ auxdata_has_vlan_tci(const struct tpacket_auxdata *aux)
 return aux->tp_vlan_tci || aux->tp_status & TP_STATUS_VLAN_VALID;
 }
 
+/*
+ * Receive packets from a raw socket in a batch for better performance.
+ * It can receive at most NETDEV_MAX_BURST packets at a time; the received
+ * packets are added to *batch.  The return value is 0 or errno.
+ *
+ * It uses recvmmsg() to reduce the overhead of multiple system calls.
+ */
 static int
-netdev_linux_rxq_recv_sock(int fd, struct dp_packet *buffer)
+netdev_linux_batch_rxq_recv_sock(int fd, int mtu,
+ struct dp_packet_batch *batch)
 {
 size_t size;
 ssize_t retval;
-struct iovec iov;
+struct iovec iovs[NETDEV_MAX_BURST];
 struct cmsghdr *cmsg;
 union {
 struct cmsghdr cmsg;
 char buffer[CMSG_SPACE(sizeof(struct tpacket_auxdata))];
-} cmsg_buffer;
-struct msghdr msgh;
-
-/* Reserve headroom for a single VLAN tag */
-dp_packet_reserve(buffer, VLAN_HEADER_LEN);
-size = dp_packet_tailroom(buffer);
-
-iov.iov_base = dp_packet_data(buffer);
-iov.iov_len = size;
-msgh.msg_name = NULL;
-msgh.msg_namelen = 0;
-msgh.msg_iov = &iov;
-msgh.msg_iovlen = 1;
-msgh.msg_control = &cmsg_buffer;
-msgh.msg_controllen = sizeof cmsg_buffer;
-msgh.msg_flags = 0;
+} cmsg_buffers[NETDEV_MAX_BURST];
+struct mmsghdr mmsgs[NETDEV_MAX_BURST];
+struct dp_packet *buffers[NETDEV_MAX_BURST];
+int i;
+
+for (i = 0; i < NETDEV_MAX_BURST; i++) {
+ buffers[i] = dp_packet_new_with_headroom(VLAN_ETH_HEADER_LEN + mtu,
+  DP_NETDEV_HEADROOM);
+ /* Reserve headroom for a single VLAN tag */
+ dp_packet_reserve(buffers[i], VLAN_HEADER_LEN);
+ size = dp_packet_tailroom(buffers[i]);
+ iovs[i].iov_base = dp_packet_data(buffers[i]);
+ iovs[i].iov_len = size;
+ mmsgs[i].msg_hdr.msg_name = NULL;
+ mmsgs[i].msg_hdr.msg_namelen = 0;
+ mmsgs[i].msg_hdr.msg_iov = &iovs[i];
+ mmsgs[i].msg_hdr.msg_iovlen = 1;
+ mmsgs[i].msg_hdr.msg_control = &cmsg_buffers[i];
+ mmsgs[i].msg_hdr.msg_controllen = sizeof cmsg_buffers[i];
+ mmsgs[i].msg_hdr.msg_flags = 0;
+}
 
 do {
-retval = recvmsg(fd, &msgh, MSG_TRUNC);
+retval = recvmmsg(fd, mmsgs, NETDEV_MAX_BURST, MSG_TRUNC, NULL);
 } while (retval < 0 && errno == EINTR);
 
 if (retval < 0) {
-return errno;
-} else if (retval > size) {
-return EMSGSIZE;
+/* Save -errno to retval temporarily */
+retval = -errno;
+goto free_buffers;
 }
 
-dp_packet_set_size(buffer, dp_packet_size(buffer) + retval);
-
-for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg; cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
-const struct tpacket_auxdata *aux;
-
-if (cmsg->cmsg_level != SOL_PACKET
-|| cmsg->cmsg_type != PACKET_AUXDATA
-|| cmsg->cmsg_len < CMSG_LEN(sizeof(struct tpacket_auxdata))) {
-continue;
+for (i = 0; i < retval; i++) {
+if (mmsgs[i].msg_len < ETH_HEADER_LEN) {
+break;
 }
 
-aux = ALIGNED_CAST(struct tpacket_auxdata *, CMSG_DATA(cmsg));
-if (auxdata_has_vlan_tci(aux)) {
-struct eth_header *eth;
-bool double_tagged;
+dp_packet_set_size(buffers[i],
+   dp_packet_size(buffers[i]) + mmsgs[i].msg_len);
+
+for (cmsg = CMSG_FIRSTHDR(&mmsgs[i].msg_hdr); cmsg;
+ cmsg = CMSG_NXTHDR(&mmsgs[i].msg_hdr, cmsg)) {
+const struct tpacket_auxdata *aux;
 
-if (retval < ETH_HEADER_LEN) {
-return EINVAL;
+