Re: [ovs-dev] [PATCHv2 1/2] userspace: Enable TSO support for non-DPDK.

2020-02-20 Thread William Tu
On Thu, Feb 20, 2020 at 4:50 PM Yi Yang (杨燚)-云服务集团  wrote:
>
> William, which kernel version did you use to test for this patch? I don't
> want to build a kernel if Ubuntu 16.04 kernel can work.
>

I don't think this patch requires some new kernel feature.
I'm using ubuntu 1604.
William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] 答复: 答复: [PATCH v4 0/3] Add support for TSO with DPDK

2020-02-20 Thread 杨燚
Very weird, I built 4.15.9, it is from upstream kernel, the result is same, 
what's wrong? I can't understand. I directly used current ovs master for this 
time.

$ ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 54078 connected to 10.15.1.3 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  4]   0.00-10.00  sec  6.01 MBytes  5.04 Mbits/sec  1688   5.66 KBytes
[  4]  10.00-20.00  sec  6.17 MBytes  5.17 Mbits/sec  1725   7.07 KBytes
[  4]  20.00-30.00  sec  6.51 MBytes  5.46 Mbits/sec  1828   5.66 KBytes
[  4]  30.00-40.00  sec  5.58 MBytes  4.68 Mbits/sec  1509   7.07 KBytes
[  4]  40.00-50.00  sec  4.83 MBytes  4.05 Mbits/sec  1182   7.07 KBytes
[  4]  50.00-60.00  sec  4.49 MBytes  3.77 Mbits/sec  1110   5.66 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval   Transfer Bandwidth   Retr
[  4]   0.00-60.00  sec  33.6 MBytes  4.70 Mbits/sec  9042 sender
[  4]   0.00-60.00  sec  33.5 MBytes  4.69 Mbits/sec  receiver

Server output:
Accepted connection from 10.15.1.2, port 54076
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 54078
[ ID] Interval   Transfer Bandwidth
[  5]   0.00-10.00  sec  5.89 MBytes  4.94 Mbits/sec
[  5]  10.00-20.00  sec  6.23 MBytes  5.22 Mbits/sec
[  5]  20.00-30.00  sec  6.46 MBytes  5.42 Mbits/sec
[  5]  30.00-40.00  sec  5.62 MBytes  4.71 Mbits/sec
[  5]  40.00-50.00  sec  4.83 MBytes  4.05 Mbits/sec
[  5]  50.00-60.00  sec  4.45 MBytes  3.73 Mbits/sec


iperf Done.
$ uname -a
Linux cmp008 4.15.9 #1 SMP Fri Feb 21 09:27:41 UTC 2020 x86_64 x86_64 x86_64 
GNU/Linux
eipadmin@cmp008:~$
eipadmin@cmp008:~$ cd ovs-master/
eipadmin@cmp008:~/ovs-master$ git log
commit ac23d20fc90da3b1c9b2117d1e22102e99fba006
Author: Yi-Hung Wei 
Date:   Fri Feb 7 14:55:06 2020 -0800

conntrack: Fix TCP conntrack state

If a TCP connection is in SYN_SENT state, receiving another SYN packet
would just renew the timeout of that conntrack entry rather than create
a new one.  Thus, tcp_conn_update() should return CT_UPDATE_VALID_NEW.

This also fixes regressions of a couple of  OVN system tests.

Fixes: a867c010ee91 ("conntrack: Fix conntrack new state")
Reported-by: Dumitru Ceara 
Signed-off-by: Yi-Hung Wei 
Tested-by: Dumitru Ceara 
Signed-off-by: William Tu 

commit 486139d9e4b81dae04b2bb7487d45366865ac0ad
Author: Tomasz Konieczny 
Date:   Wed Feb 12 14:15:56 2020 +0100

docs: Update DPDK version table

Signed-off-by: Tomasz Konieczny 
Acked-by: Flavio Leitner 
Acked-by: Kevin Traynor 
Signed-off-by: Ian Stokes 

commit 9efbdaa201530ab7023a69176aba54c32c468efb
Author: Ben Pfaff 
Date:   Thu Feb 13 16:27:01 2020 -0800

Set release date for 2.13.0.

The "Valentine's Day" release.

Acked-by: Flavio Leitner 
Signed-off-by: Ben Pfaff 

commit 19e99c83bb4da4617730f20392515d8aca5b61ba
Author: Yi-Hung Wei 
$

-邮件原件-
发件人: Flavio Leitner [mailto:f...@sysclose.org] 
发送时间: 2020年2月20日 21:41
收件人: Yi Yang (杨燚)-云服务集团 
抄送: pkusunyif...@gmail.com; d...@openvswitch.org; i.maxim...@ovn.org; 
txfh2...@aliyun.com
主题: Re: 答复: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

On Thu, Feb 20, 2020 at 10:10:36AM +, Yi Yang (杨 D)-云服务集团 wrote:
> Hi, Flavio
> 
> I find this tso feature doesn't work normally on my Ubuntu 16.04, here 
> is my result. My kernel version is
> 
> $ uname -a
> Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 
> 09:03:09 UTC
> 2019 x86_64 x86_64 x86_64 GNU/Linux
> $

I tested with 4.15.0 upstream and it worked. Can you do the same?

> $ ./run-iperf3.sh
> Connecting to host 10.15.1.3, port 5201 [  4] local 10.15.1.2 port 
> 56466 connected to 10.15.1.3 port 5201
> [ ID] Interval   Transfer Bandwidth   Retr  Cwnd
> [  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
> [  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
> [  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
> [  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
> [  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
> [  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval   Transfer Bandwidth   Retr
> [  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690
> sender
> [  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec
> receiver

That looks like TSO packets are being dropped and the traffic is basically TCP 
retransmissions of MTU size.

fbl


> 
> Server output:
> Accepted connection from 10.15.1.2, port 56464 [  5] local 10.15.1.3 
> port 5201 connected to 10.15.1.2 port 56466
> [ ID] Interval   Transfer Bandwidth
> [  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
> [  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec [  5]  
> 20.00-30.00  sec  7.73 MByt

[ovs-dev] 答复: [PATCHv2 1/2] userspace: Enable TSO support for non-DPDK.

2020-02-20 Thread 杨�D
William, which kernel version did you use to test for this patch? I don't
want to build a kernel if Ubuntu 16.04 kernel can work.

-邮件原件-
发件人: dev [mailto:ovs-dev-boun...@openvswitch.org] 代表 William Tu
发送时间: 2020年2月21日 3:00
收件人: d...@openvswitch.org
抄送: f...@sysclose.org; i.maxim...@ovn.org
主题: [ovs-dev] [PATCHv2 1/2] userspace: Enable TSO support for non-DPDK.

This patch enables TSO support for non-DPDK use cases, and also add
check-system-tso testsuite. Before TSO, we have to disable checksum offload,
allowing the kernel to calculate the TCP/UDP packet checsum. With TSO, we
can skip the checksum validation by enabling checksum offload, and with
large packet size, we see better performance.

Consider container to container use cases:
  iperf3 -c (ns0) -> veth peer -> OVS -> veth peer -> iperf3 -s (ns1) And I
got around 6Gbps, similar to TSO with DPDK-enabled.

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/653109097
Signed-off-by: William Tu 
---
v2:
  - add make check-system-tso test
  - combine logging for dpdk and non-dpdk
  - I'm surprised that most of the test cases passed.
This is due to few tests using tcp/udp, so it does not trigger
TSO.  I saw only geneve/vxlan fails randomly, maybe we can
check it later.
---
 lib/dp-packet.h   | 95
++-
 lib/userspace-tso.c   |  5 ---
 tests/.gitignore  |  3 ++
 tests/automake.mk | 15 +++
 tests/system-tso-macros.at| 42 +++
 tests/system-tso-testsuite.at | 26 
 6 files changed, 143 insertions(+), 43 deletions(-)  create mode 100644
tests/system-tso-macros.at  create mode 100644 tests/system-tso-testsuite.at

diff --git a/lib/dp-packet.h b/lib/dp-packet.h index
9f8991faad52..6b90cec2afb4 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -53,7 +53,25 @@ enum OVS_PACKED_ENUM dp_packet_source {  enum
dp_packet_offload_mask {
 DP_PACKET_OL_RSS_HASH_MASK  = 0x1, /* Is the 'rss_hash' valid? */
 DP_PACKET_OL_FLOW_MARK_MASK = 0x2, /* Is the 'flow_mark' valid? */
+DP_PACKET_OL_RX_L4_CKSUM_BAD = 1 << 3,
+DP_PACKET_OL_RX_IP_CKSUM_BAD = 1 << 4,
+DP_PACKET_OL_RX_L4_CKSUM_GOOD = 1 << 5,
+DP_PACKET_OL_RX_IP_CKSUM_GOOD = 1 << 6,
+DP_PACKET_OL_TX_TCP_SEG = 1 << 7,
+DP_PACKET_OL_TX_IPV4 = 1 << 8,
+DP_PACKET_OL_TX_IPV6 = 1 << 9,
+DP_PACKET_OL_TX_TCP_CKSUM = 1 << 10,
+DP_PACKET_OL_TX_UDP_CKSUM = 1 << 11,
+DP_PACKET_OL_TX_SCTP_CKSUM = 1 << 12,
 };
+
+#define DP_PACKET_OL_TX_L4_MASK (DP_PACKET_OL_TX_TCP_CKSUM | \
+ DP_PACKET_OL_TX_UDP_CKSUM | \
+ DP_PACKET_OL_TX_SCTP_CKSUM) #define 
+DP_PACKET_OL_RX_IP_CKSUM_MASK (DP_PACKET_OL_RX_IP_CKSUM_GOOD | \
+   DP_PACKET_OL_RX_IP_CKSUM_BAD) 
+#define DP_PACKET_OL_RX_L4_CKSUM_MASK (DP_PACKET_OL_RX_L4_CKSUM_GOOD | \
+   DP_PACKET_OL_RX_L4_CKSUM_BAD)
 #else
 /* DPDK mbuf ol_flags that are not really an offload flags.  These are
mostly
  * related to mbuf memory layout and OVS should not touch/clear them. */ @@
-739,82 +757,79 @@ dp_packet_set_allocated(struct dp_packet *b, uint16_t s)
 b->allocated_ = s;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */  static
inline bool -dp_packet_hwol_is_tso(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_is_tso(const struct dp_packet *b)
 {
-return false;
+return !!(b->ol_flags & DP_PACKET_OL_TX_TCP_SEG);
 }
 
-/* There are no implementation when not DPDK enabled datapath. */  static
inline bool -dp_packet_hwol_is_ipv4(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_is_ipv4(const struct dp_packet *b)
 {
-return false;
+return !!(b->ol_flags & DP_PACKET_OL_TX_IPV4);
 }
 
-/* There are no implementation when not DPDK enabled datapath. */  static
inline uint64_t -dp_packet_hwol_l4_mask(const struct dp_packet *b
OVS_UNUSED)
+dp_packet_hwol_l4_mask(const struct dp_packet *b)
 {
-return 0;
+return b->ol_flags & DP_PACKET_OL_TX_L4_MASK;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */  static
inline bool -dp_packet_hwol_l4_is_tcp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_tcp(const struct dp_packet *b)
 {
-return false;
+return (b->ol_flags & DP_PACKET_OL_TX_L4_MASK) ==
+DP_PACKET_OL_TX_TCP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */  static
inline bool -dp_packet_hwol_l4_is_udp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_udp(const struct dp_packet *b)
 {
-return false;
+return (b->ol_flags & DP_PACKET_OL_TX_L4_MASK) ==
+DP_PACKET_OL_TX_UDP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */  static
inline bool -dp_packet_hwol_l4_is_sctp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_sctp(const struct dp_packet *b)
 {
-return false;
+return (b-

[ovs-dev] 答复: 答复: [PATCH v4 0/3] Add support for TSO with DPDK

2020-02-20 Thread 杨燚
No, I didn't use VMs, just veth in netns, I doubt it is Ubuntu kernel bug.

-邮件原件-
发件人: William Tu [mailto:u9012...@gmail.com] 
发送时间: 2020年2月21日 3:21
收件人: Yi Yang (杨燚)-云服务集团 
抄送: f...@sysclose.org; pkusunyif...@gmail.com; d...@openvswitch.org; 
i.maxim...@ovn.org; txfh2...@aliyun.com
主题: Re: [ovs-dev] 答复: [PATCH v4 0/3] Add support for TSO with DPDK

On Thu, Feb 20, 2020 at 2:12 AM Yi Yang (杨燚)-云服务集团  wrote:
>
> Hi, Flavio
>
> I find this tso feature doesn't work normally on my Ubuntu 16.04, here 
> is my result. My kernel version is

Hi Yiyang,

I'm so confused with your description. Which case does not work for you?
Yifeng and Flavio were using OVS-DPDK with vhostuser to VM, is this the case 
you're talking about?

>
> $ uname -a
> Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 
> 09:03:09 UTC
> 2019 x86_64 x86_64 x86_64 GNU/Linux
> $
>
> $ ./run-iperf3.sh

Which case is this one?

> Connecting to host 10.15.1.3, port 5201 [  4] local 10.15.1.2 port 
> 56466 connected to 10.15.1.3 port 5201
> [ ID] Interval   Transfer Bandwidth   Retr  Cwnd
> [  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
> [  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
> [  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
> [  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
> [  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
> [  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval   Transfer Bandwidth   Retr
> [  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690
> sender
> [  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec
> receiver
>
> Server output:
> Accepted connection from 10.15.1.2, port 56464 [  5] local 10.15.1.3 
> port 5201 connected to 10.15.1.2 port 56466
> [ ID] Interval   Transfer Bandwidth
> [  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
> [  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec [  5]  
> 20.00-30.00  sec  7.73 MBytes  6.48 Mbits/sec [  5]  30.00-40.00  sec  
> 7.79 MBytes  6.53 Mbits/sec [  5]  40.00-50.00  sec  7.79 MBytes  6.53 
> Mbits/sec [  5]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec
>
>
> iperf Done.
> $
>
> But it does work for tap, I'm not sure if it is a kernel issue, which 
> kernel
  ^^^
So which case does not work?

> version are you using? I didn't use tpacket_v3 patch. Here is my local 
> ovs info.

William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVN 20.03.0 Release: Final call for patches

2020-02-20 Thread Han Zhou
On Thu, Feb 20, 2020 at 11:54 AM Mark Michelson  wrote:
>
> Hi everyone,
>
> We've had the OVN 20.03 branch created for a couple of weeks now, and
> we've seen some important issues found and fixed since then. Great job!
>
> As the branch becomes more stable, we look at the possibility of
> creating the 20.03.0 release. I think we can make it at the end of next
> week, 28 February.
>
> If there are any outstanding issues in the 20.03 branch that need
> addressing before the release of 20.03.0, please bring them up in reply
> to this thread. If you have patches that are targeted to 20.03.0 that
> have not been merged (or have been merged to master but have not been
> backported to 20.03) then please bring them up as well.
>
> Thanks,
> Mark Michelson
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Hi Mark,

Requested by ovn-k8s team, the two patches need to be backported to 20.03:

eb9a406 ovn-controller: Avoid adding neighbour flows for non-local
datapaths.
f896912 ovn-controller: Avoid creating patch port for non-local datapaths.

Please let me know if there is any concern.

Thanks,
Han
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] conntrack: Fix conntrack new state

2020-02-20 Thread Yi-Hung Wei
On Thu, Feb 20, 2020 at 4:42 AM Roni Bar Yanai  wrote:
>
> Hi Yi-Hung,
>
> From the man page, it looks like the +est is the result of committing action 
> and not related to the TCP state machine itself.
> I'm aware that this is not how it works in kernel conntrack, if you want to 
> align both , the description might need an update.
>
> new (0x01)
>  A new connection. Set to 1 if this is an uncommitted con‐
>  nection.
>
>   est (0x02)
>  Part of an existing connection. Set to 1  if  this  is  a
>  committed connection.
>
Hi Roni,

Thanks for bringing this up. I will send a patch to update the documentation.

Thanks,

-Yi-Hung
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] 答复: [PATCH v2] netdev-linux: Prepend the std packet in the TSO packet

2020-02-20 Thread Flavio Leitner
On Thu, Feb 20, 2020 at 11:39:21AM -0800, William Tu wrote:
> On Tue, Feb 4, 2020 at 9:12 AM Flavio Leitner  wrote:
> >
> > On Tue, Feb 04, 2020 at 12:00:19PM -0300, Flavio Leitner wrote:
> > > On Tue, Feb 04, 2020 at 12:51:24AM +, Yi Yang (杨�D)-云服务集团 wrote:
> > > > Hi, Flavio
> > The code is in my github branch:
> > https://github.com/fleitner/ovs/tree/tso-tap-enable-tx-v1
> >
> > commit 884371df3bf3df836d4c2ab2d62b420339691fe8
> > Author: Flavio Leitner 
> > Date:   Tue Feb 4 11:18:49 2020 -0300
> >
> > netdev-linux: Enable TSO in the TAP device.
> >
> > Use ioctl TUNSETOFFLOAD if kernel supports to enable TSO
> > offloading in the tap device.
> >
> > Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> > Signed-off-by: Flavio Leitner 
> >
> Hi Flavio,
> Are you sending out this patch?
> I've tested it and looks good, improve performance a lot.

Hi William,

I was waiting for the test feedback, so thanks for providing it.
I will rebase and propose it.

-- 
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] OVN 20.03.0 Release: Final call for patches

2020-02-20 Thread Mark Michelson

Hi everyone,

We've had the OVN 20.03 branch created for a couple of weeks now, and 
we've seen some important issues found and fixed since then. Great job!


As the branch becomes more stable, we look at the possibility of 
creating the 20.03.0 release. I think we can make it at the end of next 
week, 28 February.


If there are any outstanding issues in the 20.03 branch that need 
addressing before the release of 20.03.0, please bring them up in reply 
to this thread. If you have patches that are targeted to 20.03.0 that 
have not been merged (or have been merged to master but have not been 
backported to 20.03) then please bring them up as well.


Thanks,
Mark Michelson

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-controller: Avoid creating patch port for non-local datapaths.

2020-02-20 Thread Han Zhou
On Wed, Feb 19, 2020 at 4:45 AM Dumitru Ceara  wrote:
>
> On 2/18/20 11:26 PM, Han Zhou wrote:
> > When external-ids:ovn-monitor-all is set to true, patch ports for
> > non-local datapaths may be created, which is unnecessary and
> > confusing.  This patch avoids that by checking if a localnet port
> > belongs to local datapaths before creating the patch port.  It
> > also moves patch_run() in mainloop after engine_run(), so that
> > local_datapaths already reflects the up-to-date situation when
> > patch_run() is called.
> >
> > Reported-by: Girish Moodalbail 
> > Signed-off-by: Han Zhou 
>
> Hi Han,
>
> Looks good to me.
>
> Acked-by: Dumitru Ceara 
>
> Thanks,
> Dumitru
>
Thanks for the review. I applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-controller.c: Avoid adding neighbour flows for non-local datapaths.

2020-02-20 Thread Han Zhou
On Thu, Feb 20, 2020 at 1:58 AM Dumitru Ceara  wrote:
>
> On 2/19/20 6:36 PM, Han Zhou wrote:
> >
> >
> > On Wed, Feb 19, 2020 at 4:50 AM Dumitru Ceara  > > wrote:
> >>
> >> On 2/19/20 12:32 AM, Han Zhou wrote:
> >> > This is usefule when external_ids:ovn-monitor-all is set to true.
> >> >
> >> > Signed-off-by: Han Zhou mailto:hz...@ovn.org>>
> >>
> >> Hi Han,
> >>
> >> Looks good to me.
> >>
> >> Acked-by: Dumitru Ceara mailto:dce...@redhat.com>>
> >>
> >> I also tested this (together with your previous patch) on a scaled
setup
> >> with 150 ovn-fake-multinode nodes and ovn-monitor-all enabled.
> >>
> >> With OVN master I see high CPU usage on ovn-controllers from time to
time:
> >>
> >>
> >
ovn-netlab1-64/ovn-controller.log:2020-02-19T12:14:11.896Z|00017|timeval|WARN|Unreasonably
> >> long 1087ms poll interval (224ms user, 14ms system)
> >>
> >
ovn-netlab1-140/ovn-controller.log:2020-02-19T12:14:12.030Z|00017|timeval|WARN|Unreasonably
> >> long 1055ms poll interval (241ms user, 11ms system)
> >>
> >
ovn-netlab1-69/ovn-controller.log:2020-02-19T12:14:11.856Z|00017|timeval|WARN|Unreasonably
> >> long 1019ms poll interval (221ms user, 1ms system)
> >>
> >
ovn-netlab1-25/ovn-controller.log:2020-02-19T12:14:11.857Z|00017|timeval|WARN|Unreasonably
> >> long 1053ms poll interval (230ms user, 9ms system)
> >>
> >
ovn-netlab1-48/ovn-controller.log:2020-02-19T12:14:11.827Z|00017|timeval|WARN|Unreasonably
> >> long 1005ms poll interval (245ms user, 22ms system)
> >>
> >
ovn-netlab1-80/ovn-controller.log:2020-02-19T12:14:11.936Z|00017|timeval|WARN|Unreasonably
> >> long 1127ms poll interval (218ms user, 2ms system)
> >>
> >
ovn-netlab1-56/ovn-controller.log:2020-02-19T12:14:01.202Z|00017|timeval|WARN|Unreasonably
> >> long 1016ms poll interval (224ms user, 0ms system)
> >>
> >
ovn-netlab1-24/ovn-controller.log:2020-02-19T12:14:22.623Z|00017|timeval|WARN|Unreasonably
> >> long 1022ms poll interval (227ms user, 1ms system)
> >>
> >
ovn-netlab1-65/ovn-controller.log:2020-02-19T12:13:19.585Z|00017|timeval|WARN|Unreasonably
> >> long 1012ms poll interval (213ms user, 1ms system)
> >>
> >
ovn-netlab1-46/ovn-controller.log:2020-02-19T12:14:11.893Z|00017|timeval|WARN|Unreasonably
> >> long 1086ms poll interval (225ms user, 0ms system)
> >>
> >
ovn-netlab1-21/ovn-controller.log:2020-02-19T12:13:19.586Z|00017|timeval|WARN|Unreasonably
> >> long 1031ms poll interval (222ms user, 0ms system)
> >>
> >> With your changes this happens less often:
> >>
> >
./localhost/ovn-netlab1-63/ovn-controller.log:2020-02-19T12:46:10.204Z|00017|timeval|WARN|Unreasonably
> >> long 1038ms poll interval (223ms user, 1ms system)
> >>
> >
./localhost/ovn-netlab1-67/ovn-controller.log:2020-02-19T12:45:59.677Z|00017|timeval|WARN|Unreasonably
> >> long 1033ms poll interval (215ms user, 0ms system)
> >>
> >
./localhost/ovn-netlab1-96/ovn-controller.log:2020-02-19T12:46:10.261Z|00017|timeval|WARN|Unreasonably
> >> long 1009ms poll interval (219ms user, 1ms system)
> >>
> >
./localhost/ovn-netlab1-43/ovn-controller.log:2020-02-19T12:46:10.194Z|00017|timeval|WARN|Unreasonably
> >> long 1044ms poll interval (222ms user, 0ms system)
> >>
> >
./localhost/ovn-netlab1-58/ovn-controller.log:2020-02-19T12:46:10.253Z|00017|timeval|WARN|Unreasonably
> >> long 1091ms poll interval (225ms user, 12ms system)
> >>
> >
./localhost/ovn-netlab1-95/ovn-controller.log:2020-02-19T12:46:10.246Z|00017|timeval|WARN|Unreasonably
> >> long 1031ms poll interval (216ms user, 16ms system)
> >>
> >>
> >> Regards,
> >> Dumitru
> >>
> > Thanks Dumitru for reviewing and testing it out.
> > Are you seeing high CPU only after applying this patch? In theory I
> > think this patch should not contribute to CPU spike.
> > Enabling ovn-monitor-all can result in higher CPU in ovn-controller in
> > circumstances when not all datapaths are local. In your test case, is
> > the topology ideal for ovn-monitor-all? I.e. does each node cares about
> > all datapaths? If the answer is yes, then could you try enabling
> > ovn-monitor-all only on half of the nodes, and see if the nodes with
> > ovn-monitor-all enabled are with higher CPU than others?
> >
>
> Hi Han,
>
> In my test topology all datapaths are local (i.e., all logical switches
> are connected to a single cluster logical router).
>
> The test machine I used initially was oversubscribed so I ran the tests
> again on a setup with more physical machines:
>
> 1. With OVN master, ovn-monitor-all=false, bringing up 300 nodes (300
> logical switches + one VIF per switch):
> - SB DB CPU usage is high after a certain number of nodes come up.
> Running perf on the setup points to ovsdb_monitor_get_update that takes
> up to 70% CPU time (including children). This due to each ovn-controller
> subscribing to OVSDB updates for all datapaths individually.
> - ovn-controller CPU usage is normal, i.e., no visible CPU spikes.
>
> 2. With OVN master, ovn-monitor-all=true, bringing up 300 nodes:
> - SB DB CPU usage is low, no visible CP

Re: [ovs-dev] 答复: [PATCH v2] netdev-linux: Prepend the std packet in the TSO packet

2020-02-20 Thread William Tu
On Tue, Feb 4, 2020 at 9:12 AM Flavio Leitner  wrote:
>
> On Tue, Feb 04, 2020 at 12:00:19PM -0300, Flavio Leitner wrote:
> > On Tue, Feb 04, 2020 at 12:51:24AM +, Yi Yang (杨�D)-云服务集团 wrote:
> > > Hi, Flavio
> The code is in my github branch:
> https://github.com/fleitner/ovs/tree/tso-tap-enable-tx-v1
>
> commit 884371df3bf3df836d4c2ab2d62b420339691fe8
> Author: Flavio Leitner 
> Date:   Tue Feb 4 11:18:49 2020 -0300
>
> netdev-linux: Enable TSO in the TAP device.
>
> Use ioctl TUNSETOFFLOAD if kernel supports to enable TSO
> offloading in the tap device.
>
> Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> Signed-off-by: Flavio Leitner 
>
Hi Flavio,
Are you sending out this patch?
I've tested it and looks good, improve performance a lot.
William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 0/2] Support for non-consecutive numa nodes and core ids.

2020-02-20 Thread David Christensen

Ovs-numa currently makes the assumption that numa node ids and cpu core ids
will be numbered consecutively. Current Power systems don't always follow this
model. Furthermore, cpus on Power may be on/off lined based the setting of
Simultaneous multithreading (SMT). The result can be gaps in the numbering of
the cores. For example, a 2 socket system with 20 Core(s) per socket configured
with 4 thread per core (smt=4) has the following topology:

NUMA node0 CPU(s):   0-79
NUMA node8 CPU(s):   80-159

When set to smt=2 the following topology is created.

NUMA node0 CPU(s): 0,1,4,5,8,9,12,13,16,17,20,21,24,25,28,29,32,33,36,37,40,
41,44,45,48,49,52,53,56,57,60,61,64,65,68,69,72,73,76,77
NUMA node8 CPU(s): 80,81,84,85,88,89,92,93,96,97,100,101,104,105,108,109,112,
113,116,117,120,121,124,125,128,129,132,133,136,137,140,
141,144,145,148,149,152,153,156,157

The first patch in this series allows ovs-numa to work with non-consecutive
node and cpu ids. I updated pmd and dpif-netdev multi-node tests to simulate
a numa topology with non-consecutive node ids.

The second patch in the series updates lib/dpdk:construct_dpdk_socket_mem()
to correctly build the EAL options: --socket-mem and --socket-limit on systems
with non-consecutive node ids.

v2 changes:
0-day Robot suggested changes.
v3 changes:
re-wrote cpu_detected() to address memory leak.



Tested-by: David Christensen 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v1] northd: Allow /64 after ipv6_prefix

2020-02-20 Thread Russell Bryant
On Thu, Feb 20, 2020 at 10:46 AM Numan Siddique  wrote:

> On Wed, Feb 19, 2020 at 9:27 PM Russell Bryant  wrote:
> >
> > We recently hit a bug in ovn-kubernetes, where I accidentally added
> > /64 at the end of ipv6_prefix, to match the format we used for the
> > subnet option for IPv4.  This was not allowed.
> >
> > This patch update ovn-northd to take the ipv6_prefix either with or
> > without a trailing "/64".  It still enforces a /64 CIDR prefix length.
> >
> > A test case was updated to ensure that a prefix with "/64" is now
> > accepted.
> >
> > Signed-off-by: Russell Bryant 
>
> With the below check patch warnings fixed
>
> Acked-by: Numan Siddique 
>
>
Thanks!  I fixed the line length issues and pushed to master.



> 
> WARNING: Line is 82 characters long (recommended limit is 79)
> #40 FILE: northd/ovn-northd.c:676:
> VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s: %s",
> ipv6_prefix, error);
>
> WARNING: Line is 88 characters long (recommended limit is 79)
> #48 FILE: northd/ovn-northd.c:684:
> VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s: must be
> /64", ipv6_prefix);
>
> 
>
> Thanks
> Numan
>
> > ---
> >  northd/ovn-northd.c | 31 +--
> >  tests/ovn.at|  4 +++-
> >  2 files changed, 32 insertions(+), 3 deletions(-)
> >
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 2580b4ec9..59d085aa9 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -664,8 +664,35 @@ init_ipam_info_for_datapath(struct ovn_datapath *od)
> >  const char *ipv6_prefix = smap_get(&od->nbs->other_config,
> "ipv6_prefix");
> >
> >  if (ipv6_prefix) {
> > -od->ipam_info.ipv6_prefix_set = ipv6_parse(
> > -ipv6_prefix, &od->ipam_info.ipv6_prefix);
> > +if (strstr(ipv6_prefix, "/")) {
> > +/* If a prefix length was specified, it must be 64. */
> > +struct in6_addr mask;
> > +char *error
> > += ipv6_parse_masked(ipv6_prefix,
> > +&od->ipam_info.ipv6_prefix, &mask);
> > +if (error) {
> > +static struct vlog_rate_limit rl
> > += VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s: %s",
> ipv6_prefix, error);
> > +free(error);
> > +} else {
> > +if (ipv6_count_cidr_bits(&mask) == 64) {
> > +od->ipam_info.ipv6_prefix_set = true;
> > +} else {
> > +static struct vlog_rate_limit rl
> > += VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s: must be
> /64", ipv6_prefix);
> > +}
> > +}
> > +} else {
> > +od->ipam_info.ipv6_prefix_set = ipv6_parse(
> > +ipv6_prefix, &od->ipam_info.ipv6_prefix);
> > +if (!od->ipam_info.ipv6_prefix_set) {
> > +static struct vlog_rate_limit rl
> > += VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s", ipv6_prefix);
> > +}
> > +}
> >  }
> >
> >  if (!subnet_str) {
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 254645a3a..cbaa6d4a2 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -12289,8 +12289,10 @@ ovn-nbctl set Logical_Switch ls1 \
> >  other_config:subnet=10.1.0.0/24
> other_config:ipv6_prefix="2001:db8:1::"
> >  ovn-nbctl set Logical_Switch ls2 \
> >  other_config:subnet=10.2.0.0/24
> other_config:ipv6_prefix="2001:db8:2::"
> > +
> > +# A prefix length may be specified, but only if it is /64.
> >  ovn-nbctl set Logical_Switch ls3 \
> > -other_config:subnet=10.3.0.0/24
> other_config:ipv6_prefix="2001:db8:3::"
> > +other_config:subnet=10.3.0.0/24
> other_config:ipv6_prefix="2001:db8:3::/64"
> >
> >  ovn-nbctl lsp-add ls1 lp1
> >  ovn-nbctl lsp-add ls2 lp2
> > --
> > 2.24.1
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>
>

-- 
Russell Bryant
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] 答复: [PATCH v4 0/3] Add support for TSO with DPDK

2020-02-20 Thread William Tu
On Thu, Feb 20, 2020 at 2:12 AM Yi Yang (杨燚)-云服务集团  wrote:
>
> Hi, Flavio
>
> I find this tso feature doesn't work normally on my Ubuntu 16.04, here is my
> result. My kernel version is

Hi Yiyang,

I'm so confused with your description. Which case does not work for you?
Yifeng and Flavio were using OVS-DPDK with vhostuser to VM, is this
the case you're talking about?

>
> $ uname -a
> Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC
> 2019 x86_64 x86_64 x86_64 GNU/Linux
> $
>
> $ ./run-iperf3.sh

Which case is this one?

> Connecting to host 10.15.1.3, port 5201
> [  4] local 10.15.1.2 port 56466 connected to 10.15.1.3 port 5201
> [ ID] Interval   Transfer Bandwidth   Retr  Cwnd
> [  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
> [  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
> [  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
> [  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
> [  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
> [  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval   Transfer Bandwidth   Retr
> [  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690
> sender
> [  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec
> receiver
>
> Server output:
> Accepted connection from 10.15.1.2, port 56464
> [  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56466
> [ ID] Interval   Transfer Bandwidth
> [  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
> [  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec
> [  5]  20.00-30.00  sec  7.73 MBytes  6.48 Mbits/sec
> [  5]  30.00-40.00  sec  7.79 MBytes  6.53 Mbits/sec
> [  5]  40.00-50.00  sec  7.79 MBytes  6.53 Mbits/sec
> [  5]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec
>
>
> iperf Done.
> $
>
> But it does work for tap, I'm not sure if it is a kernel issue, which kernel
  ^^^
So which case does not work?

> version are you using? I didn't use tpacket_v3 patch. Here is my local ovs
> info.

William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCHv2 1/2] userspace: Enable TSO support for non-DPDK.

2020-02-20 Thread William Tu
This patch enables TSO support for non-DPDK use cases, and
also add check-system-tso testsuite. Before TSO, we have to
disable checksum offload, allowing the kernel to calculate the
TCP/UDP packet checsum. With TSO, we can skip the checksum
validation by enabling checksum offload, and with large packet
size, we see better performance.

Consider container to container use cases:
  iperf3 -c (ns0) -> veth peer -> OVS -> veth peer -> iperf3 -s (ns1)
And I got around 6Gbps, similar to TSO with DPDK-enabled.

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/653109097
Signed-off-by: William Tu 
---
v2:
  - add make check-system-tso test
  - combine logging for dpdk and non-dpdk
  - I'm surprised that most of the test cases passed.
This is due to few tests using tcp/udp, so it does not trigger
TSO.  I saw only geneve/vxlan fails randomly, maybe we can
check it later.
---
 lib/dp-packet.h   | 95 ++-
 lib/userspace-tso.c   |  5 ---
 tests/.gitignore  |  3 ++
 tests/automake.mk | 15 +++
 tests/system-tso-macros.at| 42 +++
 tests/system-tso-testsuite.at | 26 
 6 files changed, 143 insertions(+), 43 deletions(-)
 create mode 100644 tests/system-tso-macros.at
 create mode 100644 tests/system-tso-testsuite.at

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 9f8991faad52..6b90cec2afb4 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -53,7 +53,25 @@ enum OVS_PACKED_ENUM dp_packet_source {
 enum dp_packet_offload_mask {
 DP_PACKET_OL_RSS_HASH_MASK  = 0x1, /* Is the 'rss_hash' valid? */
 DP_PACKET_OL_FLOW_MARK_MASK = 0x2, /* Is the 'flow_mark' valid? */
+DP_PACKET_OL_RX_L4_CKSUM_BAD = 1 << 3,
+DP_PACKET_OL_RX_IP_CKSUM_BAD = 1 << 4,
+DP_PACKET_OL_RX_L4_CKSUM_GOOD = 1 << 5,
+DP_PACKET_OL_RX_IP_CKSUM_GOOD = 1 << 6,
+DP_PACKET_OL_TX_TCP_SEG = 1 << 7,
+DP_PACKET_OL_TX_IPV4 = 1 << 8,
+DP_PACKET_OL_TX_IPV6 = 1 << 9,
+DP_PACKET_OL_TX_TCP_CKSUM = 1 << 10,
+DP_PACKET_OL_TX_UDP_CKSUM = 1 << 11,
+DP_PACKET_OL_TX_SCTP_CKSUM = 1 << 12,
 };
+
+#define DP_PACKET_OL_TX_L4_MASK (DP_PACKET_OL_TX_TCP_CKSUM | \
+ DP_PACKET_OL_TX_UDP_CKSUM | \
+ DP_PACKET_OL_TX_SCTP_CKSUM)
+#define DP_PACKET_OL_RX_IP_CKSUM_MASK (DP_PACKET_OL_RX_IP_CKSUM_GOOD | \
+   DP_PACKET_OL_RX_IP_CKSUM_BAD)
+#define DP_PACKET_OL_RX_L4_CKSUM_MASK (DP_PACKET_OL_RX_L4_CKSUM_GOOD | \
+   DP_PACKET_OL_RX_L4_CKSUM_BAD)
 #else
 /* DPDK mbuf ol_flags that are not really an offload flags.  These are mostly
  * related to mbuf memory layout and OVS should not touch/clear them. */
@@ -739,82 +757,79 @@ dp_packet_set_allocated(struct dp_packet *b, uint16_t s)
 b->allocated_ = s;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_is_tso(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_is_tso(const struct dp_packet *b)
 {
-return false;
+return !!(b->ol_flags & DP_PACKET_OL_TX_TCP_SEG);
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_is_ipv4(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_is_ipv4(const struct dp_packet *b)
 {
-return false;
+return !!(b->ol_flags & DP_PACKET_OL_TX_IPV4);
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline uint64_t
-dp_packet_hwol_l4_mask(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_mask(const struct dp_packet *b)
 {
-return 0;
+return b->ol_flags & DP_PACKET_OL_TX_L4_MASK;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_l4_is_tcp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_tcp(const struct dp_packet *b)
 {
-return false;
+return (b->ol_flags & DP_PACKET_OL_TX_L4_MASK) ==
+DP_PACKET_OL_TX_TCP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_l4_is_udp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_udp(const struct dp_packet *b)
 {
-return false;
+return (b->ol_flags & DP_PACKET_OL_TX_L4_MASK) ==
+DP_PACKET_OL_TX_UDP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_l4_is_sctp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_sctp(const struct dp_packet *b)
 {
-return false;
+return (b->ol_flags & DP_PACKET_OL_TX_L4_MASK) ==
+DP_PACKET_OL_TX_SCTP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline void
-dp_packet_hwol_set_tx_ipv4(struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_set_tx_ipv4(struct dp_packet *b)
 {
+b->ol_flags |= DP_PACKET_OL_TX_IPV4;
 }
 
-/* There are no implementation when not DP

[ovs-dev] [PATCHv2 2/2] tests: Add tests using tap device.

2020-02-20 Thread William Tu
Similar to using veth across namespaces, this patch creates
tap devices, assigns to namespaces, and allows traffic to
go through different test cases.

Signed-off-by: William Tu 
---
 tests/atlocal.in  |  3 +++
 tests/automake.mk |  1 +
 tests/system-tap.at   | 34 ++
 tests/system-tso-testsuite.at |  1 +
 4 files changed, 39 insertions(+)
 create mode 100644 tests/system-tap.at

diff --git a/tests/atlocal.in b/tests/atlocal.in
index 1dc7cd5d087a..a49c5047a0a5 100644
--- a/tests/atlocal.in
+++ b/tests/atlocal.in
@@ -165,6 +165,9 @@ find_command()
 # Set HAVE_NC
 find_command nc
 
+# Set HAVE_TUNCTL
+find_command tunctl
+
 # Determine correct netcat option to quit on stdin EOF
 if nc --version 2>&1 | grep -q nmap.org; then
 # Nmap netcat
diff --git a/tests/automake.mk b/tests/automake.mk
index b8ddc069417e..b6100399f775 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -158,6 +158,7 @@ SYSTEM_USERSPACE_TESTSUITE_AT = \
 
 SYSTEM_TSO_TESTSUITE_AT = \
tests/system-tso-testsuite.at \
+   tests/system-tap.at \
tests/system-tso-macros.at
 
 SYSTEM_AFXDP_TESTSUITE_AT = \
diff --git a/tests/system-tap.at b/tests/system-tap.at
new file mode 100644
index ..3b9015a550bb
--- /dev/null
+++ b/tests/system-tap.at
@@ -0,0 +1,34 @@
+AT_SETUP([traffic between namespaces using tap])
+AT_KEYWORDS([http_tap])
+OVS_TRAFFIC_VSWITCHD_START()
+AT_SKIP_IF([test $HAVE_TUNCTL = no])
+
+AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+AT_CHECK([tunctl -t tap0 1> /dev/null])
+on_exit 'tunctl -d tap0'
+AT_CHECK([tunctl -t tap1 1> /dev/null])
+on_exit 'tunctl -d tap1'
+
+AT_CHECK([ovs-vsctl add-port br0 tap0 -- set int tap0 type=tap])
+AT_CHECK([ovs-vsctl add-port br0 tap1 -- set int tap1 type=tap])
+AT_CHECK([ip link set tap0 netns at_ns0])
+AT_CHECK([ip link set tap1 netns at_ns1])
+
+AT_CHECK([ip netns exec at_ns0 ip link set dev tap0 up])
+AT_CHECK([ip netns exec at_ns1 ip link set dev tap1 up])
+AT_CHECK([ip netns exec at_ns0 ip addr add 10.1.1.1/24 dev tap0])
+AT_CHECK([ip netns exec at_ns1 ip addr add 10.1.1.2/24 dev tap1])
+
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], 
[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+OVS_START_L7([at_ns1], [http])
+NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o 
wget0.log])
+
+OVS_TRAFFIC_VSWITCHD_STOP(["/.*ethtool command ETHTOOL_G.*/d"])
+
+AT_CLEANUP
diff --git a/tests/system-tso-testsuite.at b/tests/system-tso-testsuite.at
index 99d748006a86..594d1a6fde85 100644
--- a/tests/system-tso-testsuite.at
+++ b/tests/system-tso-testsuite.at
@@ -23,4 +23,5 @@ m4_include([tests/system-common-macros.at])
 m4_include([tests/system-userspace-macros.at])
 m4_include([tests/system-tso-macros.at])
 
+m4_include([tests/system-tap.at])
 m4_include([tests/system-traffic.at])
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] openvswitch: Distribute switch variables for initialization

2020-02-20 Thread David Miller
From: Kees Cook 
Date: Wed, 19 Feb 2020 22:23:09 -0800

> Variables declared in a switch statement before any case statements
> cannot be automatically initialized with compiler instrumentation (as
> they are not part of any execution flow). With GCC's proposed automatic
> stack variable initialization feature, this triggers a warning (and they
> don't get initialized). Clang's automatic stack variable initialization
> (via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
> doesn't initialize such variables[1]. Note that these warnings (or silent
> skipping) happen before the dead-store elimination optimization phase,
> so even when the automatic initializations are later elided in favor of
> direct initializations, the warnings remain.
> 
> To avoid these problems, move such variables into the "case" where
> they're used or lift them up into the main function body.
> 
> net/openvswitch/flow_netlink.c: In function ‘validate_set’:
> net/openvswitch/flow_netlink.c:2711:29: warning: statement will never be 
> executed [-Wswitch-unreachable]
>  2711 |  const struct ovs_key_ipv4 *ipv4_key;
>   | ^~~~
> 
> [1] https://bugs.llvm.org/show_bug.cgi?id=44916
> 
> Signed-off-by: Kees Cook 

Applied.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v1] northd: Allow /64 after ipv6_prefix

2020-02-20 Thread Numan Siddique
On Wed, Feb 19, 2020 at 9:27 PM Russell Bryant  wrote:
>
> We recently hit a bug in ovn-kubernetes, where I accidentally added
> /64 at the end of ipv6_prefix, to match the format we used for the
> subnet option for IPv4.  This was not allowed.
>
> This patch update ovn-northd to take the ipv6_prefix either with or
> without a trailing "/64".  It still enforces a /64 CIDR prefix length.
>
> A test case was updated to ensure that a prefix with "/64" is now
> accepted.
>
> Signed-off-by: Russell Bryant 

With the below check patch warnings fixed

Acked-by: Numan Siddique 


WARNING: Line is 82 characters long (recommended limit is 79)
#40 FILE: northd/ovn-northd.c:676:
VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s: %s",
ipv6_prefix, error);

WARNING: Line is 88 characters long (recommended limit is 79)
#48 FILE: northd/ovn-northd.c:684:
VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s: must be
/64", ipv6_prefix);



Thanks
Numan

> ---
>  northd/ovn-northd.c | 31 +--
>  tests/ovn.at|  4 +++-
>  2 files changed, 32 insertions(+), 3 deletions(-)
>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 2580b4ec9..59d085aa9 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -664,8 +664,35 @@ init_ipam_info_for_datapath(struct ovn_datapath *od)
>  const char *ipv6_prefix = smap_get(&od->nbs->other_config, 
> "ipv6_prefix");
>
>  if (ipv6_prefix) {
> -od->ipam_info.ipv6_prefix_set = ipv6_parse(
> -ipv6_prefix, &od->ipam_info.ipv6_prefix);
> +if (strstr(ipv6_prefix, "/")) {
> +/* If a prefix length was specified, it must be 64. */
> +struct in6_addr mask;
> +char *error
> += ipv6_parse_masked(ipv6_prefix,
> +&od->ipam_info.ipv6_prefix, &mask);
> +if (error) {
> +static struct vlog_rate_limit rl
> += VLOG_RATE_LIMIT_INIT(5, 1);
> +VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s: %s", ipv6_prefix, 
> error);
> +free(error);
> +} else {
> +if (ipv6_count_cidr_bits(&mask) == 64) {
> +od->ipam_info.ipv6_prefix_set = true;
> +} else {
> +static struct vlog_rate_limit rl
> += VLOG_RATE_LIMIT_INIT(5, 1);
> +VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s: must be /64", 
> ipv6_prefix);
> +}
> +}
> +} else {
> +od->ipam_info.ipv6_prefix_set = ipv6_parse(
> +ipv6_prefix, &od->ipam_info.ipv6_prefix);
> +if (!od->ipam_info.ipv6_prefix_set) {
> +static struct vlog_rate_limit rl
> += VLOG_RATE_LIMIT_INIT(5, 1);
> +VLOG_WARN_RL(&rl, "bad 'ipv6_prefix' %s", ipv6_prefix);
> +}
> +}
>  }
>
>  if (!subnet_str) {
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 254645a3a..cbaa6d4a2 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -12289,8 +12289,10 @@ ovn-nbctl set Logical_Switch ls1 \
>  other_config:subnet=10.1.0.0/24 other_config:ipv6_prefix="2001:db8:1::"
>  ovn-nbctl set Logical_Switch ls2 \
>  other_config:subnet=10.2.0.0/24 other_config:ipv6_prefix="2001:db8:2::"
> +
> +# A prefix length may be specified, but only if it is /64.
>  ovn-nbctl set Logical_Switch ls3 \
> -other_config:subnet=10.3.0.0/24 other_config:ipv6_prefix="2001:db8:3::"
> +other_config:subnet=10.3.0.0/24 
> other_config:ipv6_prefix="2001:db8:3::/64"
>
>  ovn-nbctl lsp-add ls1 lp1
>  ovn-nbctl lsp-add ls2 lp2
> --
> 2.24.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Check for NULL before accessing ovsrec_open_vswitch row.

2020-02-20 Thread Numan Siddique
On Thu, Feb 20, 2020 at 8:25 PM Dumitru Ceara  wrote:
>
> On 2/20/20 3:43 PM, num...@ovn.org wrote:
> > From: Numan Siddique 
> >
> > There are occasional crashes seen with OpenShift CI
> >
> > 
> > gdb) bt
> > 0  hmap_first_with_hash (hmap=0x128, hmap=0x128, hash=2563384147) at 
> > include/openvswitch/hmap.h:328
> > 1  smap_find__ (smap=0x128, key=key@entry=0x558de1ab278f 
> > "ovn-openflow-probe-interval", key_len=27, hash=2563384147) at 
> > lib/smap.c:402
> > 2  0x558de19f2596 in smap_get_node (smap=, 
> > key=key@entry=0x558de1ab278f "ovn-openflow-probe-interval") at 
> > lib/smap.c:217
> > 3  0x558de19f26cc in smap_get_def (def=0x0, key=0x558de1ab278f 
> > "ovn-openflow-probe-interval", smap=) at lib/smap.c:208
> > 4  smap_get (key=0x558de1ab278f "ovn-openflow-probe-interval", 
> > smap=) at lib/smap.c:200
> > 5  smap_get_int (smap=, key=key@entry=0x558de1ab278f 
> > "ovn-openflow-probe-interval", def=def@entry=5) at lib/smap.c:240
> > 6  0x558de192f1ac in get_ofctrl_probe_interval (ovs_idl= > out>) at controller/ovn-controller.c:439
> > 7  main (argc=12, argv=0x7ffcfaef1e28) at controller/ovn-controller.c:1917
> >
> > *
> >
> > This patch fixes it.
> >
> > Reported-by: Alexander Constantinescu 
> > Signed-off-by: Numan Siddique 
> > ---
> > v1 -> v2
> > ---
> >   * Fixed the incomplete commit message
> >
> >  controller/ovn-controller.c | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 4d245ca28..386c9f006 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -436,9 +436,10 @@ static int
> >  get_ofctrl_probe_interval(struct ovsdb_idl *ovs_idl)
> >  {
> >  const struct ovsrec_open_vswitch *cfg = 
> > ovsrec_open_vswitch_first(ovs_idl);
> > -return smap_get_int(&cfg->external_ids,
> > -"ovn-openflow-probe-interval",
> > -OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
> > +return !cfg ? OFCTRL_DEFAULT_PROBE_INTERVAL_SEC :
> > +  smap_get_int(&cfg->external_ids,
> > +   "ovn-openflow-probe-interval",
> > +   OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
> >  }
> >
> >  /* Retrieves the pointer to the OVN Southbound database from 'ovs_idl' and
> >
>
> Acked-by: Dumitru Ceara 

Thanks Dumitru. I applied this patch to master and branch-20.03

Numan

>
> Thanks,
> Dumitru
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Check for NULL before accessing ovsrec_open_vswitch row.

2020-02-20 Thread Dumitru Ceara
On 2/20/20 3:43 PM, num...@ovn.org wrote:
> From: Numan Siddique 
> 
> There are occasional crashes seen with OpenShift CI
> 
> 
> gdb) bt
> 0  hmap_first_with_hash (hmap=0x128, hmap=0x128, hash=2563384147) at 
> include/openvswitch/hmap.h:328
> 1  smap_find__ (smap=0x128, key=key@entry=0x558de1ab278f 
> "ovn-openflow-probe-interval", key_len=27, hash=2563384147) at lib/smap.c:402
> 2  0x558de19f2596 in smap_get_node (smap=, 
> key=key@entry=0x558de1ab278f "ovn-openflow-probe-interval") at lib/smap.c:217
> 3  0x558de19f26cc in smap_get_def (def=0x0, key=0x558de1ab278f 
> "ovn-openflow-probe-interval", smap=) at lib/smap.c:208
> 4  smap_get (key=0x558de1ab278f "ovn-openflow-probe-interval", 
> smap=) at lib/smap.c:200
> 5  smap_get_int (smap=, key=key@entry=0x558de1ab278f 
> "ovn-openflow-probe-interval", def=def@entry=5) at lib/smap.c:240
> 6  0x558de192f1ac in get_ofctrl_probe_interval (ovs_idl=) 
> at controller/ovn-controller.c:439
> 7  main (argc=12, argv=0x7ffcfaef1e28) at controller/ovn-controller.c:1917
> 
> *
> 
> This patch fixes it.
> 
> Reported-by: Alexander Constantinescu 
> Signed-off-by: Numan Siddique 
> ---
> v1 -> v2
> ---
>   * Fixed the incomplete commit message
> 
>  controller/ovn-controller.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 4d245ca28..386c9f006 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -436,9 +436,10 @@ static int
>  get_ofctrl_probe_interval(struct ovsdb_idl *ovs_idl)
>  {
>  const struct ovsrec_open_vswitch *cfg = 
> ovsrec_open_vswitch_first(ovs_idl);
> -return smap_get_int(&cfg->external_ids,
> -"ovn-openflow-probe-interval",
> -OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
> +return !cfg ? OFCTRL_DEFAULT_PROBE_INTERVAL_SEC :
> +  smap_get_int(&cfg->external_ids,
> +   "ovn-openflow-probe-interval",
> +   OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
>  }
>  
>  /* Retrieves the pointer to the OVN Southbound database from 'ovs_idl' and
> 

Acked-by: Dumitru Ceara 

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] ovn-controller: Check for NULL before accessing ovsrec_open_vswitch row.

2020-02-20 Thread numans
From: Numan Siddique 

There are occasional crashes seen with OpenShift CI


gdb) bt
0  hmap_first_with_hash (hmap=0x128, hmap=0x128, hash=2563384147) at 
include/openvswitch/hmap.h:328
1  smap_find__ (smap=0x128, key=key@entry=0x558de1ab278f 
"ovn-openflow-probe-interval", key_len=27, hash=2563384147) at lib/smap.c:402
2  0x558de19f2596 in smap_get_node (smap=, 
key=key@entry=0x558de1ab278f "ovn-openflow-probe-interval") at lib/smap.c:217
3  0x558de19f26cc in smap_get_def (def=0x0, key=0x558de1ab278f 
"ovn-openflow-probe-interval", smap=) at lib/smap.c:208
4  smap_get (key=0x558de1ab278f "ovn-openflow-probe-interval", smap=) at lib/smap.c:200
5  smap_get_int (smap=, key=key@entry=0x558de1ab278f 
"ovn-openflow-probe-interval", def=def@entry=5) at lib/smap.c:240
6  0x558de192f1ac in get_ofctrl_probe_interval (ovs_idl=) at 
controller/ovn-controller.c:439
7  main (argc=12, argv=0x7ffcfaef1e28) at controller/ovn-controller.c:1917

*

This patch fixes it.

Reported-by: Alexander Constantinescu 
Signed-off-by: Numan Siddique 
---
v1 -> v2
---
  * Fixed the incomplete commit message

 controller/ovn-controller.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 4d245ca28..386c9f006 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -436,9 +436,10 @@ static int
 get_ofctrl_probe_interval(struct ovsdb_idl *ovs_idl)
 {
 const struct ovsrec_open_vswitch *cfg = ovsrec_open_vswitch_first(ovs_idl);
-return smap_get_int(&cfg->external_ids,
-"ovn-openflow-probe-interval",
-OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
+return !cfg ? OFCTRL_DEFAULT_PROBE_INTERVAL_SEC :
+  smap_get_int(&cfg->external_ids,
+   "ovn-openflow-probe-interval",
+   OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
 }
 
 /* Retrieves the pointer to the OVN Southbound database from 'ovs_idl' and
-- 
2.24.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] ovn-controller: Check for NULL before accessing ovsrec_open_vswitch row.

2020-02-20 Thread numans
From: Numan Siddique 

There are occasional crashes seen with OpenShift CI


gdb) bt
*

This patch fixes it.

Reported-by: Alexander Constantinescu 
Signed-off-by: Numan Siddique 
---
 controller/ovn-controller.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 4d245ca28..386c9f006 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -436,9 +436,10 @@ static int
 get_ofctrl_probe_interval(struct ovsdb_idl *ovs_idl)
 {
 const struct ovsrec_open_vswitch *cfg = ovsrec_open_vswitch_first(ovs_idl);
-return smap_get_int(&cfg->external_ids,
-"ovn-openflow-probe-interval",
-OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
+return !cfg ? OFCTRL_DEFAULT_PROBE_INTERVAL_SEC :
+  smap_get_int(&cfg->external_ids,
+   "ovn-openflow-probe-interval",
+   OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
 }
 
 /* Retrieves the pointer to the OVN Southbound database from 'ovs_idl' and
-- 
2.24.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] 答复: [PATCH v4 0/3] Add support for TSO with DPDK

2020-02-20 Thread Flavio Leitner
On Thu, Feb 20, 2020 at 10:10:36AM +, Yi Yang (杨�D)-云服务集团 wrote:
> Hi, Flavio
> 
> I find this tso feature doesn't work normally on my Ubuntu 16.04, here is my
> result. My kernel version is 
> 
> $ uname -a
> Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC
> 2019 x86_64 x86_64 x86_64 GNU/Linux
> $

I tested with 4.15.0 upstream and it worked. Can you do the same?

> $ ./run-iperf3.sh
> Connecting to host 10.15.1.3, port 5201
> [  4] local 10.15.1.2 port 56466 connected to 10.15.1.3 port 5201
> [ ID] Interval   Transfer Bandwidth   Retr  Cwnd
> [  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
> [  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
> [  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
> [  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
> [  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
> [  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval   Transfer Bandwidth   Retr
> [  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690
> sender
> [  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec
> receiver

That looks like TSO packets are being dropped and the traffic is
basically TCP retransmissions of MTU size.

fbl


> 
> Server output:
> Accepted connection from 10.15.1.2, port 56464
> [  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56466
> [ ID] Interval   Transfer Bandwidth
> [  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
> [  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec
> [  5]  20.00-30.00  sec  7.73 MBytes  6.48 Mbits/sec
> [  5]  30.00-40.00  sec  7.79 MBytes  6.53 Mbits/sec
> [  5]  40.00-50.00  sec  7.79 MBytes  6.53 Mbits/sec
> [  5]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec
> 
> 
> iperf Done.
> $
> 
> But it does work for tap, I'm not sure if it is a kernel issue, which kernel
> version are you using? I didn't use tpacket_v3 patch. Here is my local ovs
> info.
> 
> $ git log
> commit 1223cf123ed141c0a0110ebed17572bdb2e3d0f4
> Author: Ilya Maximets 
> Date:   Thu Feb 6 14:24:23 2020 +0100
> 
> netdev-dpdk: Don't enable offloading on HW device if not requested.
> 
> DPDK drivers has different implementations of transmit functions.
> Enabled offloading may cause driver to choose slower variant
> significantly affecting performance if userspace TSO wasn't requested.
> 
> Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> Reported-by: David Marchand 
> Acked-by: David Marchand 
> Acked-by: Flavio Leitner 
> Acked-by: Kevin Traynor 
> Signed-off-by: Ilya Maximets 
> 
> commit 73858f9dbe83daf8cc8d4b604acc23eb62cc3f52
> Author: Flavio Leitner 
> Date:   Mon Feb 3 18:45:50 2020 -0300
> 
> netdev-linux: Prepend the std packet in the TSO packet
> 
> Usually TSO packets are close to 50k, 60k bytes long, so to
> to copy less bytes when receiving a packet from the kernel
> change the approach. Instead of extending the MTU sized
> packet received and append with remaining TSO data from
> the TSO buffer, allocate a TSO packet with enough headroom
> to prepend the std packet data.
> 
> Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> Suggested-by: Ben Pfaff 
> Signed-off-by: Flavio Leitner 
> Signed-off-by: Ben Pfaff 
> 
> commit 2297cbe6cc25b6b1862c499ce8f16f52f75d9e5f
> Author: Flavio Leitner 
> Date:   Mon Feb 3 11:22:22 2020 -0300
> 
> netdev-linux-private: fix max length to be 16 bits
> 
> The dp_packet length is limited to 16 bits, so document that
> and fix the length value accordingly.
> 
> Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> Signed-off-by: Flavio Leitner 
> Signed-off-by: Ben Pfaff 
> 
> commit 3d6a6f450af5b7eaf4b532983cb14458ae792b72
> Author: David Marchand 
> Date:   Tue Feb 4 22:28:26 2020 +0100
> 
> netdev-dpdk: Fix port init when lacking Tx offloads for TSO.
> 
> The check on TSO capability did not ensure ip checksum, tcp checksum and
> TSO tx offloads were available which resulted in a port init failure
> (example below with a ena device):
> 
> *2020-02-04T17:42:52.976Z|00084|dpdk|ERR|Ethdev port_id=0 requested Tx
> offloads 0x2a doesn't match Tx offloads capabilities 0xe in
> rte_eth_dev_configure()*
> 
> Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> 
> Reported-by: Ravi Kerur 
> Signed-off-by: David Marchand 
> Acked-by: Kevin Traynor 
> Acked-by: Flavio Leitner 
> Signed-off-by: Ilya Maximets 
> 
> commit 8e371aa497aa95e3562d53f566c2d634b4b0f589
> Author: Kirill A. Kornilov 
> Date:   Mon Jan 13 12:29:10 2020 +0300
> 
> vswitchd: Add serial number configuration.
> 
> Signed-off-by: Kirill A. Kornilov 
>

Re: [ovs-dev] [PATCH] conntrack: Fix conntrack new state

2020-02-20 Thread Roni Bar Yanai
Hi Yi-Hung,

>From the man page, it looks like the +est is the result of committing action 
>and not related to the TCP state machine itself.
I'm aware that this is not how it works in kernel conntrack, if you want to 
align both , the description might need an update.

new (0x01)
 A new connection. Set to 1 if this is an uncommitted con‐
 nection.

  est (0x02)
 Part of an existing connection. Set to 1  if  this  is  a
 committed connection.

>-Original Message-
>From: dev  On Behalf Of Dumitru Ceara
>Sent: Tuesday, February 4, 2020 10:06 AM
>To: Yi-Hung Wei 
>Cc: ovs dev 
>Subject: Re: [ovs-dev] [PATCH] conntrack: Fix conntrack new state
>
>On 2/4/20 12:48 AM, Yi-Hung Wei wrote:
>> On Fri, Jan 31, 2020 at 2:07 AM Dumitru Ceara  wrote:
>>>
>>> On 1/29/20 7:58 PM, William Tu wrote:
 On Fri, Dec 20, 2019 at 01:16:39PM -0800, Ben Pfaff wrote:
> On Fri, Dec 20, 2019 at 09:51:08AM -0800, Yi-Hung Wei wrote:
>> In connection tracking system, a connection is established if we
>> see packets from both directions.  However, in userspace
>> datapath's conntrack, if we send a connection setup packet in one
>> direction twice, it will make the connection to be in established state.
>>
>> This patch fixes the aforementioned issue, and adds a system
>> traffic test for UDP and TCP traffic to avoid regression.
>>
>> Fixes: a489b16854b59 ("conntrack: New userspace connection
>> tracker.")
>> Signed-off-by: Yi-Hung Wei 
>> ---

 LGTM. I applied this to master, and branch 2.13.
 Thanks
 William

>>>
>>> Hi William, Yi-Hung,
>>>
>>> This patch breaks OVN switch load balancer system tests when running
>>> with the OVS userspace datapath. I only had a glance at the code and
>>> didn't dig into why exactly the tests fail yet but it's quite easy to
>>> reproduce:
>>>
>>> $ cd /tmp/
>>> $ git clone https://github.com/openvswitch/ovs
>>> $ cd /tmp/ovs
>>> $ git checkout a867c010ee9183885ee9d3eb76a0005c075c4d2e
>>> $ ./boot.sh && ./configure && make -j8 $ cd /tmp/ $ git clone
>>> https://github.com/ovn-org/ovn $ cd /tmp/ovn/ $ ./boot.sh &&
>>> ./configure --with-ovs-source=/tmp/ovs --with-ovs-build=/tmp/ovs &&
>>> make -j8 $ make -j8 check-system-userspace TESTSUITEFLAGS="-k ovnlb"
>>> [...]
>>> system-ovn
>>>
>>>   7: ovn -- load-balancing   FAILED
>>> (system-ovn.at:1121)
>>>   8: ovn -- load-balancing - IPv6FAILED
>>> (system-ovn.at:1268)
>>>   9: ovn -- load-balancing - same subnet.FAILED
>>> (system-ovn.at:1389)
>>>  10: ovn -- load-balancing - same subnet. - IPv6 FAILED
>>> (system-ovn.at:1498)
>>>  11: ovn -- load balancing in gateway router ok
>>>  12: ovn -- load balancing in gateway router - IPv6  ok
>>>  13: ovn -- multiple gateway routers, load-balancing ok
>>>  14: ovn -- multiple gateway routers, load-balancing - IPv6 ok
>>>  15: ovn -- load balancing in router with gateway router port ok
>>>  16: ovn -- load balancing in router with gateway router port - IPv6
>>> ok
>>>
>>> # Reverting commit a867c010ee9183885ee9d3eb76a0005c075c4d2e
>>>
>>> $ cd /tmp/ovs
>>> $ git checkout a867c010ee9183885ee9d3eb76a0005c075c4d2e~
>>> $ make -j8
>>> $ cd /tmp/ovn
>>> $ make -j8 check-system-userspace TESTSUITEFLAGS="-k ovnlb"
>>> [...]
>>> system-ovn
>>>
>>>   7: ovn -- load-balancing   ok
>>>   8: ovn -- load-balancing - IPv6ok
>>>   9: ovn -- load-balancing - same subnet.ok
>>>  10: ovn -- load-balancing - same subnet. - IPv6 ok
>>>  11: ovn -- load balancing in gateway router ok
>>>  12: ovn -- load balancing in gateway router - IPv6  ok
>>>  13: ovn -- multiple gateway routers, load-balancing ok
>>>  14: ovn -- multiple gateway routers, load-balancing - IPv6 ok
>>>  15: ovn -- load balancing in router with gateway router port ok
>>>  16: ovn -- load balancing in router with gateway router port - IPv6
>>> ok
>>>
>>> Thanks,
>>> Dumitru
>>
>>
>> Hi Dumitru,
>>
>> Thanks for reporting this issue. I am not familiar with OVN and the
>> broken system tests, but I will take a look.
>>
>> -Yi-Hung
>>
>
>Hi Yi-Hung,
>
>Please let me know if you need help. I can try to find a simpler way to 
>reproduce
>the issue.
>
>Regards,
>Dumitru
>
>___
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] NRL PROGRAM

2020-02-20 Thread Capital Fund Corporation Limited




Good Day,

Capital Fund Corporation Limited are currently offering 95% Non-Recourse
Loan (NRL) with our BG Leased Monetization program for trade
finance,constructions, credit enhancement, government funding, property
investment and all-round range of funding, with 2% interest rate per annul
for a period of 10-years + 1 year grace period with no pre-payment
penalties. .


   TERMS: 2% INTEREST RATE YEARLY
   10 YEARS DURATION
   1M-50B USD / EUR REQUIREMENT:

 1. CLIENT MUST OWN OR REPRESENT A COMPANY.

 2. CLIENT MUST BE ABLE TO PROVIDE BUSINESS PLAN OR EXECUTIVE SUMMARY
OR SUMMARY PURPOSE OF THE LOAN..

Let us know if you are interested in our services so as to provide you
with more information.

Regards,

Maria Rosario Natividad
Financial Consultant.
Capital Fund Corporation Limited
Unit 1506, 10/FTower 2,
Cheung Sha Wan Plaza863 Cheung Sha Wan Road,
Lai Chi Kok Hong Kong

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] 答复: [PATCH v4 0/3] Add support for TSO with DPDK

2020-02-20 Thread 杨�D
Hi, Flavio

I find this tso feature doesn't work normally on my Ubuntu 16.04, here is my
result. My kernel version is 

$ uname -a
Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC
2019 x86_64 x86_64 x86_64 GNU/Linux
$

$ ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 56466 connected to 10.15.1.3 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
[  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
[  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
[  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
[  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
[  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval   Transfer Bandwidth   Retr
[  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690
sender
[  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec
receiver

Server output:
Accepted connection from 10.15.1.2, port 56464
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56466
[ ID] Interval   Transfer Bandwidth
[  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
[  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec
[  5]  20.00-30.00  sec  7.73 MBytes  6.48 Mbits/sec
[  5]  30.00-40.00  sec  7.79 MBytes  6.53 Mbits/sec
[  5]  40.00-50.00  sec  7.79 MBytes  6.53 Mbits/sec
[  5]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec


iperf Done.
$

But it does work for tap, I'm not sure if it is a kernel issue, which kernel
version are you using? I didn't use tpacket_v3 patch. Here is my local ovs
info.

$ git log
commit 1223cf123ed141c0a0110ebed17572bdb2e3d0f4
Author: Ilya Maximets 
Date:   Thu Feb 6 14:24:23 2020 +0100

netdev-dpdk: Don't enable offloading on HW device if not requested.

DPDK drivers has different implementations of transmit functions.
Enabled offloading may cause driver to choose slower variant
significantly affecting performance if userspace TSO wasn't requested.

Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
Reported-by: David Marchand 
Acked-by: David Marchand 
Acked-by: Flavio Leitner 
Acked-by: Kevin Traynor 
Signed-off-by: Ilya Maximets 

commit 73858f9dbe83daf8cc8d4b604acc23eb62cc3f52
Author: Flavio Leitner 
Date:   Mon Feb 3 18:45:50 2020 -0300

netdev-linux: Prepend the std packet in the TSO packet

Usually TSO packets are close to 50k, 60k bytes long, so to
to copy less bytes when receiving a packet from the kernel
change the approach. Instead of extending the MTU sized
packet received and append with remaining TSO data from
the TSO buffer, allocate a TSO packet with enough headroom
to prepend the std packet data.

Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
Suggested-by: Ben Pfaff 
Signed-off-by: Flavio Leitner 
Signed-off-by: Ben Pfaff 

commit 2297cbe6cc25b6b1862c499ce8f16f52f75d9e5f
Author: Flavio Leitner 
Date:   Mon Feb 3 11:22:22 2020 -0300

netdev-linux-private: fix max length to be 16 bits

The dp_packet length is limited to 16 bits, so document that
and fix the length value accordingly.

Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
Signed-off-by: Flavio Leitner 
Signed-off-by: Ben Pfaff 

commit 3d6a6f450af5b7eaf4b532983cb14458ae792b72
Author: David Marchand 
Date:   Tue Feb 4 22:28:26 2020 +0100

netdev-dpdk: Fix port init when lacking Tx offloads for TSO.

The check on TSO capability did not ensure ip checksum, tcp checksum and
TSO tx offloads were available which resulted in a port init failure
(example below with a ena device):

*2020-02-04T17:42:52.976Z|00084|dpdk|ERR|Ethdev port_id=0 requested Tx
offloads 0x2a doesn't match Tx offloads capabilities 0xe in
rte_eth_dev_configure()*

Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")

Reported-by: Ravi Kerur 
Signed-off-by: David Marchand 
Acked-by: Kevin Traynor 
Acked-by: Flavio Leitner 
Signed-off-by: Ilya Maximets 

commit 8e371aa497aa95e3562d53f566c2d634b4b0f589
Author: Kirill A. Kornilov 
Date:   Mon Jan 13 12:29:10 2020 +0300

vswitchd: Add serial number configuration.

Signed-off-by: Kirill A. Kornilov 
Signed-off-by: Ben Pfaff 

I applied your tap patch.

$ git diff
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index c6f3d27..74a5728 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1010,6 +1010,23 @@ netdev_linux_construct_tap(struct netdev *netdev_)
 goto error_close;
 }

+if (userspace_tso_enabled()) {
+/* Old kernels don't support TUNSETOFFLOAD. If TUNSETOFFLOAD is
+ * available, it will return EINVAL when a flag is unknown.
+ 

Re: [ovs-dev] [PATCH ovn] ovn-controller.c: Avoid adding neighbour flows for non-local datapaths.

2020-02-20 Thread Dumitru Ceara
On 2/19/20 6:36 PM, Han Zhou wrote:
> 
> 
> On Wed, Feb 19, 2020 at 4:50 AM Dumitru Ceara  > wrote:
>>
>> On 2/19/20 12:32 AM, Han Zhou wrote:
>> > This is usefule when external_ids:ovn-monitor-all is set to true.
>> >
>> > Signed-off-by: Han Zhou mailto:hz...@ovn.org>>
>>
>> Hi Han,
>>
>> Looks good to me.
>>
>> Acked-by: Dumitru Ceara mailto:dce...@redhat.com>>
>>
>> I also tested this (together with your previous patch) on a scaled setup
>> with 150 ovn-fake-multinode nodes and ovn-monitor-all enabled.
>>
>> With OVN master I see high CPU usage on ovn-controllers from time to time:
>>
>>
> ovn-netlab1-64/ovn-controller.log:2020-02-19T12:14:11.896Z|00017|timeval|WARN|Unreasonably
>> long 1087ms poll interval (224ms user, 14ms system)
>>
> ovn-netlab1-140/ovn-controller.log:2020-02-19T12:14:12.030Z|00017|timeval|WARN|Unreasonably
>> long 1055ms poll interval (241ms user, 11ms system)
>>
> ovn-netlab1-69/ovn-controller.log:2020-02-19T12:14:11.856Z|00017|timeval|WARN|Unreasonably
>> long 1019ms poll interval (221ms user, 1ms system)
>>
> ovn-netlab1-25/ovn-controller.log:2020-02-19T12:14:11.857Z|00017|timeval|WARN|Unreasonably
>> long 1053ms poll interval (230ms user, 9ms system)
>>
> ovn-netlab1-48/ovn-controller.log:2020-02-19T12:14:11.827Z|00017|timeval|WARN|Unreasonably
>> long 1005ms poll interval (245ms user, 22ms system)
>>
> ovn-netlab1-80/ovn-controller.log:2020-02-19T12:14:11.936Z|00017|timeval|WARN|Unreasonably
>> long 1127ms poll interval (218ms user, 2ms system)
>>
> ovn-netlab1-56/ovn-controller.log:2020-02-19T12:14:01.202Z|00017|timeval|WARN|Unreasonably
>> long 1016ms poll interval (224ms user, 0ms system)
>>
> ovn-netlab1-24/ovn-controller.log:2020-02-19T12:14:22.623Z|00017|timeval|WARN|Unreasonably
>> long 1022ms poll interval (227ms user, 1ms system)
>>
> ovn-netlab1-65/ovn-controller.log:2020-02-19T12:13:19.585Z|00017|timeval|WARN|Unreasonably
>> long 1012ms poll interval (213ms user, 1ms system)
>>
> ovn-netlab1-46/ovn-controller.log:2020-02-19T12:14:11.893Z|00017|timeval|WARN|Unreasonably
>> long 1086ms poll interval (225ms user, 0ms system)
>>
> ovn-netlab1-21/ovn-controller.log:2020-02-19T12:13:19.586Z|00017|timeval|WARN|Unreasonably
>> long 1031ms poll interval (222ms user, 0ms system)
>>
>> With your changes this happens less often:
>>
> ./localhost/ovn-netlab1-63/ovn-controller.log:2020-02-19T12:46:10.204Z|00017|timeval|WARN|Unreasonably
>> long 1038ms poll interval (223ms user, 1ms system)
>>
> ./localhost/ovn-netlab1-67/ovn-controller.log:2020-02-19T12:45:59.677Z|00017|timeval|WARN|Unreasonably
>> long 1033ms poll interval (215ms user, 0ms system)
>>
> ./localhost/ovn-netlab1-96/ovn-controller.log:2020-02-19T12:46:10.261Z|00017|timeval|WARN|Unreasonably
>> long 1009ms poll interval (219ms user, 1ms system)
>>
> ./localhost/ovn-netlab1-43/ovn-controller.log:2020-02-19T12:46:10.194Z|00017|timeval|WARN|Unreasonably
>> long 1044ms poll interval (222ms user, 0ms system)
>>
> ./localhost/ovn-netlab1-58/ovn-controller.log:2020-02-19T12:46:10.253Z|00017|timeval|WARN|Unreasonably
>> long 1091ms poll interval (225ms user, 12ms system)
>>
> ./localhost/ovn-netlab1-95/ovn-controller.log:2020-02-19T12:46:10.246Z|00017|timeval|WARN|Unreasonably
>> long 1031ms poll interval (216ms user, 16ms system)
>>
>>
>> Regards,
>> Dumitru
>>
> Thanks Dumitru for reviewing and testing it out.
> Are you seeing high CPU only after applying this patch? In theory I
> think this patch should not contribute to CPU spike.
> Enabling ovn-monitor-all can result in higher CPU in ovn-controller in
> circumstances when not all datapaths are local. In your test case, is
> the topology ideal for ovn-monitor-all? I.e. does each node cares about
> all datapaths? If the answer is yes, then could you try enabling
> ovn-monitor-all only on half of the nodes, and see if the nodes with
> ovn-monitor-all enabled are with higher CPU than others?
> 

Hi Han,

In my test topology all datapaths are local (i.e., all logical switches
are connected to a single cluster logical router).

The test machine I used initially was oversubscribed so I ran the tests
again on a setup with more physical machines:

1. With OVN master, ovn-monitor-all=false, bringing up 300 nodes (300
logical switches + one VIF per switch):
- SB DB CPU usage is high after a certain number of nodes come up.
Running perf on the setup points to ovsdb_monitor_get_update that takes
up to 70% CPU time (including children). This due to each ovn-controller
subscribing to OVSDB updates for all datapaths individually.
- ovn-controller CPU usage is normal, i.e., no visible CPU spikes.

2. With OVN master, ovn-monitor-all=true, bringing up 300 nodes:
- SB DB CPU usage is low, no visible CPU spikes.
- ovn-controller CPU usage is normal as well.

3. With OVN master + your patches, ovn-monitor-all=true, bringing up 300
nodes:
- SB DB CPU usage is low, no visible CPU spikes.
- ovn-controller CPU usage is normal as well.

In conclusion all seems fine to m

Re: [ovs-dev] [PATCH ovn] [RFC] Add chassis liveness monitoring mechanism

2020-02-20 Thread Lucas Alvares Gomes
Thanks for the review Dumitry!

On Thu, Feb 20, 2020 at 9:19 AM Dumitru Ceara  wrote:
>
> On 2/19/20 4:37 PM, lmart...@redhat.com wrote:
> > From: Lucas Alvares Gomes 
> >
> > NOTE: SENDING THIS PATCH AS A RFC TO SEE WHAT OTHERS MIGHT THINK ABOUT
> > THE IDEA PRIOR TO WRITTING TESTS TO IT.
> >
> > CMSes integrating with OVN often uses the nb_cfg mechanism as a way to
> > check the health status of the ovn-controller process but, the current
> > implementation isn't ideal for that purpose because it floods the control
> > plane with update notifications every time the nb_cfg value is
> > incremented.
> >
> > This patch is merging two ideas:
> >
> > 1) Han Zhou proposed a patch [0] creating a table called Chassis_Private
> >where each hypervisor *only* writes and monitors its own record to
> >avoid this flooding problem.
> >
> > 2) Having ovn-controller to periodically log that it's alive instead of
> >relying on the nb_cfg mechanism.
> >
> > By using this mechanism, a CMS can more easily just read this new
> > Chassis_Private table and figure out the status of each Chassis
> > (ovn-controller) in the cluster.
> >
>
> Hi Lucas,
>
> Thanks a lot for working on this!
>
> > Here's some reasons why I believe this approach is better than having
> > to bump the  nb_cfg value:
> >
> > 1) Simple integration. Before, the CMS had to increment the nb_cfg
> >value in the NB_Global table and wait for it to be propagated to the
> >Southbound database (ovn-northd and then ovn-controllers) Chassis
> >table in order to know the status of the Chassis.
> >
> >Now, it can just read the Chassis_Private table and compare the
> >alive_at column against the current time.
>
> I guess the only comment I have about this approach is that it doesn't
> feel completely OK to store a timestamp representation as string in
> "alive_at". I'm afraid this might be too inflexible and force CMSes to
> compare their local time with the value in this column.
>
> I know we discussed offline that using a counter that is periodically
> incremented by ovn-controller instead of a timestamp would complicate
> the implementation in Openstack but maybe other people on this mailing
> list have ideas on how to deal with this in a more generic way.
>

Agreed, let's see if people have more ideas here.

Also, let me expand my explanation a little. The reason why I think
having a counter is not ideal is because services would need to track
this number to know what the value was before and then compare with
the current value to figure out whether it's being incremented or not.

In OpenStack, we have multiple API workers spread across the
controllers nodes and the API request to check for the services'
health status will land on any of those workers. Which means that I
can't keep the track of the counter in-memory. Mostly likely, I would
need to set an IDL event on the field being incremented and store the
status of those chassis somewhere else accessible by all API workers.

The advantage with the nb_cfg mechanism compared with the incremenal
counter is that those values are already in the OVSDB, so we can
compare what's in the NB DB with the SB DB to figure whether the
services are alive or not. But, the price is that we need more code to
deal with waiting for the synchronization of the OVSDBs and, if we
move it to the Chasiss_Private table it won't be backwards compatible.

Therefore I think a timestamp would be better. It's easy to understand
by either a service or a even a person. When u look at the alive_at
field u know the last time the service signilized it was alive. For
CMSes, the service can just read the Chassis_Private table and compare
the alive_at field with the current time to figure the status of the
service (right now it's UTC, but, we can change it to include the TZ
info if people think it's easier). No writes, sync issues and also
it's backwards compatible with the nb_cfg approach.

> >
> > 2) Less costy. Just one read from the db is needed, no writing. Code
> >using the nb_cfg mechanism had to implement a few safe-guard code to
> >make less error prone. See [1] and [2] for example.
> >
> > 3) Backwards compatibility. The patch [0] was moving the nb_cfg value
> >to new table so, systems relying on it would need to update their
> >code upon updating OVN.
>
> Nice!
>
> One more minor comment on the code, inline.
>
> Thanks,
> Dumitru
>
> >
> > To enable this new mechanism set an option called
> > chassis_liveness_interval=N to the "options" column in the NB_Global
> > table, where N is the time (in seconds) in which the ovn-controller
> > should updates the alive_at column. If not set or set to 0, the
> > mechanism is disabled. By default, it's disabled.
> >
> > [0] https://patchwork.ozlabs.org/patch/899608/
> > [1] https://review.opendev.org/#/c/703612/
> > [2] https://review.opendev.org/#/c/704530/
> >
> > Co-authored-by: Han Zhou 
> > Co-authored-by: Numan Siddique 
> > Co-authored-by: Dumitru Ce

Re: [ovs-dev] [PATCH ovn] [RFC] Add chassis liveness monitoring mechanism

2020-02-20 Thread Dumitru Ceara
On 2/19/20 4:37 PM, lmart...@redhat.com wrote:
> From: Lucas Alvares Gomes 
> 
> NOTE: SENDING THIS PATCH AS A RFC TO SEE WHAT OTHERS MIGHT THINK ABOUT
> THE IDEA PRIOR TO WRITTING TESTS TO IT.
> 
> CMSes integrating with OVN often uses the nb_cfg mechanism as a way to
> check the health status of the ovn-controller process but, the current
> implementation isn't ideal for that purpose because it floods the control
> plane with update notifications every time the nb_cfg value is
> incremented.
> 
> This patch is merging two ideas:
> 
> 1) Han Zhou proposed a patch [0] creating a table called Chassis_Private
>where each hypervisor *only* writes and monitors its own record to
>avoid this flooding problem.
> 
> 2) Having ovn-controller to periodically log that it's alive instead of
>relying on the nb_cfg mechanism.
> 
> By using this mechanism, a CMS can more easily just read this new
> Chassis_Private table and figure out the status of each Chassis
> (ovn-controller) in the cluster.
> 

Hi Lucas,

Thanks a lot for working on this!

> Here's some reasons why I believe this approach is better than having
> to bump the  nb_cfg value:
> 
> 1) Simple integration. Before, the CMS had to increment the nb_cfg
>value in the NB_Global table and wait for it to be propagated to the
>Southbound database (ovn-northd and then ovn-controllers) Chassis
>table in order to know the status of the Chassis.
> 
>Now, it can just read the Chassis_Private table and compare the
>alive_at column against the current time.

I guess the only comment I have about this approach is that it doesn't
feel completely OK to store a timestamp representation as string in
"alive_at". I'm afraid this might be too inflexible and force CMSes to
compare their local time with the value in this column.

I know we discussed offline that using a counter that is periodically
incremented by ovn-controller instead of a timestamp would complicate
the implementation in Openstack but maybe other people on this mailing
list have ideas on how to deal with this in a more generic way.

> 
> 2) Less costy. Just one read from the db is needed, no writing. Code
>using the nb_cfg mechanism had to implement a few safe-guard code to
>make less error prone. See [1] and [2] for example.
> 
> 3) Backwards compatibility. The patch [0] was moving the nb_cfg value
>to new table so, systems relying on it would need to update their
>code upon updating OVN.

Nice!

One more minor comment on the code, inline.

Thanks,
Dumitru

> 
> To enable this new mechanism set an option called
> chassis_liveness_interval=N to the "options" column in the NB_Global
> table, where N is the time (in seconds) in which the ovn-controller
> should updates the alive_at column. If not set or set to 0, the
> mechanism is disabled. By default, it's disabled.
> 
> [0] https://patchwork.ozlabs.org/patch/899608/
> [1] https://review.opendev.org/#/c/703612/
> [2] https://review.opendev.org/#/c/704530/
> 
> Co-authored-by: Han Zhou 
> Co-authored-by: Numan Siddique 
> Co-authored-by: Dumitru Ceara 
> Signed-off-by: Lucas Alvares Gomes 
> ---
>  controller/chassis.c| 56 +++--
>  controller/chassis.h| 11 ++--
>  controller/ovn-controller.c | 44 +++--
>  lib/chassis-index.c | 26 +
>  lib/chassis-index.h |  5 
>  northd/ovn-northd.c |  4 +++
>  ovn-sb.ovsschema| 13 +++--
>  ovn-sb.xml  | 35 +++
>  8 files changed, 185 insertions(+), 9 deletions(-)
> 
> diff --git a/controller/chassis.c b/controller/chassis.c
> index 522893ead..7892cb966 100644
> --- a/controller/chassis.c
> +++ b/controller/chassis.c
> @@ -24,6 +24,7 @@
>  #include "openvswitch/dynamic-string.h"
>  #include "openvswitch/vlog.h"
>  #include "openvswitch/ofp-parse.h"
> +#include "openvswitch/poll-loop.h"
>  #include "lib/chassis-index.h"
>  #include "lib/ovn-sb-idl.h"
>  #include "ovn-controller.h"
> @@ -47,11 +48,16 @@ struct chassis_info {
>  
>  /* True if Chassis SB record is initialized, false otherwise. */
>  uint32_t id_inited : 1;
> +
> +/* Next livennes update time to update the 'alive_at' column of
> + * Chassis_Private table. */
> +long long int next_liveness_update;
>  };
>  
>  static struct chassis_info chassis_state = {
>  .id = DS_EMPTY_INITIALIZER,
>  .id_inited = false,
> +.next_liveness_update = LLONG_MAX,
>  };
>  
>  static void
> @@ -581,18 +587,38 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
>  free(encaps);
>  }
>  
> +static void
> +chassis_update_liveness_cfg(const struct sbrec_chassis_private *chassis_rec,
> +struct chassis_info *ch_info,
> +int chassis_liveness_interval)
> +{
> +if (ch_info->next_liveness_update == LLONG_MAX ||
> +time_msec() > ch_info->next_liveness_update) {
> +sb