Re: [ovs-discuss] TCP TLV option population by OVS?
Thanks, Ben, for replying.

It seems this would be worth having, since a user may need to change or add a TCP option. For example, in load-balancer FullNAT mode: if we want to use OVS to implement the LB FullNAT function, we could add the client IP address in a TCP option.

Thanks,
Daniel.

> On Nov 1, 2018, at 1:37 AM, Ben Pfaff wrote:
>
> On Wed, Oct 31, 2018 at 01:23:15PM +0800, benli ye wrote:
>> Does anyone know if OVS supports adding a TLV option to the TCP header now?
>
> No, it doesn't.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
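To make the FullNAT idea concrete, here is a minimal sketch of what such a TLV option could look like on the wire. It is not OVS code (as Ben notes, OVS has no such action). The function name build_client_ip_option, the choice of experimental option kind 253, and the layout (kind, length, 4-byte IPv4 address, then NOP padding) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 253 is one of the TCP option kinds reserved for experiments.
 * A real deployment would use whatever kind its load balancer and
 * backends agree on. */
#define TCPOPT_CLIENT_IP_KIND 253
#define TCPOPT_NOP            1

/* Hypothetical helper: writes kind | length | IPv4 address (network byte
 * order) into buf, then pads with NOPs to a multiple of 4 bytes because
 * the TCP header length is counted in 32-bit words.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t build_client_ip_option(const uint8_t client_ip[4],
                              uint8_t *buf, size_t buflen)
{
    const size_t tlv_len = 2 + 4;   /* kind + length + value */
    const size_t padded_len = (tlv_len + 3) & ~(size_t)3;

    assert(client_ip != NULL && buf != NULL);
    if (buflen < padded_len) {
        return 0;
    }

    buf[0] = TCPOPT_CLIENT_IP_KIND;
    buf[1] = (uint8_t)tlv_len;      /* length covers the kind and length bytes */
    memcpy(&buf[2], client_ip, 4);
    memset(&buf[tlv_len], TCPOPT_NOP, padded_len - tlv_len);
    return padded_len;
}
```

For the address 192.168.42.1 this produces eight bytes: the 6-byte option followed by two NOPs. A backend would still need its own code to parse the option, which is outside this sketch.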
Re: [ovs-discuss] OVN SB DB server overload when restarted at large scale environment
On Tue, Oct 30, 2018 at 11:51:05PM -0700, Han Zhou wrote:
> On Tue, Oct 30, 2018 at 11:15 AM Ben Pfaff wrote:
> >
> > On Wed, Oct 24, 2018 at 05:42:15PM -0700, Han Zhou wrote:
> > > On Tue, Sep 25, 2018 at 10:18 AM Han Zhou wrote:
> > > >
> > > > On Thu, Sep 20, 2018 at 4:43 PM Ben Pfaff wrote:
> > > > >
> > > > > On Thu, Sep 13, 2018 at 12:28:27PM -0700, Han Zhou wrote:
> > > > > > In scalability tests with ovn-scale-test, ovsdb-server SB load is
> > > > > > not a problem, at least with 1k HVs. However, if we restart the
> > > > > > ovsdb-server, then depending on the number of HVs and the scale of
> > > > > > logical objects (e.g. the number of logical ports), the SB
> > > > > > ovsdb-server becomes an obvious bottleneck.
> > > > > >
> > > > > > In our test with 1k HVs and 20k logical ports (200 lports * 100
> > > > > > lswitches connected by one single logical router), restarting the
> > > > > > SB ovsdb-server resulted in 100% CPU for ovsdb-server for more
> > > > > > than 1 hour. All HVs (and northd) are reconnecting and resyncing
> > > > > > the large amount of data at the same time. Considering the amount
> > > > > > of data and the JSON-RPC cost, this is not surprising.
> > > > > >
> > > > > > At this scale, the SB ovsdb-server process has RES 303848KB before
> > > > > > restart. It is likely that a big proportion of this size is SB DB
> > > > > > data that is going to be transferred to all 1,001 clients, which
> > > > > > is about 300GB in total. With a 10Gbps NIC, even the pure network
> > > > > > transmission would take ~5 minutes. Considering that the actual
> > > > > > size of the JSON-RPC messages would be much bigger than the raw
> > > > > > data, and the processing cost of the single-threaded ovsdb-server,
> > > > > > 1 hour is reasonable.
> > > > > >
> > > > > > In addition to the CPU cost of ovsdb-server, the memory
> > > > > > consumption could also be a problem. Since all clients are syncing
> > > > > > data from it, probably due to buffering, RES increases quickly and
> > > > > > spiked to 10G at some point. After all the syncing finished, RES
> > > > > > went back to a size similar to that before the restart. The client
> > > > > > side (ovn-controller, northd) also saw a memory spike: the new
> > > > > > snapshot of the whole DB arrives as one huge JSON-RPC message, so
> > > > > > it is buffered until the whole message is received. RES peaked at
> > > > > > double its original size, and then went back to the original size
> > > > > > after the first round of processing of the new snapshot. This
> > > > > > means that when deploying OVN, this memory spike should be
> > > > > > considered for the SB DB restart scenario, especially on the
> > > > > > central node.
> > > > > >
> > > > > > Here is some of my brainstorming on how we could improve on this
> > > > > > (very rough ideas at this stage). There are two directions:
> > > > > > 1) reducing the size of data to be transferred;
> > > > > > 2) scaling out ovsdb-server.
> > > > > >
> > > > > > 1) Reducing the size of data to be transferred.
> > > > > >
> > > > > > 1.1) Using BSON instead of JSON. It could reduce the size of data,
> > > > > > but I am not sure yet how much it would help since most of the
> > > > > > data are strings. It might even be worse, since the bottleneck is
> > > > > > not yet the network bandwidth but the processing power of
> > > > > > ovsdb-server.
> > > > > >
> > > > > > 1.2) Move northd processing to HVs - only relevant NB data needs
> > > > > > to be transferred, which is much smaller than the SB DB because
> > > > > > there are no logical flows. However, this would lead to more
> > > > > > processing load on ovn-controller on HVs. Also, it is a huge
> > > > > > architecture change.
> > > > > >
> > > > > > 1.3) Incremental data transfer. The way IDL works is like a cache.
> > > > > > Now, when the connection is reset, the cache has to be rebuilt.
> > > > > > But if we know the version of the current snapshot, then even when
> > > > > > the connection is reset, the client can still communicate with the
> > > > > > newly started server to work out the difference between the
> > > > > > current data and the new data, so that only the delta is
> > > > > > transferred, as if the server had not been restarted at all.
> > > > > >
> > > > > > 2) Scaling out the ovsdb-server.
> > > > > >
> > > > > > 2.1) Currently ovsdb-server is single threaded, so that single
> > > > > > thread has to take care of transmission to all clients with 100%
> > > > > > CPU. If it were multi-threaded, more cores could be utilized to
> > > > > > make this much faster.
> > > > > >
> > > > > > 2.2) Using ovsdb cluster. This feature is supported already but I
> > > > > > haven't tested it in this scenario yet. If everything works as
> > > > > > expected, there can be 3 - 5 servers sharing the load, so the
> > > > > > transfer should be completed 3 - 5 times faster than it is right
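The transfer estimate in the quoted message can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it assumes that the RES figure is in KiB and that the whole resident set is sent to each of the 1,001 clients with no JSON-RPC overhead, which the message itself says is optimistic.

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes sent if every client downloads a full copy of the data.
 * res_kib is the resident set size in KiB (assumed unit). */
uint64_t total_transfer_bytes(uint64_t res_kib, uint64_t n_clients)
{
    return res_kib * 1024u * n_clients;
}

/* Seconds needed to push `bytes` through a link of `gbps` gigabits per
 * second, ignoring all protocol and processing overhead. */
double wire_time_seconds(uint64_t bytes, double gbps)
{
    assert(gbps > 0.0);
    return (double)bytes * 8.0 / (gbps * 1e9);
}
```

With res_kib = 303848, n_clients = 1001 and a 10 Gbps link this gives roughly 311 GB and about 249 seconds, a little over 4 minutes, which is the same order as the ~300GB and ~5 minutes quoted above.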
Re: [ovs-discuss] OVS local port overloaded
On Tue, Oct 30, 2018 at 10:29:37AM +0700, Soe Ye Htet wrote:
> Dear OvS Team,
>
> I have a problem with Open vSwitch. Let me describe my simple test
> topology: OVS1(RYU)---OVS2. Instead of using OVS in-band mode, I
> configured my own predefined rules on OvS1 and OvS2 to implement an
> in-band scenario for my work. The RYU controller can connect
> successfully to OvS1 and OvS2. An iperf3 connection is then established
> between OvS1 and OvS2; OvS1 is a server and OvS2 is a receiver. After
> some time, the iperf3 connection breaks and the local port on OVS2
> cannot transmit packets.

See if there is any relevant message in ovs-vswitchd.log or the journal.

--
Flavio
Re: [ovs-discuss] [ovs-dev] Geneve remote_ip as flow for OVN hosts
Honestly the best thing to do is probably to propose a design or, if
it's simple enough, to send a patch. That will probably be more
effective at sparking a discussion.

On Wed, Oct 31, 2018 at 03:33:48PM +, venugopal iyer wrote:
> Hi:
>
> Just wanted to check if folks had any thoughts on the use case Girish
> outlined below. We do have a real use case for this and are interested
> in looking at options for supporting more than one VTEP IP. It is
> currently a limitation for us, and we wanted to know if there are
> similar use cases folks are looking at or interested in addressing.
>
> thanks,
> -venu
>
> On Thursday, September 6, 2018, 9:19:01 AM PDT, venugopal iyer via dev wrote:
>
> Would it be possible for the association to be made when the logical
> port is instantiated on a node, and relayed on to the SB by the
> controller, e.g. assuming a mechanism to specify/determine a physical
> port mapping for a logical port for a VM? The mappings can be specified
> as configuration on the chassis. In the absence of physical port
> information for a logical port/VM, I suppose we could default to an
> encap-ip.
>
> just a thought,
> -venu
>
> On Wednesday, September 5, 2018, 2:03:35 PM PDT, Ben Pfaff wrote:
>
> How would OVN know which IP to use for a given logical port on a
> chassis?
>
> I think that the "multiple tunnel encapsulations" is meant to cover,
> say, Geneve vs. STT vs. VXLAN, not the case you have in mind.
>
> On Wed, Sep 05, 2018 at 09:50:32AM -0700, Girish Moodalbail wrote:
> > Hello all,
> >
> > I would like to add more context here. In the diagram below
> >
> > +---------------------------------+
> > | ovn-host                        |
> > |                                 |
> > |   +-------------------------+   |
> > |   |         br-int          |   |
> > |   +----+-------------+------+   |
> > |        |             |          |
> > |   +----v---+     +---v----+     |
> > |   | geneve |     | geneve |     |
> > |   +----+---+     +---+----+     |
> > |        |             |          |
> > |    +---v--+      +---v--+       |
> > |    | IP0  |      | IP1  |       |
> > |    +------+      +------+       |
> > +----+ eth0 +------+ eth1 +-------+
> >      +------+      +------+
> >
> > eth0 and eth1 are, say, each in its own physical segment. The VMs that
> > are instantiated in the above ovn-host will have multiple interfaces,
> > and each of those interfaces needs to be on a different Geneve VTEP.
> >
> > I think the following entry in OVN TODOs (
> > https://github.com/openvswitch/ovs/blob/master/ovn/TODO.rst)
> >
> > ---8<--8<---
> > Support multiple tunnel encapsulations in Chassis.
> >
> > So far, both ovn-controller and ovn-controller-vtep only allow chassis to
> > have one tunnel encapsulation entry. We should extend the implementation to
> > support multiple tunnel encapsulations
> > ---8<--8<---
> >
> > captures the above requirement. Is that the case?
> >
> > Thanks again.
> >
> > Regards,
> > ~Girish
> >
> > On Tue, Sep 4, 2018 at 3:00 PM Girish Moodalbail wrote:
> >
> > > Hello all,
> > >
> > > Is it possible to configure remote_ip as a 'flow' instead of an IP address
> > > (i.e., setting ovn-encap-ip to a single IP address)?
> > >
> > > Today, we have one VTEP endpoint per OVN host and all the VMs that
> > > connect to br-int on that OVN host are reachable behind this VTEP
> > > endpoint. Is it possible to have multiple VTEP endpoints for a br-int
> > > bridge and use OpenFlow flows to select one of the VTEP endpoints?
> > >
> > > +---------------------------------+
> > > | ovn-host                        |
> > > |                                 |
> > > |   +-------------------------+   |
> > > |   |         br-int          |   |
> > > |   +----+-------------+------+   |
> > > |        |             |          |
> > > |   +----v---+     +---v----+     |
> > > |   | geneve |     | geneve |     |
> > > |   +----+---+     +---+----+     |
> > > |        |             |          |
> > > |    +---v--+      +---v--+       |
> > > |    | IP0  |      | IP1  |       |
> > > |    +------+      +------+       |
> > > +----+ eth0 +------+ eth1 +-------+
> > >      +------+      +------+
> > >
> > > Also, we don't want to bond eth0 and eth1 into a bond interface and then
> > > use bond's IP as VTEP endpoint.
> > >
> > > Thanks in advance,
> > > ~Girish
Re: [ovs-discuss] TCP TLV option population by OVS?
On Wed, Oct 31, 2018 at 01:23:15PM +0800, benli ye wrote:
> Does anyone know if OVS supports adding a TLV option to the TCP header now?

No, it doesn't.
Re: [ovs-discuss] OvS using newer DPDK
Guys,

Any comments on the OVS upgrade to DPDK 18.08?
https://patchwork.ozlabs.org/project/openvswitch/list/?series=72606

Regards,
Ophir

> -----Original Message-----
> From: Stokes, Ian [mailto:ian.sto...@intel.com]
> Sent: Wednesday, October 31, 2018 5:52 PM
> To: Andrzej Ostruszka; ovs-discuss@openvswitch.org
> Cc: Ophir Munk
> Subject: RE: [ovs-discuss] OvS using newer DPDK
>
> > Hello all,
> >
> > I remember some time ago there was topic raised here about new LTS
> > release. I'd like to ask related question - what version of DPDK will
> > it be based on? 18.11 (which is going to be new LTS release of DPDK)?
>
> Yes, the plan would be ideally to move to DPDK 18.11.
>
> > If it is then is there anybody already working on that?
>
> Yes, the dpdk_latest branch was set up for this purpose.
>
> There are patches submitted to move OVS to use DPDK 18.08 first. From
> there a new set of patches will be created to move to DPDK 18.11. Once
> there is agreement and sign-off from the OVS DPDK community we would
> look to apply those to the OVS master branch in time for the OVS 2.11
> release.
>
> > I'm asking these questions since I've nailed the reason for getting
> > OvS crashes on Marvell Armada 8K board. They are while attempting to
> > set MTU and there are some patches affecting MTU/MRU calculations that
> > might help.
>
> Are these patches targeted at the OVS project or the DPDK project?
>
> > So basically I might attempt to backport them or try to get OvS
> > working with newer DPDK.
>
> OVS is moving towards using DPDK LTS releases only for OVS releases and
> the master branch.
>
> If the patches target DPDK then they could be backported to the relevant
> DPDK LTS releases. Once in place there you could also backport support to
> OVS 2.9 and OVS 2.10, which use DPDK 17.11.
>
> > Since I prefer the latter I would like to join somebody doing this
> > update (I don't feel comfortable enough with OvS to do that on my
> > own).
>
> OK, sure. There is not yet a patch to move OVS to DPDK 18.11; that's in
> progress. I've cc'd Ophir, who has been looking at this to date. Once
> there is a patch for 18.11, it would be a great help if you could test
> it with the Marvell device.
>
> Thanks
> Ian
>
> > Best regards
> > Andrzej
Re: [ovs-discuss] [ovs-dev] Geneve remote_ip as flow for OVN hosts
Hi:

Just wanted to check if folks had any thoughts on the use case Girish
outlined below. We do have a real use case for this and are interested
in looking at options for supporting more than one VTEP IP. It is
currently a limitation for us, and we wanted to know if there are
similar use cases folks are looking at or interested in addressing.

thanks,
-venu

On Thursday, September 6, 2018, 9:19:01 AM PDT, venugopal iyer via dev wrote:

Would it be possible for the association to be made when the logical
port is instantiated on a node, and relayed on to the SB by the
controller, e.g. assuming a mechanism to specify/determine a physical
port mapping for a logical port for a VM? The mappings can be specified
as configuration on the chassis. In the absence of physical port
information for a logical port/VM, I suppose we could default to an
encap-ip.

just a thought,
-venu

On Wednesday, September 5, 2018, 2:03:35 PM PDT, Ben Pfaff wrote:

How would OVN know which IP to use for a given logical port on a
chassis?

I think that the "multiple tunnel encapsulations" is meant to cover,
say, Geneve vs. STT vs. VXLAN, not the case you have in mind.

On Wed, Sep 05, 2018 at 09:50:32AM -0700, Girish Moodalbail wrote:
> Hello all,
>
> I would like to add more context here. In the diagram below
>
> +---------------------------------+
> | ovn-host                        |
> |                                 |
> |   +-------------------------+   |
> |   |         br-int          |   |
> |   +----+-------------+------+   |
> |        |             |          |
> |   +----v---+     +---v----+     |
> |   | geneve |     | geneve |     |
> |   +----+---+     +---+----+     |
> |        |             |          |
> |    +---v--+      +---v--+       |
> |    | IP0  |      | IP1  |       |
> |    +------+      +------+       |
> +----+ eth0 +------+ eth1 +-------+
>      +------+      +------+
>
> eth0 and eth1 are, say, each in its own physical segment. The VMs that
> are instantiated in the above ovn-host will have multiple interfaces,
> and each of those interfaces needs to be on a different Geneve VTEP.
>
> I think the following entry in OVN TODOs (
> https://github.com/openvswitch/ovs/blob/master/ovn/TODO.rst)
>
> ---8<--8<---
> Support multiple tunnel encapsulations in Chassis.
>
> So far, both ovn-controller and ovn-controller-vtep only allow chassis to
> have one tunnel encapsulation entry. We should extend the implementation to
> support multiple tunnel encapsulations
> ---8<--8<---
>
> captures the above requirement. Is that the case?
>
> Thanks again.
>
> Regards,
> ~Girish
>
> On Tue, Sep 4, 2018 at 3:00 PM Girish Moodalbail wrote:
>
> > Hello all,
> >
> > Is it possible to configure remote_ip as a 'flow' instead of an IP address
> > (i.e., setting ovn-encap-ip to a single IP address)?
> >
> > Today, we have one VTEP endpoint per OVN host and all the VMs that
> > connect to br-int on that OVN host are reachable behind this VTEP
> > endpoint. Is it possible to have multiple VTEP endpoints for a br-int
> > bridge and use OpenFlow flows to select one of the VTEP endpoints?
> >
> > +---------------------------------+
> > | ovn-host                        |
> > |                                 |
> > |   +-------------------------+   |
> > |   |         br-int          |   |
> > |   +----+-------------+------+   |
> > |        |             |          |
> > |   +----v---+     +---v----+     |
> > |   | geneve |     | geneve |     |
> > |   +----+---+     +---+----+     |
> > |        |             |          |
> > |    +---v--+      +---v--+       |
> > |    | IP0  |      | IP1  |       |
> > |    +------+      +------+       |
> > +----+ eth0 +------+ eth1 +-------+
> >      +------+      +------+
> >
> > Also, we don't want to bond eth0 and eth1 into a bond interface and then
> > use bond's IP as VTEP endpoint.
> >
> > Thanks in advance,
> > ~Girish
Re: [ovs-discuss] OvS using newer DPDK
> Hello all,
>
> I remember some time ago there was topic raised here about new LTS
> release. I'd like to ask related question - what version of DPDK will it
> be based on? 18.11 (which is going to be new LTS release of DPDK)?

Yes, the plan would be ideally to move to DPDK 18.11.

> If it is then is there anybody already working on that?

Yes, the dpdk_latest branch was set up for this purpose.

There are patches submitted to move OVS to use DPDK 18.08 first. From there a new set of patches will be created to move to DPDK 18.11. Once there is agreement and sign-off from the OVS DPDK community we would look to apply those to the OVS master branch in time for the OVS 2.11 release.

> I'm asking these questions since I've nailed the reason for getting OvS
> crashes on Marvell Armada 8K board. They are while attempting to set MTU
> and there are some patches affecting MTU/MRU calculations that might help.

Are these patches targeted at the OVS project or the DPDK project?

> So basically I might attempt to backport them or try to get OvS working
> with newer DPDK.

OVS is moving towards using DPDK LTS releases only for OVS releases and the master branch.

If the patches target DPDK then they could be backported to the relevant DPDK LTS releases. Once in place there you could also backport support to OVS 2.9 and OVS 2.10, which use DPDK 17.11.

> Since I prefer the latter I would like to join somebody
> doing this update (I don't feel comfortable enough with OvS to do that on
> my own).

OK, sure. There is not yet a patch to move OVS to DPDK 18.11; that's in progress. I've cc'd Ophir, who has been looking at this to date. Once there is a patch for 18.11, it would be a great help if you could test it with the Marvell device.

Thanks
Ian

> Best regards
> Andrzej
[ovs-discuss] OvS using newer DPDK
Hello all,

I remember that some time ago a topic was raised here about a new LTS release. I'd like to ask a related question: what version of DPDK will it be based on? 18.11 (which is going to be the new LTS release of DPDK)? If so, is anybody already working on that?

I'm asking because I've nailed down the reason for the OvS crashes on the Marvell Armada 8K board. They happen while attempting to set the MTU, and there are some patches affecting MTU/MRU calculations that might help. So basically I might attempt to backport them or try to get OvS working with a newer DPDK. Since I prefer the latter, I would like to join somebody doing this update (I don't feel comfortable enough with OvS to do that on my own).

Best regards
Andrzej
Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
On 10/31/2018 5:40 AM, Jaime Caamaño Ruiz wrote:

Greg, I submitted this patch [1], let me know if anything looks bad.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353410.html

I'll have a look and comment there.

Thanks!

Thanks
Jaime.

-----Original Message-----
From: Jaime Caamaño Ruiz
Reply-to: jcaam...@suse.com
To: Gregory Rose, ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Date: Wed, 31 Oct 2018 12:07:59 +0100

Let me give it a try.

Aside from the fix on master, who takes care of mapping the fix to bugfix releases?

BR
Jaime.

-----Original Message-----
From: Gregory Rose
To: ovs-discuss@openvswitch.org, jcaam...@suse.de
Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Date: Tue, 30 Oct 2018 14:42:15 -0700

On 10/29/2018 3:38 AM, Jaime Caamaño Ruiz wrote:

Hey Greg. Thanks for helping out. I did build OVS with the fix and it got my problem sorted without causing any additional ones on my environment. Let me know if I can help with anything else.

BR
Jaime.

Jaime, you seem to have identified a bug!

Using printks with a simple rule to just decap and then encap an Ethernet header we see this with the code as it is right now:

[13568.973807] __ovs_nla_copy_actions:3007 <- decap
[13568.973812] __ovs_nla_copy_actions:3012 <- decap succeeds but sets mac_proto = MAC_PROTO_ETHERNET
[13568.973815] __ovs_nla_copy_actions:2999 <- encap
[13568.973818] openvswitch: netlink: Flow actions may not be safe on all matching packets. <- returns -EINVAL

Note that the decap happens at lines 3007-3012 and is successful. However, the very next encap action starting at line 2999 does not finish and returns -EINVAL, so a printk at line 3002 does not execute.

If I change the code as you suggested the flow of decap/encap works without complaint and without returning -EINVAL:

[13838.435051] __ovs_nla_copy_actions:3007 <- decap
[13838.435054] __ovs_nla_copy_actions:3012 <- decap succeeds and sets mac_proto = MAC_PROTO_NONE
[13838.435055] __ovs_nla_copy_actions:2999 <- encap
[13838.435056] __ovs_nla_copy_actions:3002 <- encap succeeds and sets mac_proto = MAC_PROTO_ETHERNET

Thank you for finding this bug. Do you wish to send the patch to fix it or would you prefer me to do it?

Regards,

- Greg

-----Original Message-----
From: Gregory Rose
To: ovs-discuss@openvswitch.org, jcaam...@suse.de
Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Date: Fri, 26 Oct 2018 15:42:51 -0700

On 10/19/2018 1:39 AM, Jaime Caamaño Ruiz wrote:

Hello

When using nsh encapsulation, it's useful to normalize your pipeline to packet_type=nsh, popping an ethernet header on input if necessary and pushing an ethernet header again if required before output.

But it seems to be problematic:

---
2018-10-18T13:10:59.196Z|00010|dpif(handler3)|WARN|system@ovs-system: execute pop_eth,push_eth(src=fe:16:3e:c1:9e:87,dst=fa:16:3e:c1:9e:87),5 failed (Invalid argument) on packet
vlan_tci=0x,dl_src=fa:16:3e:c2:e6:68,dl_dst=fe:16:3e:c2:e6:68,dl_type=0x894f,nsh_flags=0,nsh_ttl=63,nsh_mdtype=1,nsh_np=3,nsh_spi=0x1a,nsh_si=254,nsh_c1=0xc0a82a01,nsh_c2=0x3,nsh_c3=0x0,nsh_c4=0x9100,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
 with metadata skb_priority(0),tunnel(tun_id=0x0,src=192.168.42.1,dst=192.168.42.3,ttl=64,tp_src=47656,tp_dst=4789,flags(key)),skb_mark(0),in_port(4) mtu 0
---

Looking at the code datapath/flow_netlink.c @ __ovs_nla_copy_actions:

        case OVS_ACTION_ATTR_PUSH_ETH:
                /* Disallow pushing an Ethernet header if one
                 * is already present */
                if (mac_proto != MAC_PROTO_NONE)
                        return -EINVAL;
                mac_proto = MAC_PROTO_NONE;
                break;

        case OVS_ACTION_ATTR_POP_ETH:
                if (mac_proto != MAC_PROTO_ETHERNET)
                        return -EINVAL;
                if (vlan_tci & htons(VLAN_TAG_PRESENT))
                        return -EINVAL;
                mac_proto = MAC_PROTO_ETHERNET;
                break;

Isn't the mac_proto set inverted here, shouldn't it look like this?

        case OVS_ACTION_ATTR_PUSH_ETH:
                /* Disallow pushing an Ethernet header if one
                 * is already present */
                if (mac_proto != MAC_PROTO_NONE)
                        return -EINVAL;
                mac_proto = MAC_PROTO_ETHERNET;
                break;

        case OVS_ACTION_ATTR_POP_ETH:
                if (mac_proto != MAC_PROTO_ETHERNET)
                        return -EINVAL;
                if (vlan_tci & htons(VLAN_TAG_PRESENT))
                        return -EINVAL;
Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
On 10/31/2018 4:07 AM, Jaime Caamaño Ruiz wrote:

Let me give it a try.

Aside from the fix on master, who takes care of mapping the fix to bugfix releases?

Jaime,

The maintainers will take care of backporting to previous release branches where the fix is appropriate.

Thanks,

- Greg

BR
Jaime.

-----Original Message-----
From: Gregory Rose
To: ovs-discuss@openvswitch.org, jcaam...@suse.de
Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Date: Tue, 30 Oct 2018 14:42:15 -0700

On 10/29/2018 3:38 AM, Jaime Caamaño Ruiz wrote:

Hey Greg. Thanks for helping out. I did build OVS with the fix and it got my problem sorted without causing any additional ones on my environment. Let me know if I can help with anything else.

BR
Jaime.

Jaime, you seem to have identified a bug!

Using printks with a simple rule to just decap and then encap an Ethernet header we see this with the code as it is right now:

[13568.973807] __ovs_nla_copy_actions:3007 <- decap
[13568.973812] __ovs_nla_copy_actions:3012 <- decap succeeds but sets mac_proto = MAC_PROTO_ETHERNET
[13568.973815] __ovs_nla_copy_actions:2999 <- encap
[13568.973818] openvswitch: netlink: Flow actions may not be safe on all matching packets. <- returns -EINVAL

Note that the decap happens at lines 3007-3012 and is successful. However, the very next encap action starting at line 2999 does not finish and returns -EINVAL, so a printk at line 3002 does not execute.

If I change the code as you suggested the flow of decap/encap works without complaint and without returning -EINVAL:

[13838.435051] __ovs_nla_copy_actions:3007 <- decap
[13838.435054] __ovs_nla_copy_actions:3012 <- decap succeeds and sets mac_proto = MAC_PROTO_NONE
[13838.435055] __ovs_nla_copy_actions:2999 <- encap
[13838.435056] __ovs_nla_copy_actions:3002 <- encap succeeds and sets mac_proto = MAC_PROTO_ETHERNET

Thank you for finding this bug. Do you wish to send the patch to fix it or would you prefer me to do it?

Regards,

- Greg

-----Original Message-----
From: Gregory Rose
To: ovs-discuss@openvswitch.org, jcaam...@suse.de
Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Date: Fri, 26 Oct 2018 15:42:51 -0700

On 10/19/2018 1:39 AM, Jaime Caamaño Ruiz wrote:

Hello

When using nsh encapsulation, it's useful to normalize your pipeline to packet_type=nsh, popping an ethernet header on input if necessary and pushing an ethernet header again if required before output.

But it seems to be problematic:

---
2018-10-18T13:10:59.196Z|00010|dpif(handler3)|WARN|system@ovs-system: execute pop_eth,push_eth(src=fe:16:3e:c1:9e:87,dst=fa:16:3e:c1:9e:87),5 failed (Invalid argument) on packet
vlan_tci=0x,dl_src=fa:16:3e:c2:e6:68,dl_dst=fe:16:3e:c2:e6:68,dl_type=0x894f,nsh_flags=0,nsh_ttl=63,nsh_mdtype=1,nsh_np=3,nsh_spi=0x1a,nsh_si=254,nsh_c1=0xc0a82a01,nsh_c2=0x3,nsh_c3=0x0,nsh_c4=0x9100,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
 with metadata skb_priority(0),tunnel(tun_id=0x0,src=192.168.42.1,dst=192.168.42.3,ttl=64,tp_src=47656,tp_dst=4789,flags(key)),skb_mark(0),in_port(4) mtu 0
---

Looking at the code datapath/flow_netlink.c @ __ovs_nla_copy_actions:

        case OVS_ACTION_ATTR_PUSH_ETH:
                /* Disallow pushing an Ethernet header if one
                 * is already present */
                if (mac_proto != MAC_PROTO_NONE)
                        return -EINVAL;
                mac_proto = MAC_PROTO_NONE;
                break;

        case OVS_ACTION_ATTR_POP_ETH:
                if (mac_proto != MAC_PROTO_ETHERNET)
                        return -EINVAL;
                if (vlan_tci & htons(VLAN_TAG_PRESENT))
                        return -EINVAL;
                mac_proto = MAC_PROTO_ETHERNET;
                break;

Isn't the mac_proto set inverted here, shouldn't it look like this?

        case OVS_ACTION_ATTR_PUSH_ETH:
                /* Disallow pushing an Ethernet header if one
                 * is already present */
                if (mac_proto != MAC_PROTO_NONE)
                        return -EINVAL;
                mac_proto = MAC_PROTO_ETHERNET;
                break;

        case OVS_ACTION_ATTR_POP_ETH:
                if (mac_proto != MAC_PROTO_ETHERNET)
                        return -EINVAL;
                if (vlan_tci & htons(VLAN_TAG_PRESENT))
                        return -EINVAL;
                mac_proto = MAC_PROTO_NONE;
                break;

Jaime,

I am looking into this and at first sight this does look inverted, but we have no other reported bugs in this area so I want to be careful that we don't break anything else while fixing this.

Have you tried building OVS with
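The bookkeeping discussed in this thread can be modelled outside the kernel. The following is a small userspace sketch, not the datapath code: the enum and function names are invented for illustration, the VLAN check is left out, and only the mac_proto tracking that the proposed fix changes is kept. It walks an action list in the way the thread describes for __ovs_nla_copy_actions and shows that pop_eth followed by push_eth validates once each action records the state it actually leaves behind.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's constants. */
enum mac_proto { MAC_PROTO_NONE, MAC_PROTO_ETHERNET };
enum eth_action { ACT_PUSH_ETH, ACT_POP_ETH };

/* Validates a sequence of push/pop actions, starting from the packet's
 * initial `mac_proto`. Returns 0 if the whole sequence is acceptable,
 * -EINVAL otherwise. */
int validate_eth_actions(const enum eth_action *acts, size_t n,
                         enum mac_proto mac_proto)
{
    assert(acts != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        switch (acts[i]) {
        case ACT_PUSH_ETH:
            /* Disallow pushing an Ethernet header if one is already present. */
            if (mac_proto != MAC_PROTO_NONE)
                return -EINVAL;
            mac_proto = MAC_PROTO_ETHERNET;  /* the packet now has a header */
            break;
        case ACT_POP_ETH:
            if (mac_proto != MAC_PROTO_ETHERNET)
                return -EINVAL;
            mac_proto = MAC_PROTO_NONE;      /* the header is gone */
            break;
        }
    }
    return 0;
}
```

Starting from MAC_PROTO_ETHERNET, the sequence pop, push is accepted. With the two assignments swapped, as in the code quoted from flow_netlink.c, the push would see MAC_PROTO_ETHERNET and fail, which matches the -EINVAL in Greg's first printk trace.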
Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Greg, I submitted this patch [1], let me know if anything looks bad.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353410.html

Thanks
Jaime.

-----Original Message-----
From: Jaime Caamaño Ruiz
Reply-to: jcaam...@suse.com
To: Gregory Rose, ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Date: Wed, 31 Oct 2018 12:07:59 +0100

Let me give it a try.

Aside from the fix on master, who takes care of mapping the fix to bugfix releases?

BR
Jaime.

-----Original Message-----
From: Gregory Rose
To: ovs-discuss@openvswitch.org, jcaam...@suse.de
Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Date: Tue, 30 Oct 2018 14:42:15 -0700

On 10/29/2018 3:38 AM, Jaime Caamaño Ruiz wrote:
> Hey Greg. Thanks for helping out. I did build OVS with the fix and it
> got my problem sorted without causing any additional ones on my
> environment. Let me know if I can help with anything else.
>
> BR
> Jaime.

Jaime, you seem to have identified a bug!

Using printks with a simple rule to just decap and then encap an Ethernet header we see this with the code as it is right now:

[13568.973807] __ovs_nla_copy_actions:3007 <- decap
[13568.973812] __ovs_nla_copy_actions:3012 <- decap succeeds but sets mac_proto = MAC_PROTO_ETHERNET
[13568.973815] __ovs_nla_copy_actions:2999 <- encap
[13568.973818] openvswitch: netlink: Flow actions may not be safe on all matching packets. <- returns -EINVAL

Note that the decap happens at lines 3007-3012 and is successful. However, the very next encap action starting at line 2999 does not finish and returns -EINVAL, so a printk at line 3002 does not execute.

If I change the code as you suggested the flow of decap/encap works without complaint and without returning -EINVAL:

[13838.435051] __ovs_nla_copy_actions:3007 <- decap
[13838.435054] __ovs_nla_copy_actions:3012 <- decap succeeds and sets mac_proto = MAC_PROTO_NONE
[13838.435055] __ovs_nla_copy_actions:2999 <- encap
[13838.435056] __ovs_nla_copy_actions:3002 <- encap succeeds and sets mac_proto = MAC_PROTO_ETHERNET

Thank you for finding this bug. Do you wish to send the patch to fix it or would you prefer me to do it?

Regards,

- Greg

> -----Original Message-----
> From: Gregory Rose
> To: ovs-discuss@openvswitch.org, jcaam...@suse.de
> Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
> Date: Fri, 26 Oct 2018 15:42:51 -0700
>
> On 10/19/2018 1:39 AM, Jaime Caamaño Ruiz wrote:
> > Hello
> >
> > When using nsh encapsulation, it's useful to normalize your pipeline
> > to packet_type=nsh, popping an ethernet header on input if necessary
> > and pushing an ethernet header again if required before output.
> >
> > But it seems to be problematic:
> >
> > ---
> > 2018-10-18T13:10:59.196Z|00010|dpif(handler3)|WARN|system@ovs-system: execute pop_eth,push_eth(src=fe:16:3e:c1:9e:87,dst=fa:16:3e:c1:9e:87),5 failed (Invalid argument) on packet
> > vlan_tci=0x,dl_src=fa:16:3e:c2:e6:68,dl_dst=fe:16:3e:c2:e6:68,dl_type=0x894f,nsh_flags=0,nsh_ttl=63,nsh_mdtype=1,nsh_np=3,nsh_spi=0x1a,nsh_si=254,nsh_c1=0xc0a82a01,nsh_c2=0x3,nsh_c3=0x0,nsh_c4=0x9100,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
> >  with metadata skb_priority(0),tunnel(tun_id=0x0,src=192.168.42.1,dst=192.168.42.3,ttl=64,tp_src=47656,tp_dst=4789,flags(key)),skb_mark(0),in_port(4) mtu 0
> > ---
> >
> > Looking at the code datapath/flow_netlink.c @ __ovs_nla_copy_actions:
> >
> >         case OVS_ACTION_ATTR_PUSH_ETH:
> >                 /* Disallow pushing an Ethernet header if one
> >                  * is already present */
> >                 if (mac_proto != MAC_PROTO_NONE)
> >                         return -EINVAL;
> >                 mac_proto = MAC_PROTO_NONE;
> >                 break;
> >
> >         case OVS_ACTION_ATTR_POP_ETH:
> >                 if (mac_proto != MAC_PROTO_ETHERNET)
> >                         return -EINVAL;
> >                 if (vlan_tci & htons(VLAN_TAG_PRESENT))
> >                         return -EINVAL;
> >                 mac_proto = MAC_PROTO_ETHERNET;
> >                 break;
> >
> > Isn't the mac_proto set inverted here, shouldn't it look like this?
> >
> >         case OVS_ACTION_ATTR_PUSH_ETH:
> >                 /* Disallow pushing an Ethernet header if one
> >                  * is already present */
> >                 if (mac_proto != MAC_PROTO_NONE)
> >                         return -EINVAL;
> >                 mac_proto = MAC_PROTO_ETHERNET;
> >                 break;
> >
> >         case OVS_ACTION_ATTR_POP_ETH:
> >                 if (mac_proto
Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Let me give it a try. Aside from the fix on master, who takes care of mapping the fix to bugfix releases?

BR
Jaime.

-Original Message-
From: Gregory Rose
To: ovs-discuss@openvswitch.org, jcaam...@suse.de
Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
Date: Tue, 30 Oct 2018 14:42:15 -0700

On 10/29/2018 3:38 AM, Jaime Caamaño Ruiz wrote:
> Hey Greg. Thanks for helping out. I did build OVS with the fix and it
> got my problem sorted without causing any additional ones on my
> environment. Let me know if I can help with anything else.
>
> BR
> Jaime.

Jaime, you seem to have identified a bug! Using printks with a simple rule to just decap and then encap an Ethernet header we see this with the code as it is right now:

[13568.973807] __ovs_nla_copy_actions:3007 <- decap
[13568.973812] __ovs_nla_copy_actions:3012 <- decap succeeds but sets mac_proto = MAC_PROTO_ETHERNET
[13568.973815] __ovs_nla_copy_actions:2999 <- encap
[13568.973818] openvswitch: netlink: Flow actions may not be safe on all matching packets. <- returns -EINVAL

Note that the decap happens at lines 3007-3012 and is successful. However, the very next encap action starting at line 2999 does not finish and returns -EINVAL, so a printk at line 3002 does not execute.

If I change the code as you suggested the flow of decap/encap works without complaint and without returning -EINVAL:

[13838.435051] __ovs_nla_copy_actions:3007 <- decap
[13838.435054] __ovs_nla_copy_actions:3012 <- decap succeeds and sets mac_proto = MAC_PROTO_NONE
[13838.435055] __ovs_nla_copy_actions:2999 <- encap
[13838.435056] __ovs_nla_copy_actions:3002 <- encap succeeds and sets mac_proto = MAC_PROTO_ETHERNET

Thank you for finding this bug. Do you wish to send the patch to fix it or would you prefer me to do it?

Regards,

- Greg

> -Original Message-
> From: Gregory Rose
> To: ovs-discuss@openvswitch.org, jcaam...@suse.de
> Subject: Re: [ovs-discuss] Problems executing decap(eth)+encap(eth) actions
> Date: Fri, 26 Oct 2018 15:42:51 -0700
>
> On 10/19/2018 1:39 AM, Jaime Caamaño Ruiz wrote:
> > Hello
> >
> > When using nsh encapsulation, it's useful to normalize your pipeline
> > to packet_type=nsh, popping an Ethernet header on input if necessary
> > and pushing an Ethernet header again if required before output.
> >
> > But it seems to be problematic:
> >
> > ---
> > 2018-10-18T13:10:59.196Z|00010|dpif(handler3)|WARN|system@ovs-system: execute pop_eth,push_eth(src=fe:16:3e:c1:9e:87,dst=fa:16:3e:c1:9e:87),5 failed (Invalid argument) on packet vlan_tci=0x,dl_src=fa:16:3e:c2:e6:68,dl_dst=fe:16:3e:c2:e6:68,dl_type=0x894f,nsh_flags=0,nsh_ttl=63,nsh_mdtype=1,nsh_np=3,nsh_spi=0x1a,nsh_si=254,nsh_c1=0xc0a82a01,nsh_c2=0x3,nsh_c3=0x0,nsh_c4=0x9100,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
> > with metadata skb_priority(0),tunnel(tun_id=0x0,src=192.168.42.1,dst=192.168.42.3,ttl=64,tp_src=47656,tp_dst=4789,flags(key)),skb_mark(0),in_port(4) mtu 0
> > ---
> >
> > Looking at the code datapath/flow_netlink.c @ __ovs_nla_copy_actions:
> >
> >     case OVS_ACTION_ATTR_PUSH_ETH:
> >         /* Disallow pushing an Ethernet header if one
> >          * is already present */
> >         if (mac_proto != MAC_PROTO_NONE)
> >             return -EINVAL;
> >         mac_proto = MAC_PROTO_NONE;
> >         break;
> >
> >     case OVS_ACTION_ATTR_POP_ETH:
> >         if (mac_proto != MAC_PROTO_ETHERNET)
> >             return -EINVAL;
> >         if (vlan_tci & htons(VLAN_TAG_PRESENT))
> >             return -EINVAL;
> >         mac_proto = MAC_PROTO_ETHERNET;
> >         break;
> >
> > Isn't the mac_proto set inverted here, shouldn't it look like this?
> >
> >     case OVS_ACTION_ATTR_PUSH_ETH:
> >         /* Disallow pushing an Ethernet header if one
> >          * is already present */
> >         if (mac_proto != MAC_PROTO_NONE)
> >             return -EINVAL;
> >         mac_proto = MAC_PROTO_ETHERNET;
> >         break;
> >
> >     case OVS_ACTION_ATTR_POP_ETH:
> >         if (mac_proto != MAC_PROTO_ETHERNET)
> >             return -EINVAL;
> >         if (vlan_tci & htons(VLAN_TAG_PRESENT))
> >             return -EINVAL;
> >         mac_proto = MAC_PROTO_NONE;
> >         break;
>
> Jaime,
>
> I am looking into this and at first sight this does look inverted but
> we have no other
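For readers following the thread, the header-tracking logic under discussion can be sketched outside the kernel as below. This is a minimal standalone illustration using the corrected assignments, not the datapath code: validate_actions, enum action and the ACT_* names are made up for the sketch, the enum values are stand-ins for the kernel's definitions, and the VLAN check from the original POP_ETH case is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel datapath definitions. */
enum mac_proto { MAC_PROTO_NONE, MAC_PROTO_ETHERNET };
enum action { ACT_PUSH_ETH, ACT_POP_ETH };

/*
 * Walk an action list while tracking whether the packet currently
 * carries an Ethernet header. Returns 0 if every action is legal in
 * sequence, or -EINVAL at the first action that is not.
 */
int validate_actions(const enum action *acts, size_t n,
                     enum mac_proto mac_proto)
{
    for (size_t i = 0; i < n; i++) {
        switch (acts[i]) {
        case ACT_PUSH_ETH:
            /* Disallow pushing a header if one is already present. */
            if (mac_proto != MAC_PROTO_NONE)
                return -EINVAL;
            /* After a push the packet has an Ethernet header. */
            mac_proto = MAC_PROTO_ETHERNET;
            break;
        case ACT_POP_ETH:
            /* Disallow popping a header that is not there. */
            if (mac_proto != MAC_PROTO_ETHERNET)
                return -EINVAL;
            /* After a pop the packet has no Ethernet header. */
            mac_proto = MAC_PROTO_NONE;
            break;
        }
    }
    return 0;
}
```

With this tracking, a pop_eth followed by a push_eth on an Ethernet packet validates (returns 0), while a push_eth on a packet that still has its Ethernet header is rejected with -EINVAL, which matches the printk traces quoted in the thread.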
Re: [ovs-discuss] OVS bridges in docker containers segfault when dpdkvhostuser port is added.
> On Thu, Oct 25, 2018 at 09:51:38PM +0200, Alan Kayahan wrote:
> > Hello,
> >
> > I have 3 OVS bridges on the same host, connected to each other as
> > br1<->br2<->br3. br1 and br3 are connected to the docker container cA
> > via dpdkvhostuser port type (I know it is deprecated, the app works
> > this way only). The DPDK app running in cA generates packets, which
> > traverse bridges br1->br2->br3, then end up back at the DPDK app.
> > This setup works fine.
> >
> > Now I am trying to put each OVS bridge into its respective docker
> > container. I connect the containers with veth pairs, then add the veth
> > ports to the bridges. Next, I add a dpdkvhostuser port named SRC to
> > br1, so far so good. The moment I add a dpdkvhostuser port named SNK
> > to br3, ovs-vswitchd services in br1's and br3's containers segfault.
> > Following are the backtraces from each,

What version of OVS and DPDK are you using?

> > --br1's container---
> >
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > Core was generated by `ovs-vswitchd unix:/usr/local/var/run/openvswitch/db.sock -vconsole:emer -vsyslo'.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0 0x5608fa0f321b in netdev_rxq_recv (rx=0x7ff13c34ee80, batch=batch@entry=0x7ff1bbb4d890) at lib/netdev.c:702
> > 702 retval = rx->netdev->netdev_class->rxq_recv(rx, batch);
> > [Current thread is 1 (Thread 0x7ff1bbb4e700 (LWP 376))]
> > (gdb) bt
> > #0 0x5608fa0f321b in netdev_rxq_recv (rx=0x7ff13c34ee80, batch=batch@entry=0x7ff1bbb4d890) at lib/netdev.c:702
> > #1 0x5608fa0cce65 in dp_netdev_process_rxq_port (pmd=pmd@entry=0x7ff1bbb4f010, rxq=0x5608fb651be0, port_no=1) at lib/dpif-netdev.c:3279
> > #2 0x5608fa0cd296 in pmd_thread_main (f_=) at lib/dpif-netdev.c:4145
> > #3 0x5608fa14a836 in ovsthread_wrapper (aux_=) at lib/ovs-thread.c:348
> > #4 0x7ff1c52517fc in start_thread (arg=0x7ff1bbb4e700) at pthread_create.c:465
> > #5 0x7ff1c4815b5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> >
> > --br3's container---
> >
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > Core was generated by `ovs-vswitchd unix:/usr/local/var/run/openvswitch/db.sock -vconsole:emer -vsyslo'.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0 0x55c517e3abcb in rte_mempool_free_memchunks ()
> > [Current thread is 1 (Thread 0x7f202351f300 (LWP 647))]
> > (gdb) bt
> > #0 0x55c517e3abcb in rte_mempool_free_memchunks ()
> > #1 0x55c517e3ad46 in rte_mempool_free.part ()
> > #2 0x55c518218b78 in dpdk_mp_free (mp=0x7f603fe66a00) at lib/netdev-dpdk.c:599
> > #3 0x55c518218ff0 in dpdk_mp_free (mp=) at lib/netdev-dpdk.c:593
> > #4 netdev_dpdk_mempool_configure (dev=0x7f1f7ffeac00) at lib/netdev-dpdk.c:629
> > #5 0x55c51821a98d in dpdk_vhost_reconfigure_helper (dev=0x7f1f7ffeac00) at lib/netdev-dpdk.c:3599
> > #6 0x55c51821ac8b in netdev_dpdk_vhost_reconfigure (netdev=0x7f1f7ffebcc0) at lib/netdev-dpdk.c:3624
> > #7 0x55c51813fe6b in port_reconfigure (port=0x55c51a4522a0) at lib/dpif-netdev.c:3341
> > #8 reconfigure_datapath (dp=dp@entry=0x55c51a46efc0) at lib/dpif-netdev.c:3822
> > #9 0x55c5181403e8 in do_add_port (dp=dp@entry=0x55c51a46efc0, devname=devname@entry=0x55c51a456520 "SNK", type=0x55c51834f7bd "dpdkvhostuser", port_no=port_no@entry=1) at lib/dpif-netdev.c:1584
> > #10 0x55c51814059b in dpif_netdev_port_add (dpif=, netdev=0x7f1f7ffebcc0, port_nop=0x7fffb4eef68c) at lib/dpif-netdev.c:1610
> > #11 0x55c5181469be in dpif_port_add (dpif=0x55c51a469350, netdev=netdev@entry=0x7f1f7ffebcc0, port_nop=port_nop@entry=0x7fffb4eef6ec) at lib/dpif.c:579
> > ---Type to continue, or q to quit---
> > #12 0x55c5180f9f28 in port_add (ofproto_=0x55c51a464ee0, netdev=0x7f1f7ffebcc0) at ofproto/ofproto-dpif.c:3645
> > #13 0x55c5180ecafe in ofproto_port_add (ofproto=0x55c51a464ee0, netdev=0x7f1f7ffebcc0, ofp_portp=ofp_portp@entry=0x7fffb4eef7e8) at ofproto/ofproto.c:1999
> > #14 0x55c5180d97e6 in iface_do_create (errp=0x7fffb4eef7f8, netdevp=0x7fffb4eef7f0, ofp_portp=0x7fffb4eef7e8, iface_cfg=0x55c51a46d590, br=0x55c51a4415b0) at vswitchd/bridge.c:1799
> > #15 iface_create (port_cfg=0x55c51a46e210, iface_cfg=0x55c51a46d590, br=0x55c51a4415b0) at vswitchd/bridge.c:1837
> > #16 bridge_add_ports__ (br=br@entry=0x55c51a4415b0, wanted_ports=wanted_ports@entry=0x55c51a441690, with_requested_port=with_requested_port@entry=true) at vswitchd/bridge.c:931
> > #17 0x55c5180db87a in bridge_add_ports
Re: [ovs-discuss] OVN SB DB server overload when restarted at large scale environment
On Tue, Oct 30, 2018 at 11:15 AM Ben Pfaff wrote:
>
> On Wed, Oct 24, 2018 at 05:42:15PM -0700, Han Zhou wrote:
> > On Tue, Sep 25, 2018 at 10:18 AM Han Zhou wrote:
> > >
> > > On Thu, Sep 20, 2018 at 4:43 PM Ben Pfaff wrote:
> > > >
> > > > On Thu, Sep 13, 2018 at 12:28:27PM -0700, Han Zhou wrote:
> > > > > In scalability tests with ovn-scale-test, ovsdb-server SB load is
> > > > > not a problem, at least with 1k HVs. However, if we restart the
> > > > > ovsdb-server, then depending on the number of HVs and the scale of
> > > > > logical objects, e.g. the number of logical ports, the SB
> > > > > ovsdb-server becomes an obvious bottleneck.
> > > > >
> > > > > In our test with 1k HVs and 20k logical ports (200 lports * 100
> > > > > lswitches connected by one single logical router), restarting the
> > > > > SB ovsdb-server resulted in 100% CPU for ovsdb-server for more
> > > > > than 1 hour. All HVs (and northd) are reconnecting and resyncing
> > > > > the large amount of data at the same time. Considering the amount
> > > > > of data and the JSON-RPC cost, this is not surprising.
> > > > >
> > > > > At this scale, the SB ovsdb-server process has RES 303848KB before
> > > > > restart. It is likely that a big proportion of this size is SB DB
> > > > > data that is going to be transferred to all 1,001 clients, which
> > > > > is about 300GB. With a 10Gbps NIC, even the pure network
> > > > > transmission would take ~5 minutes. Considering that the actual
> > > > > size of the JSON-RPC messages would be much bigger than the raw
> > > > > data, and the processing cost of the single-threaded ovsdb-server,
> > > > > 1 hour is reasonable.
> > > > >
> > > > > In addition to the CPU cost of ovsdb-server, the memory
> > > > > consumption could also be a problem. Since all clients are syncing
> > > > > data from it, probably due to the buffering, RES increases
> > > > > quickly, and spiked to 10G at some point.
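The back-of-envelope transfer estimate quoted above can be reproduced with a few lines of C. This is only a sketch of the arithmetic: the function names are invented for illustration, and it assumes the RES figure is in units of 1024 bytes, that every client downloads the whole database exactly once, and that the NIC runs at full line rate with no protocol or JSON overhead.

```c
#include <stdint.h>

/*
 * Total bytes sent if each client downloads the full database once.
 * db_kib is the database size in units of 1024 bytes (an assumption
 * about how the RES figure in the thread is reported).
 */
uint64_t total_transfer_bytes(uint64_t db_kib, uint64_t clients)
{
    return db_kib * 1024u * clients;
}

/*
 * Lower bound on pure wire time, in seconds, for sending `bytes`
 * over a link of `link_bits_per_sec`. Ignores framing, TCP and
 * JSON-RPC overhead, so real transfers take longer.
 */
double min_transfer_seconds(uint64_t bytes, double link_bits_per_sec)
{
    return (double)bytes * 8.0 / link_bits_per_sec;
}
```

For 303848 KB, 1,001 clients and a 10 Gbps link this gives roughly 311 GB and a bit over four minutes of pure wire time, the same order of magnitude as the ~5 minutes estimated in the thread.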
> > > > > After all the syncing finished, RES went back to a similar size
> > > > > as before the restart. The client side (ovn-controller, northd)
> > > > > was also seeing a memory spike: it is a huge JSON-RPC message for
> > > > > the new snapshot of the whole DB to be downloaded, so it is just
> > > > > buffered until the whole message is received. RES peaked at double
> > > > > its original size, and then went back to the original size after
> > > > > the first round of processing of the new snapshot. This means that
> > > > > when deploying OVN, this memory spike should be considered for the
> > > > > SB DB restart scenario, especially on the central node.
> > > > >
> > > > > Here is some of my brainstorming on how we could improve on this
> > > > > (very rough at this stage).
> > > > > There are two directions: 1) reducing the size of data to be
> > > > > transferred. 2) scaling out ovsdb-server.
> > > > >
> > > > > 1) Reducing the size of data to be transferred.
> > > > >
> > > > > 1.1) Using BSON instead of JSON. It could reduce the size of data,
> > > > > but not sure yet how much it could help since most of the data are
> > > > > strings. It might be even worse since the bottleneck is not yet
> > > > > the network bandwidth but the processing power of ovsdb-server.
> > > > >
> > > > > 1.2) Move northd processing to HVs - only relevant NB data needs
> > > > > to be transferred, which is much smaller than the SB DB because
> > > > > there are no logical flows. However, this would lead to more
> > > > > processing load on ovn-controller on HVs. Also, it is a big/huge
> > > > > architecture change.
> > > > >
> > > > > 1.3) Incremental data transfer. The way IDL works is like a cache.
> > > > > Now when the connection is reset the cache has to be rebuilt. But
> > > > > if we know the version of the current snapshot, even when the
> > > > > connection is reset, the client can still communicate with the
> > > > > newly started server to tell the difference between the current
> > > > > data and the new data, so that only the delta is transferred, as
> > > > > if the server had not restarted at all.
> > > > >
> > > > > 2) Scaling out the ovsdb-server.
> > > > >
> > > > > 2.1) Currently ovsdb-server is single-threaded, so that single
> > > > > thread has to take care of transmission to all clients with 100%
> > > > > CPU. If it were multi-threaded, more cores could be utilized to
> > > > > make this much faster.
> > > > >
> > > > > 2.2) Using ovsdb cluster. This feature is supported already but I
> > > > > haven't tested it in this scenario yet. If everything works as
> > > > > expected, there can be 3 - 5 servers sharing the load, so the
> > > > > transfer should be completed 3 - 5 times faster than it is right
> > > > > now. However, there is a limit on how many nodes there can be in a
> > > > > cluster, so the problem can be alleviated but may still be a
> > > > > problem if the data size gets bigger.
> > > > >
> > > > > 2.3) Using readonly copies of ovsdb replications. If ovn-controller
> > > > > connects to readonly copies,