Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
Thanks a lot Han! Great and swift work! I'm testing them now, will let you know ASAP.

On Thu, Mar 1, 2018 at 8:39 AM, Han Zhou wrote:
> Here are the patches:
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=31165

___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Mon, Feb 26, 2018 at 12:05 PM, Ben Pfaff wrote:
> I think that we can make ovn-controller gracefully tolerate that.
>
> Let's try this implementation. I'm not excited about having a new table
> for this purpose, but it sounds like the advantages may be worthwhile.

Here are the patches:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=31165

Thanks,
Han
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Fri, Feb 23, 2018 at 03:51:28PM -0800, Han Zhou wrote:
> There is another tricky point I see while working on the implementation. In
> Neutron, an SG can be applied to ports across different networks, but in
> OVN lports work only on their own datapath, so in ovn-controller we need
> to be able to pick up the related ports from the port group when
> translating lflows for each datapath. I hope this is not an issue.
> Otherwise, the Neutron plugin will have to divide the group of ports into
> sub-groups according to the lswitch they belong to, which would be a pain.

I think that we can make ovn-controller gracefully tolerate that.

Let's try this implementation. I'm not excited about having a new table
for this purpose, but it sounds like the advantages may be worthwhile.
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Fri, Feb 23, 2018 at 2:17 PM, Ben Pfaff wrote:
> Will we want sets of Logical_Router_Ports later?

At least I don't see any use case in Neutron for router ports, since
Security Groups apply only to VIF ports.

There is another tricky point I see while working on the implementation. In
Neutron, an SG can be applied to ports across different networks, but in OVN
lports work only on their own datapath, so in ovn-controller we need to be
able to pick up the related ports from the port group when translating
lflows for each datapath. I hope this is not an issue. Otherwise, the
Neutron plugin will have to divide the group of ports into sub-groups
according to the lswitch they belong to, which would be a pain.

Thanks,
Han
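The per-datapath sub-grouping described here (whether done inside ovn-controller or by the Neutron plugin) can be sketched in Python. This is only an illustration; `LPort` and its `lswitch` attribute are stand-ins, not the actual IDL row classes or column names:

```python
from collections import defaultdict

# Illustrative stand-in for a logical switch port; real IDL rows carry a
# reference to their Logical_Switch row rather than a plain name.
class LPort:
    def __init__(self, name, lswitch):
        self.name = name
        self.lswitch = lswitch

def split_by_lswitch(port_group):
    """Divide one port group into per-datapath sub-groups -- the work
    that would be done when translating lflows for each datapath."""
    subgroups = defaultdict(list)
    for lport in port_group:
        subgroups[lport.lswitch].append(lport.name)
    return dict(subgroups)

pg = [LPort("p1", "ls1"), LPort("p2", "ls2"), LPort("p3", "ls1")]
print(split_by_lswitch(pg))  # {'ls1': ['p1', 'p3'], 'ls2': ['p2']}
```

If ovn-controller does this filtering itself, the Neutron plugin can keep one flat port group per Security Group, which is the outcome Han is hoping for.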
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Tue, Feb 20, 2018 at 08:56:42AM -0800, Han Zhou wrote:
> I feel it is better to add a new table for port groups explicitly, and the
> column type can be a set of weak references to Logical_Switch_Port.
> [...]
> How does this sound?

Will we want sets of Logical_Router_Ports later?
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On 20 Feb 2018, at 19:50, Lucas Alvares Gomes wrote:
> +1 from me, I quite like it! It would solve both problems raised in
> this thread and won't break any existing code.
>
> (Also, thanks for all the previous suggestions @Ben, @Han and @Daniel.
> This thread has been really helpful!)

+1 from me too! Even though I understand the reasons brought up by Ben, I
think this might be the right plan forward! We can try to work on it
together, Han.

Thanks a lot guys,
Daniel
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
Hi,

On Tue, Feb 20, 2018 at 4:56 PM, Han Zhou wrote:
> I feel it is better to add a new table for port groups explicitly, and the
> column type can be a set of weak references to Logical_Switch_Port.
> [...]
> How does this sound?

+1 from me, I quite like it! It would solve both problems raised in
this thread and won't break any existing code.

(Also, thanks for all the previous suggestions @Ben, @Han and @Daniel.
This thread has been really helpful!)

Cheers,
Lucas
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Tue, Feb 20, 2018 at 8:15 AM, Ben Pfaff wrote:
> I wouldn't rename the table. It sounds like the priority should be to
> add support for sets of port names. I thought that there was already a
> patch for that to be rebased, but maybe I misunderstood.

I feel it is better to add a new table for port groups explicitly, where the
column type can be a set of weak references to Logical_Switch_Port.
The benefits are:

- Better data integrity: deleting an lport automatically deletes it from the
  port group.
- No confusion about the type of records in a single table.
- The existing Address_Set mechanism will continue to be supported without
  any change.
- Furthermore, the race condition brought up by Lucas can be solved by
  supporting port groups in IP-address match conditions in ovn-controller,
  so that all addresses of the lports are used just like Address_Sets are
  used today, and there is no need for Neutron networking-ovn to use
  Address_Sets any more. Since the addresses are deduced from the lports,
  the ordering of deleting/adding no longer matters.

How does this sound?

Thanks,
Han
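The data-integrity benefit of weak references can be shown with a toy model (this is an illustration of the semantics, not the ovsdb-server implementation): when an lport row is deleted, every referring port group drops it automatically, so no client can be left holding a stale member.

```python
# Toy in-memory model of a NB database with weak-reference semantics
# for a hypothetical Port_Group table.
class Database:
    def __init__(self):
        self.lports = {}        # name -> Logical_Switch_Port row
        self.port_groups = {}   # name -> set of member lport names

    def add_lport(self, name):
        self.lports[name] = {"name": name}

    def add_port_group(self, pg_name, members):
        self.port_groups[pg_name] = set(members)

    def delete_lport(self, name):
        del self.lports[name]
        # Weak-reference semantics: the deleted row vanishes from
        # every referring port group without any client involvement.
        for members in self.port_groups.values():
            members.discard(name)

db = Database()
for p in ("p1", "p2", "p3"):
    db.add_lport(p)
db.add_port_group("sg1", ["p1", "p2"])
db.delete_lport("p1")
assert db.port_groups["sg1"] == {"p2"}
```

With a name-based set (as in Address_Set today), the second step would instead require the client to remove the entry itself, which is where the ordering races discussed earlier come from.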
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Mon, Feb 19, 2018 at 11:33:11AM +0100, Daniel Alvarez Sanchez wrote:
> @Han, I can try to rebase the patch if you want, but that was
> basically renaming the Address_Set table and, from Ben's
> comment, it may be better to keep the name. Not sure,
> however, how we can proceed to address Lucas' points in
> this thread.

I wouldn't rename the table. It sounds like the priority should be to
add support for sets of port names. I thought that there was already a
patch for that to be rebased, but maybe I misunderstood.
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Tue, Feb 13, 2018 at 8:32 PM, Ben Pfaff wrote:
> On Tue, Feb 13, 2018 at 12:39:56PM +0100, Daniel Alvarez Sanchez wrote:
> > Hi folks,
> >
> > As we're doing some performance tests in OpenStack using OVN,
> > we noticed that as we keep creating ports, the time for creating a
> > single port increases. Also, ovn-northd CPU consumption is quite
> > high (see [0], which shows the CPU consumption when creating
> > 1000 ports and deleting them; the last part, where CPU is at 100%, is
> > when all the ports get deleted).
> >
> > With 500 ports in the same Logical Switch, I did some profiling
> > of OpenStack neutron-server adding 10 more ports to that Logical
> > Switch. Currently, neutron-server spawns different API workers
> > (separate processes) which open connections to OVN NB, so every
> > time an update message is sent from ovsdb-server it'll be processed
> > by all of them.
> >
> > In my profiling, I used GreenletProfiler in all those processes to
> > produce a trace file and then merged all of them together to aggregate
> > the results. In those tests I used the OVS master branch compiled with
> > shared libraries to make use of the JSON C parser. Still, I've seen
> > that most of the time's been spent in the following two modules:
> >
> >   - python/ovs/db/data.py: 33%
> >   - uuid.py: 21%
> >
> > For the data.py module, this is the usage (self time):
> >
> >   Atom.__lt__      16.25%     8283 calls
> >   from_json:118     6.18%   406935 calls
> >   Atom.__hash__     3.48%  1623832 calls
> >   from_json:328     2.01%     5040 calls
> >
> > While for the uuid module:
> >
> >   UUID.__cmp__     12.84%  3570975 calls
> >   UUID.__init__     4.06%   362541 calls
> >   UUID.__hash__     2.96%     1800 calls
> >   UUID.__str__      1.03%   355016 calls
> >
> > Most of the calls to Atom.__lt__ come from
> > BaseOvnIdl.__process_update2 (idl.py) -> BaseOvnIdl.__row_update
> > (idl.py) -> Datum.__cmp__ (data.py) -> Atom.__cmp__ (data.py).
>
> I don't know Python well enough to tell whether these are "real" or
> correct optimizations, but the following strike me as possible
> low-hanging-fruit micro-optimizations:
>
> diff --git a/python/ovs/db/data.py b/python/ovs/db/data.py
> index 9e57595f7513..dc816f64708f 100644
> --- a/python/ovs/db/data.py
> +++ b/python/ovs/db/data.py
> @@ -76,12 +76,12 @@ class Atom(object):
>      def __eq__(self, other):
>          if not isinstance(other, Atom) or self.type != other.type:
>              return NotImplemented
> -        return True if self.value == other.value else False
> +        return self.value == other.value
>
>      def __lt__(self, other):
>          if not isinstance(other, Atom) or self.type != other.type:
>              return NotImplemented
> -        return True if self.value < other.value else False
> +        return self.value < other.value
>
>      def __cmp__(self, other):
>          if not isinstance(other, Atom) or self.type != other.type:
>
> We could also reduce the number of calls to Datum.__cmp__ by recognizing
> that if we've already found one change we don't have to do any more
> comparisons, something like this:
>
> diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py
> index 60548bcf50b6..2e550adfdf1c 100644
> --- a/python/ovs/db/idl.py
> +++ b/python/ovs/db/idl.py
> @@ -447,6 +447,7 @@ class Idl(object):
>              raise error.Error("<table-updates> is not an object",
>                                table_updates)
>
> +        changed = False
>          for table_name, table_update in six.iteritems(table_updates):
>              table = self.tables.get(table_name)
>              if not table:
> @@ -472,8 +473,8 @@ class Idl(object):
>                                    % (table_name, uuid_string))
>
>              if version == OVSDB_UPDATE2:
> -                if self.__process_update2(table, uuid, row_update):
> -                    self.change_seqno += 1
> +                changed = self.__process_update2(table, uuid, row_update,
> +                                                 changed)
>                  continue
>
>              parser = ovs.db.parser.Parser(row_update, "row-update")
> @@ -485,12 +486,12 @@ class Idl(object):
>                  raise error.Error('<row-update> missing "old" and '
>                                    '"new" members', row_update)
>
> -            if self.__process_update(table, uuid, old, new):
> -                self.change_seqno += 1
> +            changed = self.__process_update(table, uuid, old, new,
> +                                            changed)
> +        if changed:
> +            self.change_seqno += 1
>
> -    def __process_update2(self, table, uuid, row_update):
> +    def __process_update2(self, table, uuid, row_update, changed):
>          row = table.rows.get(uuid)
> -        changed = False
>          if "delete" in row_update:
>              if row:
>                  del table.rows[uuid]
> @@ -511,9 +512,8 @@ class Idl(object):
>              else:
>                  row_update = row_update['initial
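Ben's first hunk can be exercised in isolation. The sketch below is a pared-down stand-in for the real `Atom` class in python/ovs/db/data.py (only the two methods the diff touches), showing that the simplified returns behave identically while avoiding the extra conditional on what the profile shows to be a very hot path:

```python
# Minimal stand-in for python/ovs/db/data.py's Atom, with the comparison
# methods written in the simplified form from Ben's diff.
class Atom:
    def __init__(self, type_, value):
        self.type = type_
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Atom) or self.type != other.type:
            return NotImplemented
        return self.value == other.value   # was: True if ... else False

    def __lt__(self, other):
        if not isinstance(other, Atom) or self.type != other.type:
            return NotImplemented
        return self.value < other.value    # was: True if ... else False

a, b = Atom("string", "x"), Atom("string", "y")
# Comparing values of a comparable type already yields a bool, so the
# `True if ... else False` wrapper was pure overhead.
assert (a < b) is True
assert (a == b) is False
```

The `NotImplemented` return is what lets Python fall back to the other operand's reflected comparison when the types don't match, so dropping the conditional expression changes nothing observable.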
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
Just for completeness, I did some tests with and without the JSON parser C
extension [0]. There's no significant gain at this point; maybe when we add
the port groups it will be more noticeable. The average time without it has
been 2.09s, while with the C extension it's been 2.03s.

@Han, I can try to rebase the patch if you want, but that was basically
renaming the Address_Set table and, from Ben's comment, it may be better to
keep the name. Not sure, however, how we can proceed to address Lucas'
points in this thread.

Thanks,
Daniel

[0] https://imgur.com/a/etb5M

On Fri, Feb 16, 2018 at 6:33 PM, Han Zhou wrote:
> Hi Daniel,
>
> Thanks for the detailed profiling!
>
> I think this explains why the time spent grows linearly with the number
> of lports. Each lport has its own ACLs added to the same LS, so the acl
> column in the row gets bigger and bigger. It is likely that the port
> group optimization would solve this problem.
>
> Thanks,
> Han
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
Hi Daniel,

Thanks for the detailed profiling!

On Fri, Feb 16, 2018 at 6:50 AM, Daniel Alvarez Sanchez wrote:
> About the duplicated processing of the update2 messages, I've verified
> that those are not always present. I've isolated the scenario further and
> did tcpdump and debugging on the exact process which is sending the
> 'transact' command, and I see no update2 processing duplicates. Among the
> rest of the workers, one of them is always getting them duplicated while
> the rest don't; I don't know why.
>
> However, the 'modify' in the LS table for updating the acls set is the
> one always taking 2-3 seconds under this load.

I think this explains why the time spent grows linearly with the number of
lports. Each lport has its own ACLs added to the same LS, so the acl column
in the row gets bigger and bigger. It is likely that the port group
optimization would solve this problem.

Thanks,
Han
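A toy model of Han's point, assuming set-valued 'modify' deltas are applied as a symmetric difference against the existing column (a sketch of the update semantics, not the real IDL code): the work per port creation tracks the accumulated size of the `acls` column, not the handful of new ACLs.

```python
# Rough model of why per-port processing time grows linearly: each port
# adds ~8 ACL references to the same Logical_Switch row's acls set, so
# each 'modify' has to walk an ever-larger set.
def apply_modify(column, delta):
    """Apply a set-column 'modify' delta as a symmetric difference."""
    return column.symmetric_difference(delta)

acls = set()
work = []
for port in range(3):
    new_acls = {f"acl-{port}-{i}" for i in range(8)}
    # Record how many elements the update has to touch.
    work.append(len(acls) + len(new_acls))
    acls = apply_modify(acls, new_acls)

print(work)  # [8, 16, 24] -- grows with every port created
```

With a port group referenced from the ACL match instead, each new port would touch the (small) group membership rather than a 6400-element `acls` set.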
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
About the duplicated processing of the update2 messages, I've verified that
those are not always present. I've isolated the scenario further and did
tcpdump and debugging on the exact process which is sending the 'transact'
command, and I see no update2 processing duplicates. Among the rest of the
workers, one of them is always getting them duplicated while the rest
don't; I don't know why.

However, the 'modify' in the LS table for updating the acls set is the one
always taking 2-3 seconds under this load.

On Fri, Feb 16, 2018 at 12:27 PM, Daniel Alvarez Sanchez wrote:
> On Fri, Feb 16, 2018 at 12:12 PM, Daniel Alvarez Sanchez wrote:
>> I've found out more about what is running slow in this scenario.
>> I've profiled the processing of the update2 messages, and here you can
>> see the sequence of calls to __process_update2 (idl.py) when I'm
>> creating a new port via OpenStack on a system loaded with 800 ports
>> on the same Logical Switch:
>>
>> 1. Logical_Switch_Port 'insert'
>> 2. Address_Set 'modify' (to add the new address; takes around 0.2
>>    seconds)
>> 3. Logical_Switch 'modify' (to add the new 8 ACLs) <- This takes > 2
>>    seconds
>> 4. ACL 'insert' x8 (one per ACL)
>> 5. Logical_Switch_Port 'modify' ('up' = False)
>> 6. Logical_Switch_Port 'insert' (exactly the same as 1, so it'll be
>>    skipped)
>> 7. Address_Set 'modify' (exactly the same as 2, so it'll be skipped;
>>    still takes ~0.05-0.01 s)
>> 8. Logical_Switch 'modify' (to add the new 8 ACLs, same as 3; still
>>    takes ~0.5 seconds)
>> 9. ACL 'insert' x8 (one per ACL, same as 4)
>> 10. Logical_Switch_Port 'modify' ('up' = False), same as 5
>> 11. Port_Binding (SB) 'insert'
>> 12. Port_Binding (SB) 'insert' (same as 11)
>>
>> Half of those are dups, and even though they are no-ops, they consume
>> time. The most expensive operation is adding the new 8 ACLs to the acls
>> set in the Logical_Switch table (800 ports with 8 ACLs each makes that
>> set have 6400 elements).
>>
>> NOTE: As you can see, we're trying to insert the address again into
>> Address_Sets, so we should bear this in mind if we go ahead with Lucas'
>> suggestion about allowing dups here.
>>
>> It's obvious that we'll gain a lot by using port sets for ACLs, but
>> we'll also need to invest some time in finding out why we're getting
>> dups and in optimizing the process_update2 method and its callees to
>> make it faster. With those last two things I guess we can improve the
>> performance a lot. Also, creating a Python C binding for this module
>> could help, but that seems like a lot of work, and we would still need
>> to convert C structures to/from Python objects. However, inserting into
>> or identifying dups in large sets would be way faster.
>>
>> I'm going to try out Ben's suggestions for optimizing the
>> process_update* methods, and will also try to dig further into the
>> dups. As process_update* seems a bit expensive, calling it 26 times for
>> a single port is a lot:
>>
>> 26 calls = 2*(1 (LS insert) + 1 (AS modify) + 1 (LS modify) + 8 (ACL
>> insert) + 1 (LSP modify) + 1 (PB insert))
>>
>> Thoughts?
>> Thanks!
>> Daniel
>>
>> On Thu, Feb 15, 2018 at 10:56 PM, Daniel Alvarez Sanchez wrote:
>>> On Wed, Feb 14, 2018 at 9:34 PM, Han Zhou wrote:
>>>> On Wed, Feb 14, 2018 at 9:45 AM, Ben Pfaff wrote:
>>>>> On Wed, Feb 14, 2018 at 11:27:11AM +0100, Daniel Alvarez Sanchez
>>>>> wrote:
>>>>>> Thanks for your inputs. I need to look more carefully into the
>>>>>> patch you submitted, but it looks like, at least, we'll be reducing
>>>>>> the number of calls to Datum.__cmp__, which should be good.
>>>>>
>>>>> Thanks. Please do take a look. It's a micro-optimization, but maybe
>>>>> it'll help?
>>>>>
>>>>>> I probably didn't explain it very well. Right now we have N
>>>>>> processes for Neutron server (in every node). Each of those opens a
>>>>>> connection to the NB db and subscribes to updates from certain
>>>>>> tables. Each time a change happens, ovsdb-server will send N
>>>>>> update2 messages that have to be processed in this "expensive" way
>>>>>> by each of those N processes. My proposal (yet to be refined) would
>>>>>> be to open N+1 connections to ovsdb-server and only subscribe to
>>>>>> notifications on 1 of them. So every time a change happens,
>>>>>> ovsdb-server will send 1 update2 message. This message will be
>>>>>> processed (using the Py IDL as we do now) and, once processed, sent
>>>>>> (mcast maybe?) to the rest of the N processes. This msg could be
>>>>>> simply a serialized Python object, and we'd be saving all this
>>>>>> Datum, Atom, etc. process
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Fri, Feb 16, 2018 at 12:12 PM, Daniel Alvarez Sanchez < dalva...@redhat.com> wrote: > I've found out more about what is running slow in this scenario. I've profiled the processing of the update2 messages, and here you can see the sequence of calls to __process_update2 (idl.py) when I'm creating a new port via OpenStack on a system loaded with 800 ports on the same Logical Switch: > 1. Logical_Switch_Port 'insert' > 2. Address_Set 'modify' (to add the new address; takes ~0.2 seconds) > 3. Logical_Switch_Port 'modify' (to add the new 8 ACLs) <- This takes > 2 seconds > Sorry, this is ^ the Logical_Switch table. > 4. ACL 'insert' x8 (one per ACL) > 5. Logical_Switch_Port 'modify' ('up' = False) > 6. Logical_Switch_Port 'insert' (exactly the same as 1, so it'll be skipped) > 7. Address_Set 'modify' (exactly the same as 2, so it'll be skipped) still takes ~0.01-0.05 s > 8. Logical_Switch 'modify' (to add the new 8 ACLs, same as 3) still takes ~0.5 seconds > This too ^ > 9. ACL 'insert' x8 (one per ACL, same as 4) > 10. Logical_Switch_Port 'modify' ('up' = False) same as 5 > 11. Port_Binding (SB) 'insert' > 12. Port_Binding (SB) 'insert' (same as 11) > Half of those are dups, and even though they are no-ops, they consume time. The most expensive operation is adding the new 8 ACLs to the acls set in the LS table (800 ports with 8 ACLs each makes that set have 6400 elements). > NOTE: As you can see, we're trying to insert the address again into Address_Set, so we should bear this in mind if we go ahead with Lucas' suggestion about allowing dups here. > It's obvious that we'll gain a lot by using Port Sets for ACLs, but we'll also need to invest some time in finding out why we're getting dups and in optimizing the process_update2 method and its callees to make it faster. With those last two things I guess we can improve the performance a lot.
> Also, creating a Python C binding for this module could help, but that seems like a lot of work, and we would still need to convert to/from C structures to Python objects. However, inserting or identifying dups on large sets would be way faster. > I'm going to try out Ben's suggestions for optimizing the process_update* methods, and will also try to dig further into the dups. As process_update* seems a bit expensive, it looks to me that calling it 26 times for a single port is a lot. > 26 calls = 2*(1 (LS insert) + 1 (AS modify) + 1 (LSP modify) + 8 (ACL insert) + 1 (LSP modify) + 1 (PB insert)) > 26 calls = 2*(1 (LS insert) + 1 (AS modify) + 1 (LS modify) + 8 (ACL insert) + 1 (LSP modify) + 1 (PB insert)) > Thoughts? > Thanks! > Daniel > On Thu, Feb 15, 2018 at 10:56 PM, Daniel Alvarez Sanchez < dalva...@redhat.com> wrote: >> On Wed, Feb 14, 2018 at 9:34 PM, Han Zhou wrote: >>> On Wed, Feb 14, 2018 at 9:45 AM, Ben Pfaff wrote: >>> > On Wed, Feb 14, 2018 at 11:27:11AM +0100, Daniel Alvarez Sanchez wrote: >>> > > Thanks for your inputs. I need to look more carefully into the patch you submitted, but it looks like, at least, we'll be reducing the number of calls to Datum.__cmp__, which should be good. >>> > Thanks. Please do take a look. It's a micro-optimization but maybe it'll help? >>> > > I probably didn't explain it very well. Right now we have N processes for the Neutron server (in every node). Each of those opens a connection to the NB db and subscribes to updates from certain tables. Each time a change happens, ovsdb-server will send N update2 messages that have to be processed in this "expensive" way by each of those N processes. My proposal (yet to be refined) would be to open N+1 connections to ovsdb-server and only subscribe to notifications from 1 of those.
So every time a new change happens, ovsdb-server will send 1 update2 message. This message will be processed (using the Py IDL as we do now) and, once processed, sent (mcast maybe?) to the remaining N processes. This msg could be simply a serialized Python object, and we'd be saving all this Datum, Atom, etc. processing by doing it just once. >>> Daniel, I understand that sending the update2 messages would consume NB ovsdb-server CPU and processing those updates would consume Neutron server process CPU. However, are we sure it is the bottleneck for port creation? >>> From the ovsdb-server point of view, sending updates to tens of clients should not be the bottleneck, considering that we have a lot more clients on HVs for the SB ovsdb-server. >>> From the clients' point of view, I think it is more of a memory overhead than CPU, and it also depends on how many Neutron processes are running on the same node. I didn't find Neutron process CPU in your charts. I am hesitant to make such a big change before we are clear about the bottleneck.
I've found out more about what is running slow in this scenario. I've profiled the processing of the update2 messages, and here you can see the sequence of calls to __process_update2 (idl.py) when I'm creating a new port via OpenStack on a system loaded with 800 ports on the same Logical Switch: 1. Logical_Switch_Port 'insert' 2. Address_Set 'modify' (to add the new address; takes ~0.2 seconds) 3. Logical_Switch 'modify' (to add the new 8 ACLs) <- This takes > 2 seconds 4. ACL 'insert' x8 (one per ACL) 5. Logical_Switch_Port 'modify' ('up' = False) 6. Logical_Switch_Port 'insert' (exactly the same as 1, so it'll be skipped) 7. Address_Set 'modify' (exactly the same as 2, so it'll be skipped) still takes ~0.01-0.05 s 8. Logical_Switch 'modify' (to add the new 8 ACLs, same as 3) still takes ~0.5 seconds 9. ACL 'insert' x8 (one per ACL, same as 4) 10. Logical_Switch_Port 'modify' ('up' = False) same as 5 11. Port_Binding (SB) 'insert' 12. Port_Binding (SB) 'insert' (same as 11) Half of those are dups, and even though they are no-ops, they consume time. The most expensive operation is adding the new 8 ACLs to the acls set in the LS table (800 ports with 8 ACLs each makes that set have 6400 elements). NOTE: As you can see, we're trying to insert the address again into Address_Set, so we should bear this in mind if we go ahead with Lucas' suggestion about allowing dups here. It's obvious that we'll gain a lot by using Port Sets for ACLs, but we'll also need to invest some time in finding out why we're getting dups and in optimizing the process_update2 method and its callees to make it faster. With those last two things I guess we can improve the performance a lot. Also, creating a Python C binding for this module could help, but that seems like a lot of work, and we would still need to convert to/from C structures to Python objects. However, inserting or identifying dups on large sets would be way faster.
I'm going to try out Ben's suggestions for optimizing the process_update* methods, and will also try to dig further into the dups. As process_update* seems a bit expensive, it looks to me that calling it 26 times for a single port is a lot. 26 calls = 2*(1 (LS insert) + 1 (AS modify) + 1 (LS modify) + 8 (ACL insert) + 1 (LSP modify) + 1 (PB insert)) Thoughts? Thanks! Daniel On Thu, Feb 15, 2018 at 10:56 PM, Daniel Alvarez Sanchez < dalva...@redhat.com> wrote: > On Wed, Feb 14, 2018 at 9:34 PM, Han Zhou wrote: >> On Wed, Feb 14, 2018 at 9:45 AM, Ben Pfaff wrote: >> > On Wed, Feb 14, 2018 at 11:27:11AM +0100, Daniel Alvarez Sanchez wrote: >> > > Thanks for your inputs. I need to look more carefully into the patch you submitted, but it looks like, at least, we'll be reducing the number of calls to Datum.__cmp__, which should be good. >> > Thanks. Please do take a look. It's a micro-optimization but maybe it'll help? >> > > I probably didn't explain it very well. Right now we have N processes for the Neutron server (in every node). Each of those opens a connection to the NB db and subscribes to updates from certain tables. Each time a change happens, ovsdb-server will send N update2 messages that have to be processed in this "expensive" way by each of those N processes. My proposal (yet to be refined) would be to open N+1 connections to ovsdb-server and only subscribe to notifications from 1 of those. So every time a new change happens, ovsdb-server will send 1 update2 message. This message will be processed (using the Py IDL as we do now) and, once processed, sent (mcast maybe?) to the remaining N processes. This msg could be simply a serialized Python object, and we'd be saving all this Datum, Atom, etc. processing by doing it just once.
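The call accounting above can be sanity-checked with a few lines of Python: each DB event shows up twice (the dups noted in the profile, hence the factor of 2).

```python
# Per-event IDL call counts taken from the breakdown in the message above.
per_event_calls = {
    "LS insert": 1,
    "AS modify": 1,
    "LS modify (acls)": 1,
    "ACL insert": 8,
    "LSP modify (up)": 1,
    "PB insert": 1,
}

# Every event is processed twice (original + duplicate notification).
total_calls = 2 * sum(per_event_calls.values())
assert total_calls == 26
```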
>> Daniel, I understand that sending the update2 messages would consume NB ovsdb-server CPU and processing those updates would consume Neutron server process CPU. However, are we sure it is the bottleneck for port creation? >> From the ovsdb-server point of view, sending updates to tens of clients should not be the bottleneck, considering that we have a lot more clients on HVs for the SB ovsdb-server. >> From the clients' point of view, I think it is more of a memory overhead than CPU, and it also depends on how many Neutron processes are running on the same node. I didn't find Neutron process CPU in your charts. I am hesitant to make such a big change before we are clear about the bottleneck. The chart of port creation time is very nice, but do we know which part of the code contributed to the linear growth? Do we have profiling for the time spent in ovn_client.add_acls()? > Here we are [0]. We see some spikes which get larger as the number of ports increases, but it looks like the actual bottleneck is going to be when we're actually committing the transaction [1]. I'll dig further though. > [0] https://imgur.com/a/TmwbC > [1] https://github.com/openvswitch/ovs/blob/master/python/ovs/db/idl.py#L1158
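The kind of per-update profiling discussed above can be reproduced generically with cProfile. This is a toy sketch, not the real ovs.db.idl: process_update2 here is a stand-in for Idl.__process_update2, and the 6400-element list mirrors the acls set mentioned in the thread.

```python
import cProfile
import io
import pstats

def process_update2(update):
    # Simulate touching every element of a large column value,
    # roughly what inserting into a big 'acls' set costs.
    return {table: sorted(rows) for table, rows in update.items()}

profiler = cProfile.Profile()
profiler.enable()
for _ in range(200):
    process_update2({"acls": list(range(6400))})
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()
assert "process_update2" in report
```

Wrapping the real `Idl.run()` loop the same way would show which callees dominate, without modifying the library.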
On Thu, Feb 15, 2018 at 01:38:04PM -0800, Han Zhou wrote: > On Thu, Feb 15, 2018 at 12:56 PM, Ben Pfaff wrote: > > On Thu, Feb 15, 2018 at 06:50:15PM +, Lucas Alvares Gomes wrote: > > > Hi all, > > > We currently have a problem with Address_Set in networking-ovn (and potentially other technologies using OVN as a backend) which *maybe* could be fixed together with this idea of a new "port set" (a.k.a. macro set). > > > The problem is a bit tricky, but it shows up as a race condition between creating and deleting ports in Neutron. That is, deleting a port in Neutron will result in the IP address of that port being removed from an address set, but at the same time, if another request to create a port taking over the same IP address of the port being deleted is issued, depending on which order things are committed, it could result in the address set not containing the new port's IP address. > > > Here's a bug ticket describing the problem in more detail: https://bugs.launchpad.net/networking-ovn/+bug/1611852 > > > The reason why it happens is that we don't have a direct relationship between the addresses in an address set and the ports that those addresses belong to. So, if we create this relation, we might end up having both ports and addresses present in the Address_Set table, which can then be used to fix both of our problems. > > It seems very odd to me that Neutron can commit things out of order. If the OVSDB schema for Address_Set were slightly different, this would even be a constraint violation that would cause the create operation to fail if it were executed before the delete operation. I wouldn't be surprised if other operations could be reordered in a way that would cause such a failure. I'll be disappointed if we solve this specific problem and then multiple other examples of the same general problem appear.
> > Here is the best idea that has come to my mind so far: use key-value pairs in the "external-ids" column to indicate which port owns which address. At the time of port insertion, Neutron adds the key; at the time of port deletion, it removes an address only if the deleted port owned it. (This is doable in an atomic fashion with the existing OVSDB protocol, or at least it is possible to abort the transaction if the deleted port does not own it.) > The external-ids approach would require going through the whole external-ids list to see if there are any other owners for the same address when we are trying to delete one, which may be inefficient. Also, it seems to introduce too much redundant data, since we are repeating the addresses in external-ids. I don't think that this argument about inefficiency is likely to be true; it's simply an in-memory search. I find the latter argument more persuasive, since redundancy often indicates weakness in a design. > > > Here are some ideas: > > > # 1. A new map column > > > We could add a "map" type column called "ports" in the Address_Set table that would look like this in the database: > > > "Address_Set": { "ports": {"port-1-name": {"ipv4_addresses": [...], "ipv6_addresses": [...]}} ... } > > This particular solution seems to me like it solves a very specific problem. I'd rather solve a more general problem if we can. > > > # 2: Add a new way to populate the address set: > > > Instead of directly populating the addresses in an address set, where the port relationship isn't clear, we could add two lists of port references (one for each IP version) and have the addresses automatically populated.
> > > For example: > > > "Address_Set": { "columns": { "ipv4_ports": {"type": {"key": {"type": "uuid", "refTable": "Logical_Switch_Port", "refType": "weak"}, "min": 0, "max": "unlimited"}}, "ipv6_ports": {"type": {"key": {"type": "uuid", "refTable": "Logical_Switch_Port", "refType": "weak"}, "min": 0, "max": "unlimited"}} ... } > > > The problem here is that we would pull all addresses from those ports into the address set. > > > The good part is that since it's a weak reference, deleting a port would automatically remove it from the address set. > > This is creative and I appreciate it. It also seems very specific to this particular problem. > #1 and #2 make the assumption that all addresses in a set belong to an lport, which is not always true (it may be true in current Neutron, though). We should not prevent users from using an address set without creating an lport for the addresses.
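The `"refType": "weak"` behaviour in the schema sketch above has a loose Python analogy (this is illustrative Python, not OVSDB itself): when the referenced object goes away, the set entry disappears automatically.

```python
import gc
import weakref

# Hypothetical stand-in for a Logical_Switch_Port row.
class LogicalSwitchPort:
    def __init__(self, name):
        self.name = name

ipv4_ports = weakref.WeakSet()   # plays the role of the weak-ref column
lsp = LogicalSwitchPort("lsp-1")
ipv4_ports.add(lsp)
assert len(ipv4_ports) == 1

del lsp          # "deleting the lport"...
gc.collect()     # ...and its entry vanishes from the set on its own
assert len(ipv4_ports) == 0
```

This is exactly the "good part" Lucas describes: no explicit cleanup transaction is needed on port deletion.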
On Wed, Feb 14, 2018 at 9:34 PM, Han Zhou wrote: > On Wed, Feb 14, 2018 at 9:45 AM, Ben Pfaff wrote: > > On Wed, Feb 14, 2018 at 11:27:11AM +0100, Daniel Alvarez Sanchez wrote: > > > Thanks for your inputs. I need to look more carefully into the patch you submitted, but it looks like, at least, we'll be reducing the number of calls to Datum.__cmp__, which should be good. > > Thanks. Please do take a look. It's a micro-optimization but maybe it'll help? > > > I probably didn't explain it very well. Right now we have N processes for the Neutron server (in every node). Each of those opens a connection to the NB db and subscribes to updates from certain tables. Each time a change happens, ovsdb-server will send N update2 messages that have to be processed in this "expensive" way by each of those N processes. My proposal (yet to be refined) would be to open N+1 connections to ovsdb-server and only subscribe to notifications from 1 of those. So every time a new change happens, ovsdb-server will send 1 update2 message. This message will be processed (using the Py IDL as we do now) and, once processed, sent (mcast maybe?) to the remaining N processes. This msg could be simply a serialized Python object, and we'd be saving all this Datum, Atom, etc. processing by doing it just once. > Daniel, I understand that sending the update2 messages would consume NB ovsdb-server CPU and processing those updates would consume Neutron server process CPU. However, are we sure it is the bottleneck for port creation? > From the ovsdb-server point of view, sending updates to tens of clients should not be the bottleneck, considering that we have a lot more clients on HVs for the SB ovsdb-server. > From the clients' point of view, I think it is more of a memory overhead than CPU, and it also depends on how many Neutron processes are running on the same node.
I didn't find Neutron process CPU in your charts. I am hesitant to make such a big change before we are clear about the bottleneck. The chart of port creation time is very nice, but do we know which part of the code contributed to the linear growth? Do we have profiling for the time spent in ovn_client.add_acls()? > Here we are [0]. We see some spikes which get larger as the number of ports increases, but it looks like the actual bottleneck is going to be when we're actually committing the transaction [1]. I'll dig further though. [0] https://imgur.com/a/TmwbC [1] https://github.com/openvswitch/ovs/blob/master/python/ovs/db/idl.py#L1158 > > OK. It's an optimization that does the work in one place rather than N places, so definitely a win from a CPU-cost point of view, but it trades performance for increased complexity. It sounds like performance is really important, so maybe the increased complexity is a fair trade. > > We might also be able to improve performance by using native code for some of the work. Were these tests done with the native-code JSON parser that comes with OVS? It is dramatically faster than the Python code. > > > On Tue, Feb 13, 2018 at 8:32 PM, Ben Pfaff wrote: > > > > Can you sketch the rows that are being inserted or modified when a port is added? I would expect something like this as a minimum: > > > > * Insert one Logical_Switch_Port row. > > > > * Add a pointer to the Logical_Switch_Port to the ports column in one row in Logical_Switch. > > > > In addition, it sounds like currently we're seeing: > > > > * Add one ACL row per security group rule. > > > > * Add pointers to the ACL rows to the acls column in one row in Logical_Switch.
> > > > This is what happens when we create a port in OpenStack (without binding it) which belongs to an SG which allows ICMP and SSH traffic and drops the rest [0]. > > > Basically, you were right, and the only thing missing was adding the new address to the Address_Set table. > > OK. > > It sounds like the real scaling problem here is that for R security group rules and P ports, we have R*P rows in the ACL table. Is that correct? Should we aim to solve that problem? > I think this might be the most valuable point to optimize for the create_port scenario from Neutron. I remember there was a patch for ACL groups in OVN, so that instead of R*P rows we would have only R + P rows, but I didn't see it go through. Is this also a good use case for conjunction? > > ___ > discuss mailing list > disc...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
On Thu, Feb 15, 2018 at 12:56 PM, Ben Pfaff wrote: > On Thu, Feb 15, 2018 at 06:50:15PM +, Lucas Alvares Gomes wrote: > > Hi all, > > We currently have a problem with Address_Set in networking-ovn (and potentially other technologies using OVN as a backend) which *maybe* could be fixed together with this idea of a new "port set" (a.k.a. macro set). > > The problem is a bit tricky, but it shows up as a race condition between creating and deleting ports in Neutron. That is, deleting a port in Neutron will result in the IP address of that port being removed from an address set, but at the same time, if another request to create a port taking over the same IP address of the port being deleted is issued, depending on which order things are committed, it could result in the address set not containing the new port's IP address. > > Here's a bug ticket describing the problem in more detail: https://bugs.launchpad.net/networking-ovn/+bug/1611852 > > The reason why it happens is that we don't have a direct relationship between the addresses in an address set and the ports that those addresses belong to. So, if we create this relation, we might end up having both ports and addresses present in the Address_Set table, which can then be used to fix both of our problems. > It seems very odd to me that Neutron can commit things out of order. If the OVSDB schema for Address_Set were slightly different, this would even be a constraint violation that would cause the create operation to fail if it were executed before the delete operation. I wouldn't be surprised if other operations could be reordered in a way that would cause such a failure. I'll be disappointed if we solve this specific problem and then multiple other examples of the same general problem appear.
> Here is the best idea that has come to my mind so far: use key-value pairs in the "external-ids" column to indicate which port owns which address. At the time of port insertion, Neutron adds the key; at the time of port deletion, it removes an address only if the deleted port owned it. (This is doable in an atomic fashion with the existing OVSDB protocol, or at least it is possible to abort the transaction if the deleted port does not own it.) The external-ids approach would require going through the whole external-ids list to see if there are any other owners for the same address when we are trying to delete one, which may be inefficient. Also, it seems to introduce too much redundant data, since we are repeating the addresses in external-ids. > > Here are some ideas: > > # 1. A new map column > > We could add a "map" type column called "ports" in the Address_Set table that would look like this in the database: > > "Address_Set": { "ports": {"port-1-name": {"ipv4_addresses": [...], "ipv6_addresses": [...]}} ... } > This particular solution seems to me like it solves a very specific problem. I'd rather solve a more general problem if we can. > > # 2: Add a new way to populate the address set: > > Instead of directly populating the addresses in an address set, where the port relationship isn't clear, we could add two lists of port references (one for each IP version) and have the addresses automatically populated. > > For example: > > "Address_Set": { "columns": { "ipv4_ports": {"type": {"key": {"type": "uuid", "refTable": "Logical_Switch_Port", "refType": "weak"}, "min": 0, "max": "unlimited"}}, "ipv6_ports": {"type": {"key": {"type": "uuid", "refTable": "Logical_Switch_Port", "refType": "weak"}, "min": 0, "max": "unlimited"}} ... } > > The problem here is that we would pull all addresses from those ports into the address set.
> > The good part is that since it's a weak reference, deleting a port would automatically remove it from the address set. > This is creative and I appreciate it. It also seems very specific to this particular problem. #1 and #2 make the assumption that all addresses in a set belong to an lport, which is not always true (it may be true in current Neutron, though). We should not prevent users from using an address set without creating an lport for the addresses. > > # 3: Allow duplicated addresses in the list > > If the above options sound too complicated, maybe we could keep the idea of this email of creating a "Macro_Set" that could be used for both port and address types [0]. But, when the type is set to "address", we could allow duplicated items in the list of elements; that way we won't have a problem if one transaction removes one duplicated element in the list. > > [0] https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html
On Thu, Feb 15, 2018 at 06:50:15PM +, Lucas Alvares Gomes wrote: > Hi all, > We currently have a problem with Address_Set in networking-ovn (and potentially other technologies using OVN as a backend) which *maybe* could be fixed together with this idea of a new "port set" (a.k.a. macro set). > The problem is a bit tricky, but it shows up as a race condition between creating and deleting ports in Neutron. That is, deleting a port in Neutron will result in the IP address of that port being removed from an address set, but at the same time, if another request to create a port taking over the same IP address of the port being deleted is issued, depending on which order things are committed, it could result in the address set not containing the new port's IP address. > Here's a bug ticket describing the problem in more detail: https://bugs.launchpad.net/networking-ovn/+bug/1611852 > The reason why it happens is that we don't have a direct relationship between the addresses in an address set and the ports that those addresses belong to. So, if we create this relation, we might end up having both ports and addresses present in the Address_Set table, which can then be used to fix both of our problems. It seems very odd to me that Neutron can commit things out of order. If the OVSDB schema for Address_Set were slightly different, this would even be a constraint violation that would cause the create operation to fail if it were executed before the delete operation. I wouldn't be surprised if other operations could be reordered in a way that would cause such a failure. I'll be disappointed if we solve this specific problem and then multiple other examples of the same general problem appear. Here is the best idea that has come to my mind so far: use key-value pairs in the "external-ids" column to indicate which port owns which address.
At the time of port insertion, Neutron adds the key; at the time of port deletion, it removes an address only if the deleted port owned it. (This is doable in an atomic fashion with the existing OVSDB protocol, or at least it is possible to abort the transaction if the deleted port does not own it.) > Here are some ideas: > > # 1. A new map column > > We could add a "map" type column called "ports" in the Address_Set table that would look like this in the database: > > "Address_Set": { "ports": {"port-1-name": {"ipv4_addresses": [...], "ipv6_addresses": [...]}} ... } This particular solution seems to me like it solves a very specific problem. I'd rather solve a more general problem if we can. > # 2: Add a new way to populate the address set: > > Instead of directly populating the addresses in an address set, where the port relationship isn't clear, we could add two lists of port references (one for each IP version) and have the addresses automatically populated. > > For example: > > "Address_Set": { "columns": { "ipv4_ports": {"type": {"key": {"type": "uuid", "refTable": "Logical_Switch_Port", "refType": "weak"}, "min": 0, "max": "unlimited"}}, "ipv6_ports": {"type": {"key": {"type": "uuid", "refTable": "Logical_Switch_Port", "refType": "weak"}, "min": 0, "max": "unlimited"}} ... } > > The problem here is that we would pull all addresses from those ports into the address set. > > The good part is that since it's a weak reference, deleting a port would automatically remove it from the address set. This is creative and I appreciate it. It also seems very specific to this particular problem.
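The external-ids ownership idea described above can be simulated with plain dicts (a toy model, not real OVSDB transactions): an address is removed from the set only when the port being deleted is its recorded owner, so a racing create that re-claims the same IP is never clobbered.

```python
# Toy in-memory model of one Address_Set row with an ownership map.
address_set = {"addresses": set(), "external_ids": {}}

def add_port(port, addr):
    address_set["addresses"].add(addr)
    address_set["external_ids"][addr] = port   # record the new owner

def delete_port(port, addr):
    # Only remove the address if this port still owns it; in real OVSDB
    # this check-and-delete would be one atomic transaction.
    if address_set["external_ids"].get(addr) == port:
        address_set["addresses"].discard(addr)
        del address_set["external_ids"][addr]
    # else: another port owns the address now -> leave it alone

add_port("lsp-old", "10.0.0.9")
add_port("lsp-new", "10.0.0.9")     # create overtakes the same IP...
delete_port("lsp-old", "10.0.0.9")  # ...so the late delete is a no-op
assert "10.0.0.9" in address_set["addresses"]
```

Reversing the commit order gives the same final state, which is exactly the race from the Launchpad bug above.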
But, when the type is set to "address", we could allow duplicated items in the list of elements; that way we won't have a problem if one transaction removes one duplicated element in the list. > [0] https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html I am closer to being comfortable with this solution. The specific solution cannot work because OVSDB doesn't support duplicates in its "set" types. We could extend the current Address_Set to resemble this solution by removing the constraint that the address set's name be unique and making ovn-northd and ovn-controller merge addresses from rows with duplicate names. Neutron could use the "external-ids" column to track which port owns which row.
Hi all, We currently have a problem with Address_Set in networking-ovn (and potentially other technologies using OVN as a backend) which *maybe* could be fixed together with this idea of a new "port set" (a.k.a. macro set). The problem is a bit tricky, but it shows up as a race condition between creating and deleting ports in Neutron. That is, deleting a port in Neutron will result in the IP address of that port being removed from an address set, but at the same time, if another request to create a port taking over the same IP address of the port being deleted is issued, depending on which order things are committed, it could result in the address set not containing the new port's IP address. Here's a bug ticket describing the problem in more detail: https://bugs.launchpad.net/networking-ovn/+bug/1611852 The reason why it happens is that we don't have a direct relationship between the addresses in an address set and the ports that those addresses belong to. So, if we create this relation, we might end up having both ports and addresses present in the Address_Set table, which can then be used to fix both of our problems. Here are some ideas: # 1. A new map column We could add a "map" type column called "ports" in the Address_Set table that would look like this in the database: "Address_Set": { "ports": {"port-1-name": {"ipv4_addresses": [...], "ipv6_addresses": [...]}} ... } # 2: Add a new way to populate the address set: Instead of directly populating the addresses in an address set, where the port relationship isn't clear, we could add two lists of port references (one for each IP version) and have the addresses automatically populated. For example: "Address_Set": { "columns": { "ipv4_ports": {"type": {"key": {"type": "uuid", "refTable": "Logical_Switch_Port", "refType": "weak"}, "min": 0, "max": "unlimited"}}, "ipv6_ports": {"type": {"key": {"type": "uuid", "refTable": "Logical_Switch_Port", "refType": "weak"}, "min": 0, "max": "unlimited"}} ...
} The problem here is that we would pull all addresses from those ports into the address set. The good part is that since it's a weak reference, deleting a port would automatically remove it from the address set. # 3: Allow duplicated addresses in the list If the above options sound too complicated, maybe we could keep the idea of this email of creating a "Macro_Set" that could be used for both port and address types [0]. But, when the type is set to "address", we could allow duplicated items in the list of elements; that way we won't have a problem if one transaction removes one duplicated element in the list. [0] https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html Any more ideas? Cheers, Lucas On Thu, Feb 15, 2018 at 12:04 AM, Ben Pfaff wrote: > On Thu, Feb 15, 2018 at 12:36:35AM +0100, Daniel Alvarez Sanchez wrote: >> If we had the Port_Set, we could simply write the match part as >> "outport == $security_group1 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 22" >> and reduce the number of ACLs to 1 per security group rule instead of 1 per security group rule per port as it is right now. As you can see, we're referencing the relevant security group rule in the CMS through the neutron:security_group_rule_id key in the external_ids column, so we would reduce all ACLs which correspond to the same SG rule to just 1. > OK, that matches up with what Han says. Han is going to rebase the "port set" patches, so I think we'll have a solution for this soon after that.
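Option #3 above amounts to treating the address list as a multiset, which can be sketched with `collections.Counter` (a Python analogy, not the OVSDB "set" type, which cannot hold duplicates): a delete racing with a create that re-adds the same IP removes one copy and leaves the other, instead of losing the address entirely.

```python
from collections import Counter

addresses = Counter()                # multiset of addresses in the set
addresses["10.0.0.9"] += 1           # port A contributes the address
addresses["10.0.0.9"] += 1           # port B re-adds it (dup allowed)
addresses["10.0.0.9"] -= 1           # deleting port A removes one copy
assert addresses["10.0.0.9"] == 1    # port B's copy survives
```

Ben's suggestion below (duplicate-named Address_Set rows merged by ovn-northd/ovn-controller) achieves the same refcount-like effect within OVSDB's actual constraints.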
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Thu, Feb 15, 2018 at 12:36:35AM +0100, Daniel Alvarez Sanchez wrote: > If we would have the Port_Set we could simply write the match part as > "outport == $security_group1 && ip4 && ip4.src == 0.0.0.0/0 && tcp && > tcp.dst == 22" > and reduce the number of ACLs to 1 per security group rule instead of 1 per > security > group rule per port as it is right now. As you can see, we're referencing > the relevant security group rule in the CMS through the > neutron:security_group_rule_id > key in the external_ids column so we would reduce all ACLs which correspond > to > the same SG rule to just 1. OK, that matches up with what Han says. Han is going to rebase the "port set" patches, so I think we'll have a solution for this soon after that. ___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 03:45:21PM -0800, Han Zhou wrote: > On Wed, Feb 14, 2018 at 3:39 PM, Ben Pfaff wrote: > > > > On Wed, Feb 14, 2018 at 03:29:34PM -0800, Han Zhou wrote: > > > On Wed, Feb 14, 2018 at 3:08 PM, Ben Pfaff wrote: > > > > > > > > On Wed, Feb 14, 2018 at 02:25:56PM -0800, Han Zhou wrote: > > > > > On Wed, Feb 14, 2018 at 1:40 PM, Ben Pfaff wrote: > > > > > > > > > > > > On Wed, Feb 14, 2018 at 12:34:19PM -0800, Han Zhou wrote: > > > > > > > I remember there was a patch for ACL group in OVN, so that > instead > > > of > > > > > R*P > > > > > > > rows we will have only R + P rows, but didn't see it went > through. > > > > > > > > > > > > I don't remember that. Any chance you could point me to it? > > > > > > > > > > Yes, I found it: > > > > > > > > > > > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html > > > > > > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/321165.html > > > > > > > > > > And I made a mistake in my previous text. It is about port group, > which > > > is > > > > > what we need here, rather than ACL group. > > > > > > > > I guess what I'd like to see is an example of the problem that we're > > > > trying to solve here: what does a typical ACL row for a security group > > > > look like, and what parts of the row differ between its instance for > one > > > > port and another port? > > > > > > An ACL for a Neutron SG rule: ingress tcp dport=22, is something like: > > > to-lport 1000 "outport==\"\" && ip4 && tcp && > > > tcp.dst==22" allow-related > > > > > > All ports bound to the same SG will have an ACL like this, and the only > > > difference between one port and another is the part. > > > > Well, then it's really easy to merge all of them if we accept Zong's > > patches above or something similar. I had no idea! > > > > Is anyone willing to rebase those patches against current master? 
I had > > some feedback on them that wasn't ever addressed (and I'd probably have > > a little more), which is the only reason that they weren't committed as > > far as I can tell. > > Cool. I'd be happy to rebase them, probably next week. Thanks. I think that we shouldn't change the name of the table, by the way. Address_Set is somewhat of a misnomer once we allow sets of ports, but it's good enough and changing table names is a pain. ___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 3:39 PM, Ben Pfaff wrote: > > On Wed, Feb 14, 2018 at 03:29:34PM -0800, Han Zhou wrote: > > On Wed, Feb 14, 2018 at 3:08 PM, Ben Pfaff wrote: > > > > > > On Wed, Feb 14, 2018 at 02:25:56PM -0800, Han Zhou wrote: > > > > On Wed, Feb 14, 2018 at 1:40 PM, Ben Pfaff wrote: > > > > > > > > > > On Wed, Feb 14, 2018 at 12:34:19PM -0800, Han Zhou wrote: > > > > > > I remember there was a patch for ACL group in OVN, so that instead > > of > > > > R*P > > > > > > rows we will have only R + P rows, but didn't see it went through. > > > > > > > > > > I don't remember that. Any chance you could point me to it? > > > > > > > > Yes, I found it: > > > > > > > > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html > > > > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/321165.html > > > > > > > > And I made a mistake in my previous text. It is about port group, which > > is > > > > what we need here, rather than ACL group. > > > > > > I guess what I'd like to see is an example of the problem that we're > > > trying to solve here: what does a typical ACL row for a security group > > > look like, and what parts of the row differ between its instance for one > > > port and another port? > > > > An ACL for a Neutron SG rule: ingress tcp dport=22, is something like: > > to-lport 1000 "outport==\"\" && ip4 && tcp && > > tcp.dst==22" allow-related > > > > All ports bound to the same SG will have an ACL like this, and the only > > difference between one port and another is the part. > > Well, then it's really easy to merge all of them if we accept Zong's > patches above or something similar. I had no idea! > > Is anyone willing to rebase those patches against current master? I had > some feedback on them that wasn't ever addressed (and I'd probably have > a little more), which is the only reason that they weren't committed as > far as I can tell. Cool. I'd be happy to rebase them, probably next week. 
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 03:29:34PM -0800, Han Zhou wrote: > On Wed, Feb 14, 2018 at 3:08 PM, Ben Pfaff wrote: > > > > On Wed, Feb 14, 2018 at 02:25:56PM -0800, Han Zhou wrote: > > > On Wed, Feb 14, 2018 at 1:40 PM, Ben Pfaff wrote: > > > > > > > > On Wed, Feb 14, 2018 at 12:34:19PM -0800, Han Zhou wrote: > > > > > I remember there was a patch for ACL group in OVN, so that instead > of > > > R*P > > > > > rows we will have only R + P rows, but didn't see it went through. > > > > > > > > I don't remember that. Any chance you could point me to it? > > > > > > Yes, I found it: > > > > > > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html > > > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/321165.html > > > > > > And I made a mistake in my previous text. It is about port group, which > is > > > what we need here, rather than ACL group. > > > > I guess what I'd like to see is an example of the problem that we're > > trying to solve here: what does a typical ACL row for a security group > > look like, and what parts of the row differ between its instance for one > > port and another port? > > An ACL for a Neutron SG rule: ingress tcp dport=22, is something like: > to-lport 1000 "outport==\"\" && ip4 && tcp && > tcp.dst==22" allow-related > > All ports bound to the same SG will have an ACL like this, and the only > difference between one port and another is the part. Well, then it's really easy to merge all of them if we accept Zong's patches above or something similar. I had no idea! Is anyone willing to rebase those patches against current master? I had some feedback on them that wasn't ever addressed (and I'd probably have a little more), which is the only reason that they weren't committed as far as I can tell. ___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
Ok let me paste some example but feel free to ask for any further details.

1 Logical Switch with 5 ports and 8 ACLs per port:

# ovn-nbctl show
switch c1fac5d4-b682-4078-9282-61cfa6383893 (neutron-d35e99a5-d9e9-4bc5-9ad4-08e0941f1820) (aka test_net)
    port 8a8be79b-7a24-4a19-b952-c68d839e0164 (aka port2)
        addresses: ["fa:16:3e:eb:c1:fe 10.1.0.12"]
    port 181e3f51-9d16-43fe-a96c-195bf93f2cad
        type: localport
        addresses: ["fa:16:3e:46:bd:58 10.1.0.2"]
    port 340ca12f-8a4c-4441-87f6-885fb81f8964 (aka port5)
        addresses: ["fa:16:3e:60:a7:2e 10.1.0.11"]
    port 8883e188-ed97-4f66-954f-b61a7784bf58 (aka port1)
        addresses: ["fa:16:3e:7a:e1:90 10.1.0.4"]
    port 1031f2d3-b1ea-473b-b033-92c15c4df005 (aka port4)
        addresses: ["fa:16:3e:2a:fa:4a 10.1.0.15"]
    port a75b6392-bb94-435a-abff-42edcb12fcf2 (aka port3)
        addresses: ["fa:16:3e:fa:c4:0d 10.1.0.10"]

* Logical Switch:

_uuid         : c1fac5d4-b682-4078-9282-61cfa6383893
acls          : [02070cc2-4910-407e-b396-81e603e1aa41, 02bdeb85-3030-43cf-9516-6608c3c490c0, 0952ab80-31b1-4fd1-975f-97ae6a9737a4, 0b47a395-bcb8-4d4d-abb5-92b245d34584, 1b2e5465-9ede-443d-b8ad-a6650d34d122, 224767cc-1d9b-4a92-a28f-31371c49f2b5, 2756e656-fd67-4298-8f36-37cdad8cd77b, 31f28d63-7b8f-4857-9b44-01efab624065, 3378f438-e3b0-4237-97f4-7e13c75b0853, 397b0944-ede3-4bc1-9f8b-7132fe398bbc, 3f4d056f-938c-4798-aa03-4dd1b6692ffe, 41fc3152-865b-4970-a2ba-75ce223a5b1b, 469e59ce-3d29-4534-9be4-23a77030f276, 4963ff86-b4f6-4481-9718-69ab6f98e823, 4b473d3a-d2cf-4bf6-b821-9f9a2513199e, 64078115-ca20-4e24-984f-cec39e0f952d, 6d3a4e24-396f-456a-864c-fa66babccf29, 76dc882e-b57c-4f07-bf79-38b29404bcea, 78d22025-4a8e-4693-ad3d-2e6139405114, 79474822-bde6-4764-bbe9-44eed8aa81ff, 80e48478-5b79-4ede-886b-02ba5966fa74, 84368ef1-25f4-4d28-a42c-a63d224b6b8d, 966778a8-08e9-4d94-8338-3b362b1aefc7, a1838363-c906-41db-b7de-02aee4f02c67, a5dc65f7-22a4-4aa9-9c47-0994afa1e65b, aa604f5c-d082-4a72-a1af-0963fdd57b13, aeccba8e-a81a-4596-84e2-72a5faeaadc2, b001199a-3b81-41db-b013-2a28b471f5d8,
b10116b8-ee5d-42a5-9ab6-748eca0ff594, b14cbff7-47b3-4c9d-9f94-bfa435227c17, b4544d68-5bed-4b69-8bb7-254300a72c70, c553fd14-cb47-4646-b7f2-2a47041ca68c, caa570c9-e8d8-44a9-b3b1-d5a044f54a9e, e4f88595-7130-4c21-a6d8-b7130272c797, e7ddfd73-2c41-4fc5-a6c6-87cb24d710b1, f0493516-e561-41cd-b748-df992aff3308, f2ecd261-c860-4dee-ab78-86547a72591e, f37aeecf-da41-4964-b913-6b83e92a5716, f92f1023-56ec-457d-82cb-5547b27c8403, fa8dd4e0-8bdc-4d63-84de-ebe952f95cfd]
dns_records   : []
external_ids  : {"neutron:network_name"=test_net, "neutron:qos_policy_id"=null, "neutron:revision_number"="4"}
load_balancer : []
name          : "neutron-d35e99a5-d9e9-4bc5-9ad4-08e0941f1820"
other_config  : {}
ports         : [003491b0-38cb-4aa9-a2b1-922020dee8dd, 14abb895-8696-41a0-9f4a-e02c8a488310, 2d23f0a0-38e7-4942-988b-c9a1345e3f82, 33c5d2a0-2ae9-496c-972f-1fccb37eac9d, a8693978-52f9-4876-a928-2b2232195122, f2924201-8163-4ea6-a55f-89e663ecda6d]

* ACLs for port1:

_uuid         : fa8dd4e0-8bdc-4d63-84de-ebe952f95cfd
action        : allow
direction     : from-lport
external_ids  : {"neutron:lport"="8883e188-ed97-4f66-954f-b61a7784bf58"}
log           : false
match         : "inport == \"8883e188-ed97-4f66-954f-b61a7784bf58\" && ip4 && ip4.dst == {255.255.255.255, 10.1.0.0/16} && udp && udp.src == 68 && udp.dst == 67"
name          : []
priority      : 1002
severity      : []

_uuid         : 79474822-bde6-4764-bbe9-44eed8aa81ff
action        : allow-related
direction     : from-lport
external_ids  : {"neutron:lport"="8883e188-ed97-4f66-954f-b61a7784bf58", "neutron:security_group_rule_id"="2603769a-c388-41f7-af13-bbd80991db63"}
log           : false
match         : "inport == \"8883e188-ed97-4f66-954f-b61a7784bf58\" && ip4"
name          : []
priority      : 1002
severity      : []

_uuid         : 397b0944-ede3-4bc1-9f8b-7132fe398bbc
action        : allow-related
direction     : to-lport
external_ids  : {"neutron:lport"="8883e188-ed97-4f66-954f-b61a7784bf58", "neutron:security_group_rule_id"="953242fd-bc6d-4100-bff0-c6d4e1879e28"}
log           : false
match         : "outport == \"8883e188-ed97-4f66-954f-b61a7784bf58\" && ip4 && ip4.src == 0.0.0.0/0 && icmp4"
name          : []
priority      : 1002
severity      : []

_uuid         : 4963ff86-b4f6-4481-9718-69ab6f98e823
action        : allow-related
direction     : from-lport
external_ids  : {"neutron:lport"="8883e188-ed97-4f66-954f-b61a7784bf58", "neutron:security_group_rule_id"="e4a6d2ae-4946-42e3-934d-a67856226902"}
log           : false
match         : "inport == \"8883e188-ed97-4f66-954f-b61a7784bf58\" && ip4 && ip4.dst == 0.0.0.0/0 && tcp"
name          : []
priority      : 1002
severity
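To put a number on the dump above: it shows the R*P blow-up concretely, since the switch carries one ACL row per (port, SG rule) pair. A trivial check (plain arithmetic, nothing OVN-specific):

```python
# 5 VM ports x 8 ACLs per port = 40 rows in the switch's acls column
# (matching the 40 UUIDs listed in the Logical Switch record above),
# and every new port bound to the same SG adds another 8 rows.
vm_ports = 5
acls_per_port = 8
total_acls = vm_ports * acls_per_port
print(total_acls)  # 40
```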
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 3:08 PM, Ben Pfaff wrote:
>
> On Wed, Feb 14, 2018 at 02:25:56PM -0800, Han Zhou wrote:
> > On Wed, Feb 14, 2018 at 1:40 PM, Ben Pfaff wrote:
> > >
> > > On Wed, Feb 14, 2018 at 12:34:19PM -0800, Han Zhou wrote:
> > > > I remember there was a patch for ACL group in OVN, so that instead of
> > > > R*P rows we will have only R + P rows, but didn't see it went through.
> > >
> > > I don't remember that. Any chance you could point me to it?
> >
> > Yes, I found it:
> >
> > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html
> > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/321165.html
> >
> > And I made a mistake in my previous text. It is about port group, which
> > is what we need here, rather than ACL group.
>
> I guess what I'd like to see is an example of the problem that we're
> trying to solve here: what does a typical ACL row for a security group
> look like, and what parts of the row differ between its instance for one
> port and another port?

An ACL for a Neutron SG rule: ingress tcp dport=22, is something like:

to-lport 1000 "outport==\"<lport-name>\" && ip4 && tcp && tcp.dst==22" allow-related

All ports bound to the same SG will have an ACL like this, and the only difference between one port and another is the <lport-name> part.
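The merge Ben is asking about can be sketched concretely. This is illustrative Python, and `$security_group1` is the port-set placeholder Daniel used earlier in the thread, not committed OVN syntax:

```python
# Per-port ACL matches for one SG rule differ only in the port name, so a
# port-set reference collapses R*P ACL rows into R: one row per rule, with
# the port membership kept in the (proposed) set instead of the match.
ports = ["port1", "port2", "port3"]
rule = "ip4 && tcp && tcp.dst == 22"

# Today: one ACL row per (rule, port) pair.
per_port_acls = ['outport == "%s" && %s' % (p, rule) for p in ports]

# With a port set: a single row per rule.
merged_acl = "outport == $security_group1 && %s" % rule

print(len(per_port_acls))  # 3 rows become...
print(merged_acl)          # ...one row referencing the set
```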
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 02:25:56PM -0800, Han Zhou wrote: > On Wed, Feb 14, 2018 at 1:40 PM, Ben Pfaff wrote: > > > > On Wed, Feb 14, 2018 at 12:34:19PM -0800, Han Zhou wrote: > > > I remember there was a patch for ACL group in OVN, so that instead of > R*P > > > rows we will have only R + P rows, but didn't see it went through. > > > > I don't remember that. Any chance you could point me to it? > > Yes, I found it: > > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html > https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/321165.html > > And I made a mistake in my previous text. It is about port group, which is > what we need here, rather than ACL group. I guess what I'd like to see is an example of the problem that we're trying to solve here: what does a typical ACL row for a security group look like, and what parts of the row differ between its instance for one port and another port? ___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
Thanks a lot Han and Ben for looking into this! On Wed, Feb 14, 2018 at 9:34 PM, Han Zhou wrote: > > > On Wed, Feb 14, 2018 at 9:45 AM, Ben Pfaff wrote: > > > > On Wed, Feb 14, 2018 at 11:27:11AM +0100, Daniel Alvarez Sanchez wrote: > > > Thanks for your inputs. I need to look more carefully into the patch > you > > > submitted but it looks like, at least, we'll be reducing the number of > > > calls to Datum.__cmp__ which should be good. > > > > Thanks. Please do take a look. It's a micro-optimization but maybe > > it'll help? > > > > > I probably didn't explain it very well. Right now we have N processes > > > for Neutron server (in every node). Each of those opens a connection > > > to NB db and they subscribe to updates from certain tables. Each time > > > a change happens, ovsdb-server will send N update2 messages that has > > > to be processed in this "expensive" way by each of those N > > > processes. My proposal (yet to be refined) would be to now open N+1 > > > connections to ovsdb-server and only subscribe to notifications from 1 > > > of those. So every time a new change happens, ovsdb-server will send 1 > > > update2 message. This message will be processed (using Py IDL as we do > > > now) and once processed, send it (mcast maybe?) to the rest N > > > processes. This msg could be simply a Python object serialized and > > > we'd be saving all this Datum, Atom, etc. processing by doing it just > > > once. > > > Daniel, I understand that the update2 messages sending would consume NB > ovsdb-server CPU and processing those update would consume neutron server > process CPU. However, are we sure it is the bottleneck for port creation? > No, I don't think this is the bottleneck at this stage with only 800 ports being created at that pace. As you say, I'm also not keen on doing such a big change right now but it's something to bear in mind for the future as ~50% of total processing time being spent in data.py/uuid.py is not neglectable. 
At least, it'd be nice to try Ben's suggestions and I'll do it as soon as I can. > > From ovsdb-server point of view, sending updates to tens of clients should > not be the bottleneck, considering that we have a lot more clients on HVs > for SB ovsdb-server. > Same would go for SB ovsdb-server though but yeah... I agree. > > From clients point of view, I think it is more of memory overhead than > CPU, and it also depends on how many neutron processes are running on the > same node. I didn't find neutron process CPU in your charts. I am hesitate > for such big change before we are clear about the bottleneck. The chart of > port creation time is very nice, but do we know which part of code > contributed to the linear growth? Do we have profiling for the time spent > in ovn_client.add_acls()? > This is my next step. It seems pretty clear that ACLs are the bottleneck. As a note, unsurprisingly, same goes for port deletion: $ date; for i in {1..800}; do openstack port delete port$i; done; date Wed Feb 14 17:45:07 UTC 2018 Wed Feb 14 19:36:09 UTC 2018 Same behavior shown during deletion as can be seen at [0]. [0] https://imgur.com/a/Dam10 > > OK. It's an optimization that does the work in one place rather than N > > places, so definitely a win from a CPU cost point of view, but it trades > > performance for increased complexity. It sounds like performance is > > really important so maybe the increased complexity is a fair trade. > > > > We might also be able to improve performance by using native code for > > some of the work. Were these tests done with the native code JSON > > parser that comes with OVS? It is dramatically faster than the Python > > code. > Yes, these tests were done using the native JSON parser. I saw that Python parsing had a 5% over the total execution time so recompiling with the --enabled-shared made it use the native parser. However it didn't make a big difference in the port creation times. 
In my first tests I also saw some reconnections from clients to ovsdb-server especially when the number of ports increased. (not many though, just a couple of them) during the 800 ports. I then changed the inactivity probe to 180s for both NB and SB connections but still the results were pretty much the same even though I didn't see any reconnections this time. > > > > On Tue, Feb 13, 2018 at 8:32 PM, Ben Pfaff wrote: > > > > > > > Can you sketch the rows that are being inserted or modified when a > port > > > > is added? I would expect something like this as a minimum: > > > > > > > > * Insert one Logical_Switch_Port row. > > > > > > > > * Add pointer to Logical_Switch_Port to ports column in one > row > > > > in Logical_Switch. > > > > > > > > In addition it sounds like currently we're seeing: > > > > > > > > * Add one ACL row per security group rule. > > > > > > > > * Add pointers to ACL rows to acls column in one row in > > > > Logical_Switch. > > > > > > > This is what happens when we
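Daniel's "N+1 connections" proposal from this exchange can be sketched as follows. This is a hypothetical design sketch, not existing networking-ovn code: one listener process holds the only NB connection, performs the expensive Datum/Atom parsing once, and hands the already-parsed result to every API worker instead of N workers each parsing the same update2 message.

```python
import queue

def parse_once(update2_json):
    # Stand-in for the ovs.db.idl processing the profile shows is hot
    # (Datum/Atom construction, UUID comparisons).
    return {"parsed": update2_json}

def fan_out(update2_json, worker_queues):
    parsed = parse_once(update2_json)  # done once, not N times
    for q in worker_queues:
        q.put(parsed)                  # workers receive a ready-made object

# Four stand-in API workers, each with an inbound queue.
workers = [queue.Queue() for _ in range(4)]
fan_out({"Logical_Switch_Port": {"uuid1": {"name": "port1"}}}, workers)
print([q.qsize() for q in workers])  # [1, 1, 1, 1]
```

In a real deployment the queues would be replaced by an IPC mechanism (Daniel mentions multicast), and the serialized object would need versioning; the point is only that the parse cost becomes O(1) per update rather than O(N).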
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 1:40 PM, Ben Pfaff wrote:
>
> On Wed, Feb 14, 2018 at 12:34:19PM -0800, Han Zhou wrote:
> > I remember there was a patch for ACL group in OVN, so that instead of R*P
> > rows we will have only R + P rows, but didn't see it went through.
>
> I don't remember that. Any chance you could point me to it?

Yes, I found it:

https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/077118.html
https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/321165.html

And I made a mistake in my previous text. It is about port group, which is what we need here, rather than ACL group.

> > Is this also a good use case of conjunction?
>
> Maybe, but that happens at a different level of the system.

Yes, if we have the capability to represent a port group/set in ACLs, then it is possible for ovn-controller to translate it into flows using conjunction, but maybe it is a less valuable optimization since most ports from a group are not landing on the same HV.

Thanks, Han
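The arithmetic behind Han's conjunction remark, as a quick sketch (illustrative numbers; the exact flow counts depend on how ovn-controller compiles the match):

```python
# Matching a set of N ports against M other match values as a plain
# cross-product costs N * M OpenFlow flows; an OpenFlow conjunctive match
# costs roughly one flow per dimension value (N + M) plus the single flow
# carrying the conjunction action.
n_ports, m_values = 50, 20

cross_product_flows = n_ports * m_values      # 1000
conjunction_flows = n_ports + m_values + 1    # 71

print(cross_product_flows, conjunction_flows)
```

As Han notes, the win per hypervisor is smaller than it looks, because only the ports actually bound on that HV contribute to N there.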
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 12:34:19PM -0800, Han Zhou wrote: > I remember there was a patch for ACL group in OVN, so that instead of R*P > rows we will have only R + P rows, but didn't see it went through. I don't remember that. Any chance you could point me to it? > Is this also a good use case of conjuncture? Maybe, but that happens at a different level of the system. ___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 9:45 AM, Ben Pfaff wrote: > > On Wed, Feb 14, 2018 at 11:27:11AM +0100, Daniel Alvarez Sanchez wrote: > > Thanks for your inputs. I need to look more carefully into the patch you > > submitted but it looks like, at least, we'll be reducing the number of > > calls to Datum.__cmp__ which should be good. > > Thanks. Please do take a look. It's a micro-optimization but maybe > it'll help? > > > I probably didn't explain it very well. Right now we have N processes > > for Neutron server (in every node). Each of those opens a connection > > to NB db and they subscribe to updates from certain tables. Each time > > a change happens, ovsdb-server will send N update2 messages that has > > to be processed in this "expensive" way by each of those N > > processes. My proposal (yet to be refined) would be to now open N+1 > > connections to ovsdb-server and only subscribe to notifications from 1 > > of those. So every time a new change happens, ovsdb-server will send 1 > > update2 message. This message will be processed (using Py IDL as we do > > now) and once processed, send it (mcast maybe?) to the rest N > > processes. This msg could be simply a Python object serialized and > > we'd be saving all this Datum, Atom, etc. processing by doing it just > > once. > Daniel, I understand that the update2 messages sending would consume NB ovsdb-server CPU and processing those update would consume neutron server process CPU. However, are we sure it is the bottleneck for port creation? >From ovsdb-server point of view, sending updates to tens of clients should not be the bottleneck, considering that we have a lot more clients on HVs for SB ovsdb-server. >From clients point of view, I think it is more of memory overhead than CPU, and it also depends on how many neutron processes are running on the same node. I didn't find neutron process CPU in your charts. I am hesitate for such big change before we are clear about the bottleneck. 
The chart of port creation time is very nice, but do we know which part of code contributed to the linear growth? Do we have profiling for the time spent in ovn_client.add_acls()? > OK. It's an optimization that does the work in one place rather than N > places, so definitely a win from a CPU cost point of view, but it trades > performance for increased complexity. It sounds like performance is > really important so maybe the increased complexity is a fair trade. > > We might also be able to improve performance by using native code for > some of the work. Were these tests done with the native code JSON > parser that comes with OVS? It is dramatically faster than the Python > code. > > > On Tue, Feb 13, 2018 at 8:32 PM, Ben Pfaff wrote: > > > > > Can you sketch the rows that are being inserted or modified when a port > > > is added? I would expect something like this as a minimum: > > > > > > * Insert one Logical_Switch_Port row. > > > > > > * Add pointer to Logical_Switch_Port to ports column in one row > > > in Logical_Switch. > > > > > > In addition it sounds like currently we're seeing: > > > > > > * Add one ACL row per security group rule. > > > > > > * Add pointers to ACL rows to acls column in one row in > > > Logical_Switch. > > > > > This is what happens when we create a port in OpenStack (without > > binding it) which belongs to a SG which allows ICMP and SSH traffic > > and drops the rest [0] > > > > Basically, you were right and only thing missing was adding the new > > address to the Address_Set table. > > OK. > > It sounds like the real scaling problem here is that for R security > group rules and P ports, we have R*P rows in the ACL table. Is that > correct? Should we aim to solve that problem? I think this might be the most valuable point to optimize for the create_port scenario from Neutron. I remember there was a patch for ACL group in OVN, so that instead of R*P rows we will have only R + P rows, but didn't see it went through. 
Is this also a good use case of conjuncture? > ___ > discuss mailing list > disc...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss ___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Wed, Feb 14, 2018 at 11:27:11AM +0100, Daniel Alvarez Sanchez wrote: > Thanks for your inputs. I need to look more carefully into the patch you > submitted but it looks like, at least, we'll be reducing the number of > calls to Datum.__cmp__ which should be good. Thanks. Please do take a look. It's a micro-optimization but maybe it'll help? > I probably didn't explain it very well. Right now we have N processes > for Neutron server (in every node). Each of those opens a connection > to NB db and they subscribe to updates from certain tables. Each time > a change happens, ovsdb-server will send N update2 messages that has > to be processed in this "expensive" way by each of those N > processes. My proposal (yet to be refined) would be to now open N+1 > connections to ovsdb-server and only subscribe to notifications from 1 > of those. So every time a new change happens, ovsdb-server will send 1 > update2 message. This message will be processed (using Py IDL as we do > now) and once processed, send it (mcast maybe?) to the rest N > processes. This msg could be simply a Python object serialized and > we'd be saving all this Datum, Atom, etc. processing by doing it just > once. OK. It's an optimization that does the work in one place rather than N places, so definitely a win from a CPU cost point of view, but it trades performance for increased complexity. It sounds like performance is really important so maybe the increased complexity is a fair trade. We might also be able to improve performance by using native code for some of the work. Were these tests done with the native code JSON parser that comes with OVS? It is dramatically faster than the Python code. > On Tue, Feb 13, 2018 at 8:32 PM, Ben Pfaff wrote: > > > Can you sketch the rows that are being inserted or modified when a port > > is added? I would expect something like this as a minimum: > > > > * Insert one Logical_Switch_Port row. 
> > > > * Add pointer to Logical_Switch_Port to ports column in one row > > in Logical_Switch. > > > > In addition it sounds like currently we're seeing: > > > > * Add one ACL row per security group rule. > > > > * Add pointers to ACL rows to acls column in one row in > > Logical_Switch. > > > This is what happens when we create a port in OpenStack (without > binding it) which belongs to a SG which allows ICMP and SSH traffic > and drops the rest [0] > > Basically, you were right and only thing missing was adding the new > address to the Address_Set table. OK. It sounds like the real scaling problem here is that for R security group rules and P ports, we have R*P rows in the ACL table. Is that correct? Should we aim to solve that problem? ___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
Hi Ben, Thanks for your inputs. I need to look more carefully into the patch you submitted but it looks like, at least, we'll be reducing the number of calls to Datum.__cmp__ which should be good. For the rest of the things, let me answer inline. On Tue, Feb 13, 2018 at 8:32 PM, Ben Pfaff wrote: > On Tue, Feb 13, 2018 at 12:39:56PM +0100, Daniel Alvarez Sanchez wrote: > > Hi folks, > > > > As we're doing some performance tests in OpenStack using OVN, > > we noticed that as we keep creating ports, the time for creating a > > single port increases. Also, ovn-northd CPU consumption is quite > > high (see [0] which shows the CPU consumption when creating > > 1000 ports and deleting them. Last part where CPU is at 100% is > > when all the ports get deleted). > > > > With 500 ports in the same Logical Switch, I did some profiling > > of OpenStack neutron-server adding 10 more ports to that Logical > > Switch. Currently, neutron-server spawns different API workers > > (separate processes) which open connections to OVN NB so every > > time an update message is sent from ovsdb-server it'll be processed > > by all of them. > > > > In my profiling, I used GreenletProfiler in all those processes to > produce > > a trace file and then merged all of them together to aggregate the > > results. In those tests I used OVS master branch compiled it with > > shared libraries to make use of the JSON C parser. 
Still, I've seen > > that most of the time's been spent in the following two modules: > > > > - python/ovs/db/data.py: 33% > > - uuid.py: 21% > > > > For the data.py module, this is the usage (self time): > > > > Atom.__lt__ 16.25% 8283 calls > > from_json:118 6.18% 406935 calls > > Atom.__hash__ 3.48% 1623832 calls > > from_json:328 2.01% 5040 calls > > > > While for the uuid module: > > > > UUID.__cmp__ 12.84% 3570975 calls > > UUID.__init__ 4.06% 362541 calls > > UUID.__hash__ 2.96% 1800 calls > > UUID.__str__1.03% 355016 calls > > > > Most of the calls to Atom.__lt__ come from > > BaseOvnIdl.__process_update2(idl.py) > > -> BaseOvnIdl.__row_update(idl.py) -> Datum.__cmp__(data.py) -> > > Atom.__cmp__(data.py). > > I don't know Python well enough to whether these are "real" or correct > optimizations, but the following strike me as possible low-hanging fruit > micro-optimizations: > > diff --git a/python/ovs/db/data.py b/python/ovs/db/data.py > index 9e57595f7513..dc816f64708f 100644 > --- a/python/ovs/db/data.py > +++ b/python/ovs/db/data.py > @@ -76,12 +76,12 @@ class Atom(object): > def __eq__(self, other): > if not isinstance(other, Atom) or self.type != other.type: > return NotImplemented > -return True if self.value == other.value else False > +return self.value == other.value > > def __lt__(self, other): > if not isinstance(other, Atom) or self.type != other.type: > return NotImplemented > -return True if self.value < other.value else False > +return self.value < other.value > > def __cmp__(self, other): > if not isinstance(other, Atom) or self.type != other.type: > > We could also reduce the number of calls to Datum.__cmp__ by recognizing > that if we've already found one change we don't have to do any more > comparisons, something like this: > > diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py > index 60548bcf50b6..2e550adfdf1c 100644 > --- a/python/ovs/db/idl.py > +++ b/python/ovs/db/idl.py > @@ -447,6 +447,7 @@ class Idl(object): > raise 
error.Error(" is not an object", >table_updates) > > +changed = False > for table_name, table_update in six.iteritems(table_updates): > table = self.tables.get(table_name) > if not table: > @@ -472,8 +473,8 @@ class Idl(object): >% (table_name, uuid_string)) > > if version == OVSDB_UPDATE2: > -if self.__process_update2(table, uuid, row_update): > -self.change_seqno += 1 > +changed = self.__process_update2(table, uuid, > row_update, > + changed) > continue > > parser = ovs.db.parser.Parser(row_update, "row-update") > @@ -485,12 +486,12 @@ class Idl(object): > raise error.Error(' missing "old" and ' >'"new" members', row_update) > > -if self.__process_update(table, uuid, old, new): > -self.change_seqno += 1 > +changed = self.__process_update(table, uuid, old, new, > changed) > +if changed: > +self.change_seqno += 1 > > -def __process_update2(self, table, uuid, row_update): > +def __process_update2(self, table, uuid, row_update, changed): > row = table.rows.
Re: [ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
On Tue, Feb 13, 2018 at 12:39:56PM +0100, Daniel Alvarez Sanchez wrote:
> Hi folks,
>
> As we're doing some performance tests in OpenStack using OVN, we
> noticed that as we keep creating ports, the time for creating a
> single port increases. Also, ovn-northd CPU consumption is quite high
> (see [0], which shows the CPU consumption when creating 1000 ports
> and deleting them; the last part, where CPU is at 100%, is when all
> the ports get deleted).
>
> With 500 ports in the same Logical Switch, I did some profiling of
> OpenStack neutron-server adding 10 more ports to that Logical Switch.
> Currently, neutron-server spawns different API workers (separate
> processes) which open connections to OVN NB, so every time an update
> message is sent from ovsdb-server it'll be processed by all of them.
>
> In my profiling, I used GreenletProfiler in all those processes to
> produce a trace file and then merged all of them together to
> aggregate the results. In those tests I used the OVS master branch,
> compiled with shared libraries to make use of the JSON C parser.
> Still, I've seen that most of the time's been spent in the following
> two modules:
>
> - python/ovs/db/data.py: 33%
> - uuid.py: 21%
>
> For the data.py module, this is the usage (self time):
>
> Atom.__lt__     16.25%   8283 calls
> from_json:118    6.18%   406935 calls
> Atom.__hash__    3.48%   1623832 calls
> from_json:328    2.01%   5040 calls
>
> While for the uuid module:
>
> UUID.__cmp__    12.84%   3570975 calls
> UUID.__init__    4.06%   362541 calls
> UUID.__hash__    2.96%   1800 calls
> UUID.__str__     1.03%   355016 calls
>
> Most of the calls to Atom.__lt__ come from
> BaseOvnIdl.__process_update2 (idl.py) -> BaseOvnIdl.__row_update
> (idl.py) -> Datum.__cmp__ (data.py) -> Atom.__cmp__ (data.py).
I don't know Python well enough to say whether these are "real" or
correct optimizations, but the following strike me as possible
low-hanging-fruit micro-optimizations:

diff --git a/python/ovs/db/data.py b/python/ovs/db/data.py
index 9e57595f7513..dc816f64708f 100644
--- a/python/ovs/db/data.py
+++ b/python/ovs/db/data.py
@@ -76,12 +76,12 @@ class Atom(object):
     def __eq__(self, other):
         if not isinstance(other, Atom) or self.type != other.type:
             return NotImplemented
-        return True if self.value == other.value else False
+        return self.value == other.value

     def __lt__(self, other):
         if not isinstance(other, Atom) or self.type != other.type:
             return NotImplemented
-        return True if self.value < other.value else False
+        return self.value < other.value

     def __cmp__(self, other):
         if not isinstance(other, Atom) or self.type != other.type:

We could also reduce the number of calls to Datum.__cmp__ by
recognizing that if we've already found one change we don't have to do
any more comparisons, something like this:

diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py
index 60548bcf50b6..2e550adfdf1c 100644
--- a/python/ovs/db/idl.py
+++ b/python/ovs/db/idl.py
@@ -447,6 +447,7 @@ class Idl(object):
             raise error.Error("<table-updates> is not an object",
                               table_updates)

+        changed = False
         for table_name, table_update in six.iteritems(table_updates):
             table = self.tables.get(table_name)
             if not table:
@@ -472,8 +473,8 @@ class Idl(object):
                                   % (table_name, uuid_string))

             if version == OVSDB_UPDATE2:
-                if self.__process_update2(table, uuid, row_update):
-                    self.change_seqno += 1
+                changed = self.__process_update2(table, uuid, row_update,
+                                                 changed)
                 continue

             parser = ovs.db.parser.Parser(row_update, "row-update")
@@ -485,12 +486,12 @@ class Idl(object):
                 raise error.Error('<row-update> missing "old" and '
                                   '"new" members', row_update)

-            if self.__process_update(table, uuid, old, new):
-                self.change_seqno += 1
+            changed = self.__process_update(table, uuid, old, new, changed)
+        if changed:
+            self.change_seqno += 1

-    def __process_update2(self, table, uuid, row_update):
+    def __process_update2(self, table, uuid, row_update, changed):
         row = table.rows.get(uuid)
-        changed = False
         if "delete" in row_update:
             if row:
                 del table.rows[uuid]
@@ -511,9 +512,8 @@ class Idl(object):
             else:
                 row_update = row_update['initial']
             self.__add_default(table, row_update)
-            if self.__row_update(table, row, row_update):
-                changed = True
-                self.notify(ROW_CREATE, row)
+            changed = self.__row_update(table, row, row_update, changed
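A standalone sketch (not from the thread's code) of why the first diff
above is safe: Python's rich comparison operators on ordinary values
already return a bool, so wrapping them in `True if ... else False`
adds a branch without changing the result.

```python
# Illustrative values only; Atom.value in the real code can be any
# atomic OVSDB type (integer, string, UUID, ...).
a, b = 42, 42

verbose = True if a == b else False  # redundant conditional expression
direct = a == b                      # same result, one fewer branch

print(type(direct).__name__)  # bool
print(verbose == direct)      # True
```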
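The idea behind the second diff can be shown with a hypothetical
miniature of the IDL update loop (class and method names here are
illustrative, not OVS's actual API): a single `changed` flag is
threaded through the per-row handlers, row comparison is skipped once
one change has been found, and the change sequence number is bumped at
most once per batch rather than once per modified row.

```python
class MiniIdl:
    """Toy stand-in for ovs.db.idl.Idl, sketching the changed-flag idea."""

    def __init__(self):
        self.change_seqno = 0

    def process_row(self, old, new, changed):
        if changed:
            return True  # a change was already found: skip the comparison
        return old != new  # stand-in for the expensive Datum.__cmp__

    def process_batch(self, updates):
        changed = False
        for old, new in updates:
            changed = self.process_row(old, new, changed)
        if changed:  # bump the seqno once for the whole batch
            self.change_seqno += 1

idl = MiniIdl()
idl.process_batch([(1, 1), (2, 3), (4, 5)])  # second row differs
print(idl.change_seqno)  # 1
```

Only the first differing row is actually compared; the `(4, 5)` row is
applied without another comparison because the flag is already set.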
[ovs-discuss] OpenStack profiling with networking-ovn - port creation is slow
Hi folks,

As we're doing some performance tests in OpenStack using OVN, we
noticed that as we keep creating ports, the time for creating a single
port increases. Also, ovn-northd CPU consumption is quite high (see
[0], which shows the CPU consumption when creating 1000 ports and
deleting them; the last part, where CPU is at 100%, is when all the
ports get deleted).

With 500 ports in the same Logical Switch, I did some profiling of
OpenStack neutron-server adding 10 more ports to that Logical Switch.
Currently, neutron-server spawns different API workers (separate
processes) which open connections to OVN NB, so every time an update
message is sent from ovsdb-server it'll be processed by all of them.

In my profiling, I used GreenletProfiler in all those processes to
produce a trace file and then merged all of them together to aggregate
the results. In those tests I used the OVS master branch, compiled
with shared libraries to make use of the JSON C parser. Still, I've
seen that most of the time's been spent in the following two modules:

- python/ovs/db/data.py: 33%
- uuid.py: 21%

For the data.py module, this is the usage (self time):

Atom.__lt__     16.25%   8283 calls
from_json:118    6.18%   406935 calls
Atom.__hash__    3.48%   1623832 calls
from_json:328    2.01%   5040 calls

While for the uuid module:

UUID.__cmp__    12.84%   3570975 calls
UUID.__init__    4.06%   362541 calls
UUID.__hash__    2.96%   1800 calls
UUID.__str__     1.03%   355016 calls

Most of the calls to Atom.__lt__ come from
BaseOvnIdl.__process_update2 (idl.py) -> BaseOvnIdl.__row_update
(idl.py) -> Datum.__cmp__ (data.py) -> Atom.__cmp__ (data.py).

The aggregated number of calls to BaseOvnIdl.__process_update2 is 1400
(and we're updating only 10 ports!!), while the total number of
connections opened to the NB database is 10:

# netstat -np | grep 6641 | grep python | wc -l
10

* Bear in mind that the results above were aggregated across all
processes.

Looks like the main culprit for this explosion could be the way we
handle ACLs.
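Back-of-the-envelope arithmetic for the fan-out Daniel describes:
every API worker keeps its own IDL replica over its own NB connection,
so each ovsdb-server update is parsed and compared once per worker.
The per-port row count below is a hypothetical figure chosen only to
show how 10 ports can plausibly turn into 1400 aggregated calls; the
thread does not break the 1400 down this way.

```python
workers = 10          # netstat showed 10 python connections to port 6641
ports_added = 10      # the test adds 10 ports to the Logical Switch
rows_per_port = 14    # hypothetical: ACL/Address_Set/port rows touched per port

# Each row update is delivered to, and processed by, every worker.
total_update2_calls = workers * ports_added * rows_per_port
print(total_update2_calls)  # 1400
```

The multiplier is the point: halving either the number of listening
replicas or the number of rows touched per port cuts the aggregate
work proportionally.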
As every time we create a port it'll belong to a Neutron security
group (an OVN Address Set), we'll add a new ACL for every Neutron
security group rule. If we patch the code to skip the ACL part, the
time for creating a port remains stable over time.

From the comparison tests against ML2/OVS (the reference
implementation), OVN outperforms it in most of the operations except
for port creation, where we can see it can become a bottleneck.

Before optimizing/redesigning the ACL part, we could do some other
changes to the way we handle notifications from OVSDB: e.g., instead
of having multiple processes receiving *all* notifications, we could
have one single process subscribed to those notifications which sends
a more optimized (already parsed) multicast notification to all
listening processes, so that they keep their own in-memory copies of
the DB up to date. All processes would still connect to the NB
database in "write-only" mode to commit their transactions, however.

Even though this last paragraph would best fit on the OpenStack ML, I
want to raise it here for feedback and see if someone can see some
"immediate" optimization for the way we're processing notifications
from OVSDB. Maybe some Python binding to do it in C? :)

Any feedback, comments or suggestions are highly appreciated :)

Best,
Daniel Alvarez

[0] https://snapshot.raintank.io/dashboard/snapshot/dwbhn0Z1zVTh9kI5j6mCVySx8TvrP45m?orgId=2

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
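A hypothetical sketch of the single-listener idea proposed above: one
listener parses each OVSDB notification once and "multicasts" the
already-parsed update to every worker's local queue, so JSON decoding
and row comparison are no longer repeated per API worker. All names
here are illustrative, not networking-ovn code, and threads with
in-process queues stand in for the separate processes an IPC mechanism
would actually connect.

```python
import queue
import threading

def listener(parsed_updates, outboxes):
    """The one process holding the NB connection: parse once, fan out."""
    for update in parsed_updates:
        for box in outboxes:       # "multicast" the pre-parsed update
            box.put(update)
    for box in outboxes:
        box.put(None)              # sentinel: end of the update stream

def worker(inbox, results):
    """An API worker keeping a local in-memory copy of the DB."""
    replica = {}
    while True:
        update = inbox.get()
        if update is None:
            break
        replica.update(update)     # apply without re-parsing JSON
    results.append(len(replica))   # list.append is thread-safe

parsed = [{"port1": "up"}, {"port2": "down"}]   # toy pre-parsed updates
outboxes = [queue.Queue() for _ in range(3)]    # three workers
results = []
threads = [threading.Thread(target=worker, args=(box, results))
           for box in outboxes]
for t in threads:
    t.start()
listener(parsed, outboxes)
for t in threads:
    t.join()
print(sorted(results))  # [2, 2, 2] -- every replica saw both updates
```

Writes would still go straight to the NB database from each worker, as
the message says; only the notification/read path is centralized.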