On Mon, Nov 18, 2019 at 8:22 AM Han Zhou <hz...@ovn.org> wrote: > > On Sat, Nov 16, 2019 at 4:03 AM Numan Siddique <num...@ovn.org> wrote: > > > > > > > > On Fri, Nov 15, 2019 at 1:18 PM Han Zhou <hz...@ovn.org> wrote: > >> > >> > >> > >> On Thu, Nov 14, 2019 at 10:25 PM Numan Siddique <num...@ovn.org> wrote: > >> > > >> > On Thu, Nov 14, 2019 at 11:44 PM Han Zhou <hz...@ovn.org> wrote: > >> > > > >> > > On Wed, Nov 13, 2019 at 10:27 AM Numan Siddique <num...@ovn.org> > wrote: > >> > > > > >> > > > On Thu, Oct 31, 2019 at 2:43 AM Han Zhou <hz...@ovn.org> wrote: > >> > > > > > >> > > > > The series supports interconnecting multiple OVN deployments > (e.g. > >> > > located at > >> > > > > multiple data centers) through logical routers connected with > tansit > >> > > logical > >> > > > > switches with overlay tunnels, managed through OVN control > plane. See > >> > > the > >> > > > > ovn-architecture.rst document updates for more details, and find > the > >> > > > > instructions in Documentation/tutorials/ovn-interconnection.rst. > >> > > > > > >> > > > > Han Zhou (13): > >> > > > > ovn-architecture: Add documentation for OVN interconnection > feature. > >> > > > > ovn-inb: Interconnection northbound DB schema and CLI. > >> > > > > ovn-isb: Interconnection southbound DB schema and CLI. > >> > > > > ovn-ic: Interconnection controller with AZ registeration. > >> > > > > ovn-northd.c: Refactor allocate_tnlid. > >> > > > > ovn-ic: Transit switch controller. > >> > > > > ovn-sb: Add columns is_interconn and is_remote to Chassis. > >> > > > > ovn-ic: Interconnection gateway controller. > >> > > > > ovn-ic: Interconnection port controller. > >> > > > > ovn.at: e2e test for OVN interconnection. > >> > > > > ovn-ctl: Refactor to reduce redundant code. > >> > > > > ovn-ctl: Support commands for interconnection. > >> > > > > tutorial: Add tutorial for OVN Interconnection. > >> > > > > > >> > > > > >> > > > Hi Han, > >> > > > > >> > > > Thanks for working on this feature. It's an interesting use case. > >> > > > > >> > > > I had a quick look at all the patches. > >> > > > > >> > > > >> > > Numan, thanks a lot for the thorough review! Please see my response > inlined. > >> > > > >> > > > I have few comments > >> > > > > >> > > > 1. I would suggest to rename the DBs as ovn-ic-nb.ovsschema (and > the > >> > > > same for ovn-isb). > >> > > > The DB name is - OVN_IC_Northbound. So it would make sense to > have > >> > > > - ovn-ic-nb.ovsschema > >> > > > I would also suggest to rename the utilities to ovn-ic-nbctl > and > >> > > > ovn-ic-sbctl. > >> > > > With ovn-inbctl/ovn-isbctl, it is really confusing. > >> > > > > >> > > Sure, I felt not quite convenient with two dashes in a command name. > I > >> > > agree that ovn-ic-nbctl and ovn-ic-sbctl are more clear. I can > change it. > >> > > > >> > > > 2. ovn-ic service writes to interconnect south db, ovn north db and > >> > > > ovn south db. Writing to ic south db and > >> > > > ovn north db is fine. But it seems a little odd for ovn-ic to > >> > > > write to south db. From what I understand it writes > >> > > > to south db for 3 purposes > >> > > > a. Updating the tunnel_key column of datapath_binding > >> > > > representing the transit switch > >> > > > b. Updating the key column of port_binding representing the > >> > > > logical router port connecting to the transit switch. > >> > > > c. Creating chassis rows for remote gateway chassis. > >> > > > > >> > > > I think it's better if ovn-ic can delegate all these to > ovn-northd. > >> > > > For (a) and (b), ovn-ic can set the generated tunnel key > >> > > > in the other_config/options column of Logical switch/Logical > switch > >> > > > port. ovn-northd can internally set this value to > >> > > > the south db. > >> > > > > >> > > > For (c), I think its better we add another table - > Remote_Chassis > >> > > > (or some other appropriate name) . ovn-ic will create rows > >> > > > in this table for each remote chassis and ovn-northd will create > >> > > > chassis row in south db. > >> > > > We already have Gateway_Chassis table in North db. So I think > it's > >> > > > fine if we add Remote_Chassis table. The only odd thing > >> > > > would be is to store the encap details in North db. > >> > > > > >> > > > With this, ovn-ic, doesn't need to worry about syncing between > >> > > > interconnect south db and ovn south db. In my opinion ovn-ic > >> > > > should act more like CMS to each availability zone and hence > should > >> > > > not do any write transactions to the south db. > >> > > > > >> > > > Any concerns with this proposed approach ? > >> > > > > >> > > We could avoid ovn-ic writing directly to SB with some extra logic in > >> > > northd, but I don't see any problem for ovn-ic to update SB > directly. First > >> > > of all, we have hypervisors creating and updating SB directly for > Chassis > >> > > and Encap records. The design here is that ovn-ic updates Chassis > and Encap > >> > > on behalf of remote gateway nodes, which I think is straightforward > and > >> > > reasonable. Similarly, port-binding's chassis column is updated the > same > >> > > way as how hypervisors are updating it. Secondly, for tunnel keys > updating, > >> > > it may seem graceful to update from northd, since northd is the only > client > >> > > that updates tunnel keys today, but since ovn-ic is responsible for > >> > > calculating these keys, and it already has connection to SB, I think > it is > >> > > just more natural and efficient to update it directly, to avoid the > extra > >> > > logic and unnecessary latency from northd with another round of > >> > > computation. The scope of the ovn-ic is only for the interconnection > >> > > objects, so I don't see any conflict or ownership ambiguity here. > What do > >> > > you think? > >> > > >> > Regarding the ovn-ic writing to SB DB, I would like to get other's > >> > opinion as well. > >> > I agree to your points, but at the same time concerned with few things > like > >> > - What about RBAC for ovn-ic ? > >> > - Right now we have RBAC for ovn-controller in writing to the SB DB. > >> > - Do we want something similar for ovn-ic if ovn-ic writes to SB DB. > >> > - Does CMS need to do something similar for ovn-ic, like how it is > documented > >> > here - > https://mail.openvswitch.org/pipermail/ovs-dev/2019-November/364481.html > >> > (related discussion here - > >> > > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-November/049501.html > ) > >> > if no RBAC for ovn-ic ? > >> > > >> For RBAC, I think ovn-ic and northd are at the same position, because > they both are AZ level daemons, just focusing on different tasks. In theory > they can be put in same process, but I separated them for clarity. So from > RBAC perspective they should be just the same. > >> Today there is no role for northd, which I think is a flaw, as discussed > in the thread you pointed. It is not a big problem though, because a > workaround is simply creating a separate connection and use iptable rules > to restrict access from central nodes only. Same practice should be used > for ovn-ic. > >> > >> > @Mark Michelson @Dumitru Ceara You have any thoughts/concerns on > >> > ovn-ic writing to SB DB ? > >> > > >> > Regarding the tunnel key there are 2 things here > >> > (1) tunnel key for transit datapath > >> > (2) tunnel key for logical port connected to the transit switch > >> > > >> > For (1), you can completely avoid "Sync ISB TS tunnel key to AZ SB > >> > datapath" by storing the tunnel id > >> > in logical switch in NB DB like - > >> > nbrec_logical_switch_update_other_config_setkey(ls, > >> > "interconn-ts-tunnelkey, tunnelkey) > >> > > >> > ovn-northd when creating the datapath_binding row can set this value > >> > directly. With this approach I think the function > >> > ts_run() can be simplified a bit as it doesn't need to call - > >> > SBREC_DATAPATH_BINDING_FOR_EACH() > >> > > >> > >> I agree this approach should work. The code might be a little simpler > but the latency will be added because of the extra computer iteration from > northd to SB DB, although it might not be a big concern. So I don't have > strong preference for either approach. > >> > >> > Right now CMS is expected to create lsp-lrp ports on the transit > >> > switch - logical router on each AZ. > >> > Instead of this, can't CMS/user create these links in IC-NB table ? > >> > '' > >> > something like > >> > ovn-ic-nbctl ts-add ts0 > >> > ovn-ic-nbctl tsp-add ts0 az0 az0-lr0 MAC1 IP1 > >> > ovn-ic-nbctl tsp-add ts0 az1 az1-lr0 MAC2 IP2 > >> > > >> > (where az0-lr0 is logical router present in az0's NB DB and az1-lr0 > >> > is logical router > >> > present in az1's NB DB). > >> > > >> > ovn-ic will internally create lsp-lrp ports in its AZ's NB DB. It can > >> > also create 'Port_Binding' > >> > row in IC-SB DB and doesn't need to worry about syncing between the SB > >> > DB and IC-SB DB. > >> > > >> > This would probably solve (2) and ovn-ic can set the tunnel key when > >> > creating the lsp-lrp and ovn-northd > >> > will make use of this tunnel key when creating the port_binding rows > in SB DB. > >> > > >> > With this user/CMS doesn't have to worry about creating the lsp-lrps > >> > in each AZ and I think > >> > this seems more logical to me. > >> > > >> > Running "ovn-ic-nbctl show" will give a clear output to the user. > >> > > >> > >> This is a different operation model than what I had in mind. My initial > idea was that all configuration should be done at each AZ itself, and the > ovn-ic from each AZ will sync between each other automatically, so there is > no need for configuring the IC DBs by CMS/User. The reason why I end up > with a IC-NB DB is the use case of multi-tenancy, when an operator needs to > isolate different tenants on the backbone. They can create different TS for > this purpose. Each TS can represent a tenant or segmentation-zone. This > information should be specified by user and it is global, so it is stored > in IC-NB DB. Other than this, there is no need for global level > configurations. All the information that can be populated by ovn-ic is > stored in IC-SB. > >> > >> With your proposal, LRP's from each AZ is directly configured by user in > global IC-NB. It is a valid approach, too. > >> > >> However, I would prefer configuring them at AZ level for below > considerations: > >> - Try to avoid introducing new configurations. Other than the TS switch, > there is nothing new. All the operations of LRP creation stays the same way > as it is today, which I believe simplifies operation. > >> - The ownership is more clear. AZ information of each LRP is populated > automatically by ovn-ic from each AZ, and each AZ will only update its own > LRPs to global DB. Each AZ has full control of its own intention of joining > the global interconnection, deciding which router ports and gateways should > get involved. > >> - It might seem attractive to have a global view by configuring all > these LRPs at global level, but in fact we will still need to configure > static routes for each other at each AZ level. So the overall configuration > will be half-global and half-local, which might in fact seem less natural. > > > > > > I thought about the static routes. With my proposal, ovn-ic can > configure the static routes too because it knows the transit switch > configuration would have all the details. > > > I wanted to make the configuration at IC-NB as simple as possible. For the > static routes, currently they can be configured at each AZ, but I am also > planning a further improvement to avoid the configuration at all, by > learning the routes of edge routers from each other automatically through > IC-SB, but I will add it after the basic feature is consolidated. > > > > >> - The command ovn-ic-nbctl show may seem not showing enough information, > but in fact ovn-ic-sbctl show tells all details with the hierarchy of AZ -> > GW -> Ports (for each port it also shows its TS and subnet/IP). I think it > provides a global view of both logical and physical. > >> > >> If the change is for avoid updating tunnel keys to SB directly, we can > do similar approach as you suggested for TS. > >> > >> > If we take this approach, ovn-ic doesn't need to write to the SB DB > >> > except for creating 'Chassis' table in > >> > SB DB. We could avoid this also if we add Remote_Chassis in NB DB. But > >> > we can discuss further on this. > >> > > >> > Let me know what you think. > >> > > >> For writing to SB, I am ok to avoid writing tunnel keys (for TS and for > port-binding) directly to SB but through northd, although this would > introduce some latency which I think is not a big issue. > > > > > > Ok. > >> > >> But for chassis/encap/port-binding's chassis, I would still prefer to > update to SB directly, consistent with what we are doing from > ovn-controller. The only difference from ovn-controller is that RBAC is > necessary from ovn-controller but not from ovn-ic, since ovn-ic is AZ level > component that only runs on central nodes. > >> If in the future we want to avoid SB update from ovn-controller, too, > then we can make the same change in a consistent way then for both > ovn-controller and ovn-ic. > >> > > >> > > > >> > > > 3. In patch 7,its better to rename the ovs configuration option - > >> > > > "external_ids:is-interconn" to "external_ids:ovn-is-interconn". > >> > > > You also need to document it in ovn-controller-8.xml. > >> > > > > >> > > Agree! > >> > > > >> > > > Or maybe we can remove this option - external_ids:is-interconn. > We > >> > > > probably don't need this if we do like below > >> > > > > >> > > > 2 (c) can also be done this way > >> > > > - User creates transit switch. > >> > > > - ovn-ic creates transit switch in north db. > >> > > > - then the user adds the logical router port - logical > switch > >> > > > port to the transit switch in availability zone - az1 (for example) > >> > > > - then the user creates gw chassis - for example > >> > > > ovn-nbctl lrp-set-gateway-chassis lrp-lr1-ts1 > <gateway > >> > > > name> [priority] > >> > > > - ovn-ic on az1 ,learns this and creates a Gateway_Chassis > >> > > > row in the interconnect south db. > >> > > > - ovn-ic on other az's then create Remote_Chassis rows in > the > >> > > North db. > >> > > > - ovn-northd on other az's then create chassis row in > south db > >> > > > with "is_remote" set to true. > >> > > > > >> > > > I am not sure if this complicates further and hence its better > >> > > > that ovn-ic writes to the south dbs. But we can discuss further on > >> > > > this. > >> > > > > >> > > > >> > > I think this is a good idea and I am incline to it, because it > avoids one > >> > > configuration on the gateway node, which is good. > >> > > The only concern is that it makes the remote gateway sync more > dynamic - it > >> > > changes when LRP's gateway-chassis/ha-chassis settings change, which > may be > >> > > less efficient. The code of ovn-ic will be a little more complex. I > think > >> > > it should be ok, but I'd like to hear from you, too. > >> > > >> > I would prefer to go with this approach. Another advantage is that, > >> > ovn-controller don't > >> > have to establish tunnels with the remote chassis until it really has > >> > to. I think with the present > >> > approach it establishes tunnels even if a transit switch doesn't have > >> > any router ports ? > >> > Correct me if I am wrong here ? > >> > > >> > >> Yes you are right. I think it is mainly about static v.s. dynamic. The > benefit of static is that it is more predicable and efficient (maybe), > while the dynamic approach avoids the extra configuration of > "is-interconnection". I don't worry about the tunnel setup yet, since it is > merely a piece of data in the ovsdb table on the chassis, and there won't > be too many interconnection gateways. But I think I am inclined to change > as you suggested, to avoid the extra configuration. > > > > > > Ok. So to summarize from our discussion, v3 of this series would > > - avoid tunnel key updates to the SB DB in ovn-ic > > - but it will create remote chassis in SB DB right ? > > > > Yes. In addition, I will also remove the "is-interconn" configuration for > Chassis as you suggested, and determine the Chassis is for interconnection > if there is a LRP@TS assigned to it, by either gateway_chassis or by > ha_chassis. > Does this all sound good for v3?
Yes. Its sounds good to me. Numan > > > > > Thanks > > Numan > > > >> > >> > > >> > Thanks > >> > Numan > >> > > >> > > >> > > > >> > > > > >> > > > 4. Can you please add a section in ovn-architecture about the > "Journey > >> > > > of a packet in inter-az scenario" ? This would really > >> > > > help in understanding the feature. > >> > > > > >> > > Sure, I can add it. > >> > > > >> > > > Thanks > >> > > > Numan > >> > > > > >> > > > > >> > > > > .gitignore | 6 + > >> > > > > Documentation/automake.mk | 1 + > >> > > > > Documentation/tutorials/index.rst | 1 + > >> > > > > Documentation/tutorials/ovn-interconnection.rst | 188 ++++ > >> > > > > Makefile.am | 1 + > >> > > > > NEWS | 5 + > >> > > > > TODO.rst | 10 + > >> > > > > automake.mk | 75 ++ > >> > > > > controller/binding.c | 6 +- > >> > > > > controller/chassis.c | 14 + > >> > > > > debian/ovn-common.install | 2 + > >> > > > > debian/ovn-common.manpages | 4 + > >> > > > > ic/.gitignore | 2 + > >> > > > > ic/automake.mk | 10 + > >> > > > > ic/ovn-ic.8.xml | 111 +++ > >> > > > > ic/ovn-ic.c | 1050 > >> > > +++++++++++++++++++++++ > >> > > > > lib/.gitignore | 6 + > >> > > > > lib/automake.mk | 32 +- > >> > > > > lib/ovn-inb-idl.ann | 9 + > >> > > > > lib/ovn-isb-idl.ann | 9 + > >> > > > > lib/ovn-util.c | 92 ++ > >> > > > > lib/ovn-util.h | 15 + > >> > > > > northd/ovn-northd.c | 108 +-- > >> > > > > ovn-architecture.7.xml | 107 ++- > >> > > > > ovn-inb.ovsschema | 75 ++ > >> > > > > ovn-inb.xml | 371 ++++++++ > >> > > > > ovn-isb.ovsschema | 129 +++ > >> > > > > ovn-isb.xml | 582 > +++++++++++++ > >> > > > > ovn-nb.ovsschema | 5 +- > >> > > > > ovn-nb.xml | 28 +- > >> > > > > ovn-sb.ovsschema | 8 +- > >> > > > > ovn-sb.xml | 24 + > >> > > > > tests/automake.mk | 8 +- > >> > > > > tests/ovn-ic.at | 192 +++++ > >> > > > > tests/ovn-inbctl.at | 65 ++ > >> > > > > tests/ovn-isbctl.at | 112 +++ > >> > > > > tests/ovn-macros.at | 161 +++- > >> > > > > tests/ovn.at | 149 ++++ > >> > > > > tests/testsuite.at | 3 + > >> > > > > tutorial/ovs-sandbox | 78 +- > >> > > > > utilities/.gitignore | 4 + > >> > > > > utilities/automake.mk | 16 + > >> > > > > utilities/ovn-ctl | 423 ++++++++- > >> > > > > utilities/ovn-ctl.8.xml | 91 ++ > >> > > > > utilities/ovn-inbctl.8.xml | 174 ++++ > >> > > > > utilities/ovn-inbctl.c | 948 > >> > > ++++++++++++++++++++ > >> > > > > utilities/ovn-isbctl.8.xml | 148 ++++ > >> > > > > utilities/ovn-isbctl.c | 1015 > >> > > ++++++++++++++++++++++ > >> > > > > 48 files changed, 6528 insertions(+), 145 deletions(-) > >> > > > > create mode 100644 > Documentation/tutorials/ovn-interconnection.rst > >> > > > > create mode 100644 ic/.gitignore > >> > > > > create mode 100644 ic/automake.mk > >> > > > > create mode 100644 ic/ovn-ic.8.xml > >> > > > > create mode 100644 ic/ovn-ic.c > >> > > > > create mode 100644 lib/ovn-inb-idl.ann > >> > > > > create mode 100644 lib/ovn-isb-idl.ann > >> > > > > create mode 100644 ovn-inb.ovsschema > >> > > > > create mode 100644 ovn-inb.xml > >> > > > > create mode 100644 ovn-isb.ovsschema > >> > > > > create mode 100644 ovn-isb.xml > >> > > > > create mode 100644 tests/ovn-ic.at > >> > > > > create mode 100644 tests/ovn-inbctl.at > >> > > > > create mode 100644 tests/ovn-isbctl.at > >> > > > > create mode 100644 utilities/ovn-inbctl.8.xml > >> > > > > create mode 100644 utilities/ovn-inbctl.c > >> > > > > create mode 100644 utilities/ovn-isbctl.8.xml > >> > > > > create mode 100644 utilities/ovn-isbctl.c > >> > > > > > >> > > > > -- > >> > > > > 2.1.0 > >> > > > > > >> > > > > _______________________________________________ > >> > > > > dev mailing list > >> > > > > d...@openvswitch.org > >> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > _______________________________________________ > >> > > dev mailing list > >> > > d...@openvswitch.org > >> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev