Sounds good. I just checked the patch: by default the C IDL has "leader_only" set to true, which ensures that the connection goes to the leader only. This is the case for northd, so the lock works for northd's hot active-standby purpose as long as all the OVSDB endpoints of the cluster are specified to northd, since all northd instances then connect to the same DB, the leader.
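To make that concrete, below is a minimal sketch (not northd's actual code) of a C IDL client pointed at all cluster members: the connection is restricted to the leader and the OVSDB lock is requested on that connection, so whichever instance is granted the lock becomes active. It assumes the ovsdb_idl_set_leader_only() knob from this patch series plus the existing ovsdb_idl_set_lock()/ovsdb_idl_has_lock() API and the generated southbound IDL class; include paths are approximate for the current tree layout.

    /* Sketch: leader-only clustered connection plus the OVSDB lock for
     * active/standby arbitration. */
    #include <stdbool.h>
    #include "ovsdb-idl.h"
    #include "poll-loop.h"
    #include "ovn/lib/ovn-sb-idl.h"   /* generated sbrec_idl_class */

    int
    main(void)
    {
        const char *remote = "tcp:10.169.125.152:6642,"
                             "tcp:10.169.125.131:6642,"
                             "tcp:10.148.181.162:6642";

        struct ovsdb_idl *idl = ovsdb_idl_create(remote, &sbrec_idl_class,
                                                 true, true);
        ovsdb_idl_set_leader_only(idl, true);   /* default; shown for clarity */
        ovsdb_idl_set_lock(idl, "ovn_northd");  /* lock name northd requests */

        for (;;) {
            ovsdb_idl_run(idl);
            if (ovsdb_idl_has_lock(idl)) {
                /* Lock granted on the leader: this instance is active. */
            } else {
                /* Another instance holds the lock: stay on standby. */
            }
            ovsdb_idl_wait(idl);
            poll_block();
        }
    }

Because every client is forced onto the same server (the current leader), the per-connection lock behaves like a single cluster-wide lock, which is exactly what the northd failover logs further down in this thread show.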
For Neutron networking-ovn this may not work yet, since I didn't see such logic in the Python IDL in the current patch series. It would be good to add similar logic to the Python IDL. (@Ben/Numan, correct me if I am wrong.) On Wed, Mar 21, 2018 at 6:49 PM, aginwala <aginw...@asu.edu> wrote: > Hi : > > Just sorted out the correct settings and northd also works in ha in raft. > > There were 2 issues in the setup: > 1. I had started nb db without --db-nb-create-insecure-remote > 2. I also started northd locally on all 3 without remote which is like all > three northd trying to lock the ovsdb locally. > > Hence, the duplicate logs were populated in the southbound datapath due to > multiple northd trying to write the local copy. > > So, I now start nb db with --db-nb-create-insecure-remote and northd on > all 3 nodes using below command: > > ovn-northd -vconsole:emer -vsyslog:err -vfile:info --ovnnb-db="tcp: > 10.169.125.152:6641,tcp:10.169.125.131:6641,tcp:10.148.181.162:6641" > --ovnsb-db="tcp:10.169.125.152:6642,tcp:10.169.125.131:6642,tcp: > 10.148.181.162:6642" --no-chdir --log-file=/var/log/openvswitch/ovn-northd.log > --pidfile=/var/run/openvswitch/ovn-northd.pid --detach --monitor > > > #At start, northd went active on the leader node and standby on other two > nodes. > > #After old leader crashed and new leader got elected, northd goes active > on any of the remaining 2 nodes as per sample logs below from non-leader > node: > 2018-03-22T00:20:30.732Z|00023|ovn_northd|INFO|ovn-northd lock lost. This > ovn-northd instance is now on standby. > 2018-03-22T00:20:30.743Z|00024|ovn_northd|INFO|ovn-northd lock acquired. > This ovn-northd instance is now active. > > # Also ovn-controller works similar way if leader goes down and connects > to any of the remaining 2 nodes: > 2018-03-22T01:21:56.250Z|00029|ovsdb_idl|INFO|tcp:10.148.181.162:6642: > clustered database server is disconnected from cluster; trying another > server > 2018-03-22T01:21:56.250Z|00030|reconnect|INFO|tcp:10.148.181.162:6642: > connection attempt timed out > 2018-03-22T01:21:56.250Z|00031|reconnect|INFO|tcp:10.148.181.162:6642: > waiting 4 seconds before reconnect > 2018-03-22T01:23:52.417Z|00043|reconnect|INFO|tcp:10.148.181.162:6642: > connected > > > > Above settings will also work if we put all the nodes behind the vip and > updates the ovn configs to use vips. So we don't need pacemaker explicitly > for northd HA :). > > Since the setup is complete now, I will populate the same in scale test > env and see how it behaves. > > @Numan: We can try the same with networking-ovn integration and see if we > find anything weird there too. Not sure if you have any exclusive findings > for this case. > > Let me know if something else is missed here. > > > > > Regards, > > On Wed, Mar 21, 2018 at 2:50 PM, Han Zhou <zhou...@gmail.com> wrote: > >> Ali, sorry if I misunderstand what you are saying, but pacemaker here is >> for northd HA. pacemaker itself won't point to any ovsdb cluster node. All >> northds can point to a LB VIP for the ovsdb cluster, so if a member of >> ovsdb cluster is down it won't have impact to northd. >> >> Without clustering support of the ovsdb lock, I think this is what we >> have now for northd HA. Please suggest if anyone has any other idea. 
Thanks >> :) >> >> On Wed, Mar 21, 2018 at 1:12 PM, aginwala <aginw...@asu.edu> wrote: >> >>> :) The only thing is while using pacemaker, if the node that pacemaker >>> if pointing to is down, all the active/standby northd nodes have to be >>> updated to new node from the cluster. But will dig in more to see what else >>> I can find. >>> >>> @Ben: Any suggestions further? >>> >>> >>> Regards, >>> >>> On Wed, Mar 21, 2018 at 10:22 AM, Han Zhou <zhou...@gmail.com> wrote: >>> >>>> >>>> >>>> On Wed, Mar 21, 2018 at 9:49 AM, aginwala <aginw...@asu.edu> wrote: >>>> >>>>> Thanks Numan: >>>>> >>>>> Yup agree with the locking part. For now; yes I am running northd on >>>>> one node. I might right a script to monitor northd in cluster so that if >>>>> the node where it's running goes down, script can spin up northd on one >>>>> other active nodes as a dirty hack. >>>>> >>>>> The "dirty hack" is pacemaker :) >>>> >>>> >>>>> Sure, will await for the inputs from Ben too on this and see how >>>>> complex would it be to roll out this feature. >>>>> >>>>> >>>>> Regards, >>>>> >>>>> >>>>> On Wed, Mar 21, 2018 at 5:43 AM, Numan Siddique <nusid...@redhat.com> >>>>> wrote: >>>>> >>>>>> Hi Aliasgar, >>>>>> >>>>>> ovsdb-server maintains locks per each connection and not across the >>>>>> db. A workaround for you now would be to configure all the ovn-northd >>>>>> instances to connect to one ovsdb-server if you want to have >>>>>> active/standy. >>>>>> >>>>>> Probably Ben can answer if there is a plan to support ovsdb locks >>>>>> across the db. We also need this support in networking-ovn as it also >>>>>> uses >>>>>> ovsdb locks. >>>>>> >>>>>> Thanks >>>>>> Numan >>>>>> >>>>>> >>>>>> On Wed, Mar 21, 2018 at 1:40 PM, aginwala <aginw...@asu.edu> wrote: >>>>>> >>>>>>> Hi Numan: >>>>>>> >>>>>>> Just figured out that ovn-northd is running as active on all 3 nodes >>>>>>> instead of one active instance as I continued to test further which >>>>>>> results >>>>>>> in db errors as per logs. >>>>>>> >>>>>>> >>>>>>> # on node 3, I run ovn-nbctl ls-add ls2 ; it populates below logs >>>>>>> in ovn-north >>>>>>> 2018-03-21T06:01:59.442Z|00007|ovsdb_idl|WARN|transaction error: >>>>>>> {"details":"Transaction causes multiple rows in \"Datapath_Binding\" >>>>>>> table >>>>>>> to have identical values (1) for index on column \"tunnel_key\". First >>>>>>> row, with UUID 8c5d9342-2b90-4229-8ea1-001a733a915c, was inserted >>>>>>> by this transaction. Second row, with UUID >>>>>>> 8e06f919-4cc7-4ffc-9a79-20ce6663b683, >>>>>>> existed in the database before this transaction and was not modified by >>>>>>> the >>>>>>> transaction.","error":"constraint violation"} >>>>>>> >>>>>>> In southbound datapath list, 2 duplicate records gets created for >>>>>>> same switch. 
>>>>>>> >>>>>>> # ovn-sbctl list Datapath >>>>>>> _uuid : b270ae30-3458-445f-95d2-b14e8ebddd01 >>>>>>> external_ids : >>>>>>> {logical-switch="4d6674e3-ff9f-4f38-b050-0fa9bec9e34d", >>>>>>> name="ls2"} >>>>>>> tunnel_key : 2 >>>>>>> >>>>>>> _uuid : 8e06f919-4cc7-4ffc-9a79-20ce6663b683 >>>>>>> external_ids : >>>>>>> {logical-switch="4d6674e3-ff9f-4f38-b050-0fa9bec9e34d", >>>>>>> name="ls2"} >>>>>>> tunnel_key : 1 >>>>>>> >>>>>>> >>>>>>> >>>>>>> # on nodes 1 and 2 where northd is running, it gives below error: >>>>>>> 2018-03-21T06:01:59.437Z|00008|ovsdb_idl|WARN|transaction error: >>>>>>> {"details":"cannot delete Datapath_Binding row >>>>>>> 8e06f919-4cc7-4ffc-9a79-20ce6663b683 because of 17 remaining >>>>>>> reference(s)","error":"referential integrity violation"} >>>>>>> >>>>>>> As per commit message, for northd I re-tried setting --ovnnb-db="tcp: >>>>>>> 10.169.125.152:6641,tcp:10.169.125.131:6641,tcp:10.148.181.162:6641" >>>>>>> and --ovnsb-db="tcp:10.169.125.152:6642,tcp:10.169.125.131:6642,tcp: >>>>>>> 10.148.181.162:6642" and it did not help either. >>>>>>> >>>>>>> There is no issue if I keep running only one instance of northd on >>>>>>> any of these 3 nodes. Hence, wanted to know is there something else >>>>>>> missing here to make only one northd instance as active and rest as >>>>>>> standby? >>>>>>> >>>>>>> >>>>>>> Regards, >>>>>>> >>>>>>> On Thu, Mar 15, 2018 at 3:09 AM, Numan Siddique <nusid...@redhat.com >>>>>>> > wrote: >>>>>>> >>>>>>>> That's great >>>>>>>> >>>>>>>> Numan >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Mar 15, 2018 at 2:57 AM, aginwala <aginw...@asu.edu> wrote: >>>>>>>> >>>>>>>>> Hi Numan: >>>>>>>>> >>>>>>>>> I tried on new nodes (kernel : 4.4.0-104-generic , Ubuntu >>>>>>>>> 16.04)with fresh installation and it worked super fine for both >>>>>>>>> sb and nb dbs. Seems like some kernel issue on the previous nodes >>>>>>>>> when I re-installed raft patch as I was running different ovs version >>>>>>>>> on >>>>>>>>> those nodes before. >>>>>>>>> >>>>>>>>> >>>>>>>>> For 2 HVs, I now set ovn-remote="tcp:10.169.125.152:6642, tcp: >>>>>>>>> 10.169.125.131:6642, tcp:10.148.181.162:6642" and started >>>>>>>>> controller and it works super fine. >>>>>>>>> >>>>>>>>> >>>>>>>>> Did some failover testing by rebooting/killing the leader ( >>>>>>>>> 10.169.125.152) and bringing it back up and it works as expected. >>>>>>>>> Nothing weird noted so far. >>>>>>>>> >>>>>>>>> # check-cluster gives below data one of the node(10.148.181.162) post >>>>>>>>> leader failure >>>>>>>>> >>>>>>>>> ovsdb-tool check-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>> ovsdb-tool: leader /etc/openvswitch/ovnsb_db.db for term 2 has log >>>>>>>>> entries only up to index 18446744073709551615, but index 9 was >>>>>>>>> committed in >>>>>>>>> a previous term (e.g. by /etc/openvswitch/ovnsb_db.db) >>>>>>>>> >>>>>>>>> >>>>>>>>> For check-cluster, are we planning to add more output showing >>>>>>>>> which node is active(leader), etc in upcoming versions ? >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks a ton for helping sort this out. I think the patch looks >>>>>>>>> good to be merged post addressing of the comments by Justin along >>>>>>>>> with the >>>>>>>>> man page details for ovsdb-tool. >>>>>>>>> >>>>>>>>> >>>>>>>>> I will do some more crash testing for the cluster along with the >>>>>>>>> scale test and keep you posted if something unexpected is noted. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Mar 13, 2018 at 11:07 PM, Numan Siddique < >>>>>>>>> nusid...@redhat.com> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Mar 14, 2018 at 7:51 AM, aginwala <aginw...@asu.edu> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Sure. >>>>>>>>>>> >>>>>>>>>>> To add on , I also ran for nb db too using different port and >>>>>>>>>>> Node2 crashes with same error : >>>>>>>>>>> # Node 2 >>>>>>>>>>> /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>> --db-nb-addr=10.99.152.138 --db-nb-port=6641 >>>>>>>>>>> --db-nb-cluster-remote-addr="t >>>>>>>>>>> cp:10.99.152.148:6645" --db-nb-cluster-local-addr="tcp: >>>>>>>>>>> 10.99.152.138:6645" start_nb_ovsdb >>>>>>>>>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnnb_db.db: cannot >>>>>>>>>>> identify file type >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> Hi Aliasgar, >>>>>>>>>> >>>>>>>>>> It worked for me. Can you delete the old db files in >>>>>>>>>> /etc/openvswitch/ and try running the commands again ? >>>>>>>>>> >>>>>>>>>> Below are the commands I ran in my setup. >>>>>>>>>> >>>>>>>>>> Node 1 >>>>>>>>>> ------- >>>>>>>>>> sudo /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>> --db-sb-addr=192.168.121.91 --db-sb-port=6642 >>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>> --db-sb-cluster-local-addr=tcp:192.168.121.91:6644 start_sb_ovsdb >>>>>>>>>> >>>>>>>>>> Node 2 >>>>>>>>>> --------- >>>>>>>>>> sudo /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>> --db-sb-addr=192.168.121.87 --db-sb-port=6642 >>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>> --db-sb-cluster-local-addr="tcp:192.168.121.87:6644" >>>>>>>>>> --db-sb-cluster-remote-addr="tcp:192.168.121.91:6644" >>>>>>>>>> start_sb_ovsdb >>>>>>>>>> >>>>>>>>>> Node 3 >>>>>>>>>> --------- >>>>>>>>>> sudo /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>> --db-sb-addr=192.168.121.78 --db-sb-port=6642 >>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>> --db-sb-cluster-local-addr="tcp:192.168.121.78:6644" >>>>>>>>>> --db-sb-cluster-remote-addr="tcp:192.168.121.91:6644" >>>>>>>>>> start_sb_ovsdb >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> Numan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Mar 13, 2018 at 9:40 AM, Numan Siddique < >>>>>>>>>>> nusid...@redhat.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Mar 13, 2018 at 9:46 PM, aginwala <aginw...@asu.edu> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Thanks Numan for the response. >>>>>>>>>>>>> >>>>>>>>>>>>> There is no command start_cluster_sb_ovsdb in the source code >>>>>>>>>>>>> too. Is that in a separate commit somewhere? Hence, I used >>>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>>> which I think would not be a right choice? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Sorry, I meant start_sb_ovsdb. Strange that it didn't work for >>>>>>>>>>>> you. Let me try it out again and update this thread. >>>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> Numan >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> # Node1 came up as expected. >>>>>>>>>>>>> ovn-ctl --db-sb-addr=10.99.152.148 --db-sb-port=6642 >>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>> start_sb_ovsdb. 
>>>>>>>>>>>>> >>>>>>>>>>>>> # verifying its a clustered db with ovsdb-tool >>>>>>>>>>>>> db-local-address /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>> tcp:10.99.152.148:6644 >>>>>>>>>>>>> # ovn-sbctl show works fine and chassis are being populated >>>>>>>>>>>>> correctly. >>>>>>>>>>>>> >>>>>>>>>>>>> #Node 2 fails with error: >>>>>>>>>>>>> /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>>>> --db-sb-addr=10.99.152.138 --db-sb-port=6642 >>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644" >>>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnsb_db.db: >>>>>>>>>>>>> cannot identify file type >>>>>>>>>>>>> >>>>>>>>>>>>> # So i did start the sb db the usual way using start_ovsdb to >>>>>>>>>>>>> just get the db file created and killed the sb pid and re-ran the >>>>>>>>>>>>> command >>>>>>>>>>>>> which gave actual error where it complains for join-cluster >>>>>>>>>>>>> command that is >>>>>>>>>>>>> being called internally >>>>>>>>>>>>> /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>>>> --db-sb-addr=10.99.152.138 --db-sb-port=6642 >>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644" >>>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>>> ovsdb-tool: /etc/openvswitch/ovnsb_db.db: not a clustered >>>>>>>>>>>>> database >>>>>>>>>>>>> * Backing up database to /etc/openvswitch/ovnsb_db.db.b >>>>>>>>>>>>> ackup1.15.0-70426956 >>>>>>>>>>>>> ovsdb-tool: 'join-cluster' command requires at least 4 >>>>>>>>>>>>> arguments >>>>>>>>>>>>> * Creating cluster database /etc/openvswitch/ovnsb_db.db from >>>>>>>>>>>>> existing one >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> # based on above error I killed the sb db pid again and try >>>>>>>>>>>>> to create a local cluster on node then re-ran the join operation >>>>>>>>>>>>> as per >>>>>>>>>>>>> the source code function. >>>>>>>>>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>> OVN_Southbound tcp:10.99.152.138:6644 tcp:10.99.152.148:6644 >>>>>>>>>>>>> which still complains >>>>>>>>>>>>> ovsdb-tool: I/O error: /etc/openvswitch/ovnsb_db.db: create >>>>>>>>>>>>> failed (File exists) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> # Node 3: I did not try as I am assuming the same failure as >>>>>>>>>>>>> node 2 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Let me know may know further. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Mar 13, 2018 at 3:08 AM, Numan Siddique < >>>>>>>>>>>>> nusid...@redhat.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Aliasgar, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Mar 13, 2018 at 7:11 AM, aginwala <aginw...@asu.edu> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Ben/Noman: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am trying to setup 3 node southbound db cluster using >>>>>>>>>>>>>>> raft10 <https://patchwork.ozlabs.org/patch/854298/> in >>>>>>>>>>>>>>> review. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # Node 1 create-cluster >>>>>>>>>>>>>>> ovsdb-tool create-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>> /root/ovs-reviews/ovn/ovn-sb.ovsschema tcp: >>>>>>>>>>>>>>> 10.99.152.148:6642 >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> A different port is used for RAFT. So you have to choose >>>>>>>>>>>>>> another port like 6644 for example. 
>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # Node 2 >>>>>>>>>>>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>> OVN_Southbound tcp:10.99.152.138:6642 tcp:10.99.152.148:6642 >>>>>>>>>>>>>>> --cid >>>>>>>>>>>>>>> 5dfcb678-bb1d-4377-b02d-a380edec2982 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> #Node 3 >>>>>>>>>>>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>> OVN_Southbound tcp:10.99.152.101:6642 tcp:10.99.152.138:6642 >>>>>>>>>>>>>>> tcp:10.99.152.148:6642 --cid 5dfcb678-bb1d-4377-b02d-a380ed >>>>>>>>>>>>>>> ec2982 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # ovn remote is set to all 3 nodes >>>>>>>>>>>>>>> external_ids:ovn-remote="tcp:10.99.152.148:6642, tcp: >>>>>>>>>>>>>>> 10.99.152.138:6642, tcp:10.99.152.101:6642" >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> # Starting sb db on node 1 using below command on node 1: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ovsdb-server --detach --monitor -vconsole:off -vraft >>>>>>>>>>>>>>> -vjsonrpc --log-file=/var/log/openvswitch/ovsdb-server-sb.log >>>>>>>>>>>>>>> --pidfile=/var/run/openvswitch/ovnsb_db.pid >>>>>>>>>>>>>>> --remote=db:OVN_Southbound,SB_Global,connections >>>>>>>>>>>>>>> --unixctl=ovnsb_db.ctl >>>>>>>>>>>>>>> --private-key=db:OVN_Southbound,SSL,private_key >>>>>>>>>>>>>>> --certificate=db:OVN_Southbound,SSL,certificate >>>>>>>>>>>>>>> --ca-cert=db:OVN_Southbound,SSL,ca_cert >>>>>>>>>>>>>>> --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols >>>>>>>>>>>>>>> --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers >>>>>>>>>>>>>>> --remote=punix:/var/run/openvswitch/ovnsb_db.sock >>>>>>>>>>>>>>> /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # check-cluster is returning nothing >>>>>>>>>>>>>>> ovsdb-tool check-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # ovsdb-server-sb.log below shows the leader is elected with >>>>>>>>>>>>>>> only one server and there are rbac related debug logs with rpc >>>>>>>>>>>>>>> replies and >>>>>>>>>>>>>>> empty params with no errors >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2018-03-13T01:12:02Z|00002|raft|DBG|server 63d1 added to >>>>>>>>>>>>>>> configuration >>>>>>>>>>>>>>> 2018-03-13T01:12:02Z|00003|raft|INFO|term 6: starting >>>>>>>>>>>>>>> election >>>>>>>>>>>>>>> 2018-03-13T01:12:02Z|00004|raft|INFO|term 6: elected leader >>>>>>>>>>>>>>> by 1+ of 1 servers >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Now Starting the ovsdb-server on the other clusters fails >>>>>>>>>>>>>>> saying >>>>>>>>>>>>>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnsb_db.db: >>>>>>>>>>>>>>> cannot identify file type >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also noticed that man ovsdb-tool is missing cluster details. >>>>>>>>>>>>>>> Might want to address it in the same patch or different. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please advise to what is missing here for running ovn-sbctl >>>>>>>>>>>>>>> show as this command hangs. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I think you can use the ovn-ctl command >>>>>>>>>>>>>> "start_cluster_sb_ovsdb" for your testing (atleast for now) >>>>>>>>>>>>>> >>>>>>>>>>>>>> For your setup, I think you can start the cluster as >>>>>>>>>>>>>> >>>>>>>>>>>>>> # Node 1 >>>>>>>>>>>>>> ovn-ctl --db-sb-addr=10.99.152.148 --db-sb-port=6642 >>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>>> start_cluster_sb_ovsdb >>>>>>>>>>>>>> >>>>>>>>>>>>>> # Node 2 >>>>>>>>>>>>>> ovn-ctl --db-sb-addr=10.99.152.138 --db-sb-port=6642 >>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644" >>>>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>>> start_cluster_sb_ovsdb >>>>>>>>>>>>>> >>>>>>>>>>>>>> # Node 3 >>>>>>>>>>>>>> ovn-ctl --db-sb-addr=10.99.152.101 --db-sb-port=6642 >>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.101:6644" >>>>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644" start_c >>>>>>>>>>>>>> luster_sb_ovsdb >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Let me know how it goes. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>> Numan >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>> discuss mailing list >>>>>>>>>>>>>>> disc...@openvswitch.org >>>>>>>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> discuss mailing list >>>>> disc...@openvswitch.org >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss >>>>> >>>>> >>>> >>> >> >