On Wed, May 9, 2018 at 9:02 PM, Han Zhou <zhou...@gmail.com> wrote:

> Hi Numan,
>
> Thank you so much for the detailed answer! Please see my comments inline.
>
> On Wed, May 9, 2018 at 7:41 AM, Numan Siddique <nusid...@redhat.com>
> wrote:
>
>> Hi Han,
>>
>> Please see below for inline comments
>>
>> On Wed, May 9, 2018 at 5:17 AM, Han Zhou <zhou...@gmail.com> wrote:
>>
>>> Hi Babu/Numan,
>>>
>>> I have a question regarding the OVN pacemaker OCF script.
>>> I see that the script uses MASTER_IP to start the active DB, and that the
>>> standby DBs use that IP to sync from.
>>>
>>> Documentation/topics/integration.rst also mentions:
>>>
>>> `master_ip` is the IP address on which the active database server is
>>> expected to be listening, the slave node uses it to connect to the master
>>> node.
>>>
>>> However, since the active node will change after failover, I wonder if we
>>> should provide the IPs of all the nodes, and let pacemaker decide which
>>> IP is the master IP to be used, dynamically.
>>>
>>
>>
>>
>>> I see that the documentation mentions using the IPaddr2 resource for a
>>> virtual IP. Does that indicate that we should use the virtual IP as the
>>> master IP?
>>>
>>
>> That is true. If the master IP is not a virtual IP, then we will not be
>> able to figure out which node is the master. We need to configure
>> networking-ovn and ovn-controller to point to the right master node so that
>> they can do write transactions on the DB.
>>
>> Below is how we have configured the pacemaker OVN HA DBs in a TripleO
>> OpenStack deployment:
>>
>>  - The TripleO deployment creates many virtual IPs (using IPaddr2). These
>> IP addresses are the frontend IPs for keystone and all the other OpenStack
>> API services, and haproxy is used to load balance the traffic (a deployment
>> will mostly have 3 controllers, with all the OpenStack API services
>> running on each node).
>>
>>  - We choose one of the IPaddr2 virtual IPs and set a colocation
>> constraint when creating the OVN pacemaker HA DB resource, i.e. we ask
>> pacemaker to promote the ovsdb-servers running on the node configured with
>> the virtual IP (i.e. master_ip). Pacemaker will call the promote action [1]
>> on the node where the master IP is configured.
>>
>> - TripleO configures "ovn_nb_connection=tcp:VIP:6641" and
>> "ovn_sb_connection=tcp:VIP:6642" in neutron.conf and runs "ovs-vsctl set
>> open . external_ids:ovn-remote=tcp:VIP:6642" on all the nodes where the
>> ovn-controller service is started.
>>
>> - Suppose the master IP node goes down for some reason. Pacemaker detects
>> this, moves the virtual IP (IPaddr2 resource) to another node, and promotes
>> the ovsdb-servers running on that node to master. This way, the
>> neutron-servers and ovn-controllers can still talk to the same IP without
>> even noticing that another node has become master.
>>
>>
>>
>> Since tripleo was using the IPaddr2 model, we thought this would be the
>> better way to have a master/slave HA for ovsdb-servers.
>>
>>> However, this may not work in all scenarios, since the virtual IP works
>>> only if it can be routed to all nodes, e.g. when all nodes are on the same
>>> subnet.
>>>
>>
>> You mean you want to create a pacemaker cluster with nodes belonging to
>> different subnets? I had a chat with the pacemaker folks and this is
>> possible. You can also create an IPaddr2 resource. Pacemaker doesn't impose
>> any restrictions. But you need to solve the reachability of that IP from
>> all the networks/nodes.
>>
>
> Yes, and the reachability problem is why we can't use IPaddr2.
> (Not in the same L2, no BGP, etc.)
>
>
>>> In those cases the IPaddr2 virtual IP won't work. For the clients to
>>> access the DB, we can use a load-balancer VIP instead. But the problem is
>>> still how to set the master_ip and how to make the standby connect to
>>> the new active after failover.
>>>
>>
>> I am a bit confused here. Your setup will still have the pacemaker
>> cluster, right? Are you talking about an active/passive OVN DB server
>> setup on a non-pacemaker cluster? If so, I don't think the OVN OCF
>> script can be used and you have to solve it differently. Correct me if I am
>> wrong here.
>>
>>
>> You mentioned above "However, since active node will change after
>> failover, I wonder if we should provide all the IPs of each nodes, and let
>> pacemaker to decide which IP is the master IP to be used, dynamically".
>>
>> We can definitely add this support. Whenever pacemaker promotes a node,
>> the other nodes come to know about it, and the OVN OCF script can configure
>> the ovsdb-servers on the slave nodes to connect to the new master. But how
>> will you configure the neutron-server and ovn-controllers to talk to the
>> new master?
>> Are you planning to use a load balancer IP for this purpose? What if the
>> load balancer IP resolves to a standby server?
>>
>
> We still have pacemaker to manage the cluster HA, but we just don't use
> IPaddr2 for the VIP. To solve the VIP problem, we use a physical/software
> load balancer. The VIP is on the LB rather than bound to an interface of the
> OVN central node. There is no problem for clients, but there is a small
> problem with the OCF script: it relies on the master IP to start the active
> OVSDB, and since the master IP (now the LB VIP) is not attached to a node
> interface, this will fail. Now that you have explained the usage of the
> master IP, I think a small change can solve this problem: don't use the
> master IP when starting the active OVSDB service, i.e. listen on 0.0.0.0.
>

This would require changes in the OVN OCF script, right? It should probably
be enhanced such that the existing approach isn't broken (maybe with a new
OCF param).
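
A minimal sketch of what such a param could mean, written in Python purely
for illustration (the real agent is a shell script). The `listen_on_all` flag
and both function names are hypothetical and are not existing OCF params; the
only things taken as given are the ovsdb connection forms `ptcp:PORT:IP`
(passive listener) and `tcp:IP:PORT` (active connection).

```python
def active_listen_remote(port, master_ip, listen_on_all=False):
    """Return the passive remote the active ovsdb-server would listen on."""
    # Default (existing approach): bind to master_ip, which therefore has to
    # be configured on a local interface, e.g. an IPaddr2 virtual IP.
    # Hypothetical new behaviour: bind to 0.0.0.0 so that master_ip can be a
    # load-balancer VIP that is not attached to the node.
    host = "0.0.0.0" if listen_on_all else master_ip
    return f"ptcp:{port}:{host}"


def standby_sync_remote(port, master_ip):
    """Return the remote a standby would sync from (unchanged either way)."""
    return f"tcp:{master_ip}:{port}"
```

With the flag left at its default the active server keeps binding to
master_ip, so existing IPaddr2-based deployments would behave as before.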



> For standby OVSDBs, they will continue using the master IP to sync from the
> active. The standby should not listen on any port (or just on a different
> port from the active if they have to), so that the LB health check can
> figure out the active member and point the VIP/master IP correctly to the
> active one.
>
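
To make sure I understand the health-check idea: it could be as simple as a
TCP connect probe against the DB port of each member. A rough sketch in
Python, assuming only the active member accepts connections on that port;
the function names are made up for illustration and a real LB would use its
own built-in check.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def find_active(members, timeout=1.0):
    """Return the (host, port) members that accept TCP connections.

    With standbys not listening on the DB port, this list should contain
    exactly one entry: the active member the VIP should point to.
    """
    return [m for m in members if is_listening(m[0], m[1], timeout)]
```

If the list ever has more than one entry, two members are listening on the
active port and the LB cannot tell which one is the real master.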
> In addition, how do you configure northd for the NB/SB DBs? I think both
> master-ip/VIP and unix socket should work, but they are different. If using
> master-ip/VIP, northd can be active on any one of the nodes, not necessarily
> co-located with the NB/SB DBs, and the ovsdb named lock ensures only one is
> active. However, it seems we can also use the unix socket to always connect
> to the local NB/SB. Since NB/SB are managed as a single pacemaker resource,
> they fail over together, so we can consider ovn-northd part of the bundle
> (but not managed by pacemaker). This way, although all northds are running,
> only the one on the active NB/SB node matters, and the ovsdb named lock is
> irrelevant here. Any thoughts/experience on this?
>

In the case of OpenStack we let pacemaker manage it via the OCF script, so
we set manage_northd=yes. ovn-northd will be started only on the master node.
The other option I can think of is starting ovn-northd on the desired nodes
(like how you would start ovn-controllers) and pointing ovn-northd to
tcp:LB_IP:6641/6642 (just like how neutron-server and ovn-controllers are
configured). The ovsdb named lock would make sure that only one is active.
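
To spell out the "6641/6642" shorthand, here is a small sketch in Python that
builds the ovn-northd command line for that second option. The helper name
is hypothetical and 192.0.2.100 is only a placeholder for the LB IP; the NB
DB is assumed to be on port 6641 and the SB DB on 6642, as above.

```python
def northd_args(lb_ip, nb_port=6641, sb_port=6642):
    """Build an ovn-northd command line pointing at the LB IP."""
    return [
        "ovn-northd",
        f"--ovnnb-db=tcp:{lb_ip}:{nb_port}",  # northbound DB via the LB
        f"--ovnsb-db=tcp:{lb_ip}:{sb_port}",  # southbound DB via the LB
    ]


# Placeholder LB IP, for illustration only.
print(" ".join(northd_args("192.0.2.100")))
```

The same command line would be used on every node where ovn-northd runs,
since all instances go through the LB and rely on the named lock.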



>
> Alternatively, we can also use separate pacemaker resources for NB and SB,
> so that each component (NB/SB/northd) can be active/standby independently of
> the others on different nodes, but I am not sure whether that brings more
> benefit or more churn.
>
>>
>> Hope this helps.
>>
>> If you have a requirement to support this scenario (i.e. without the
>> master_ip param), it can be done. But care should be taken when
>> implementing it.
>>
> So far it seems we can still use master_ip, but with a small change as
> mentioned above.
>
>>
>> [1] - https://github.com/openvswitch/ovs/blob/master/ovn/utilities/ovndb-servers.ocf#L505
>>        http://www.linux-ha.org/doc/dev-guides/_resource_agent_actions.html
>>
>>
>>
>>> I may have missed something here. Could you help explain how this is
>>> expected to work?
>>>
>>
>>
>>
>>
>>>
>>> Thanks,
>>> Han
>>>
>>
>>
>>
>>
>>
>
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
