On Thu, Aug 4, 2016 at 11:14 PM, Russell Bryant <russ...@ovn.org> wrote:

>
>
> On Thu, Aug 4, 2016 at 8:17 PM, Andy Zhou <az...@ovn.org> wrote:
>
>>
>> On Wed, Jul 27, 2016 at 1:04 PM, Andy Zhou <az...@ovn.org> wrote:
>>
>>>
>>>
>>> On Tue, Jul 26, 2016 at 6:20 PM, Russell Bryant <russ...@ovn.org> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Jul 26, 2016 at 3:48 PM, Andy Zhou <az...@ovn.org> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Jul 26, 2016 at 11:59 AM, Russell Bryant <russ...@ovn.org>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou <az...@ovn.org> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant <russ...@ovn.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou <az...@ovn.org> wrote:
>>>>>>>>
>>>>>>>>> Hi Ryan and Russell,
>>>>>>>>>
>>>>>>>>
>>>>>>>> Can we move this discussion to the ovs dev mailing list?  Feel free
>>>>>>>> to just add it in a reply if you'd like.
>>>>>>>>
>>>>>>> Done.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> I am wondering how we can actually use the active/backup feature
>>>>>>>>> that is now part of
>>>>>>>>> OVSDB to increase OVN availability.
>>>>>>>>>
>>>>>>>>
>>>>>>>> To be clear, I haven't actually tried this yet.  I'm only speaking
>>>>>>>> about how I think it should work.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Specifically:
>>>>>>>>>
>>>>>>>>> 1. When the active OVSDB server fails, should the backup server
>>>>>>>>> take over and allow write transactions? A simpler possibility is to
>>>>>>>>> allow read-only access to the backup server.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The backup server needs to take over.  It's OK if that requires
>>>>>>>> intervention by an HA manager like Pacemaker.  If we can't make the
>>>>>>>> passive server take over, I'd say the solution is incomplete.
>>>>>>>>
>>>>>>>
>>>>>>> OK, makes sense.
>>>>>>>
>>>>>>> One possible issue with the backup server taking over is "split brain".
>>>>>>> If a network error disconnects the backup server from the active
>>>>>>> server, both servers may think they are the active server.
>>>>>>> Does Pacemaker help solve this issue?
>>>>>>>
>>>>>>
>>>>>> It can, yes.  I would expect Pacemaker to explicitly configure a node
>>>>>> to be either the active or passive node.
>>>>>>
>>>>> Manual switching is more straightforward. I agree.
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>> 2. When a crashed active OVSDB server recovers, should it become
>>>>>>>>> the new backup, or should it switch back?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Becoming the new backup is fine.  Again, this can be orchestrated
>>>>>>>> by an HA manager (Pacemaker).
>>>>>>>>
>>>>>>> I am not familiar with Pacemaker. Can I assume it can provide a
>>>>>>> correct --sync-from argument (pointing to the backup server) when it
>>>>>>> relaunches the OVSDB server?
>>>>>>>
>>>>>>
>>>>>> Yes.  I'd have to consult with some Pacemaker experts on exactly what
>>>>>> the implementation would look like, but roughly:
>>>>>>
>>>>>> Pacemaker manages services using "OCF Resource Agents", which are
>>>>>> just scripts with a defined set of inputs and outputs for service
>>>>>> management.  I would imagine a Pacemaker cluster being told it must have
>>>>>> exactly 1 active and 1 passive OVSDB service.  When the passive OVSDB
>>>>>> service is started, it would include the "sync-from" argument based on
>>>>>> where the active OVSDB service is currently running.
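The argument selection described above could be sketched roughly as follows. This is only an illustration in Python, not an OCF resource agent and not how Pacemaker is implemented: the `Node` type, the `ovsdb_extra_args` helper, the node names, and the example addresses are all hypothetical. The only thing taken from this thread is that the passive server is started with a --sync-from argument pointing at the active one.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Node:
    name: str     # cluster node name (hypothetical)
    address: str  # address the other server syncs from (assumed format)


def ovsdb_extra_args(nodes: List[Node], active_name: str) -> Dict[str, List[str]]:
    """Return the extra ovsdb-server arguments for each node.

    The active node gets none. Every other node is passive and is told
    to replicate from wherever the active service currently runs.
    """
    by_name = {node.name: node for node in nodes}
    if active_name not in by_name:
        raise ValueError(f"unknown active node: {active_name}")
    active = by_name[active_name]
    return {
        node.name: [] if node.name == active_name
        else [f"--sync-from={active.address}"]
        for node in nodes
    }


# Example values only; the addresses and port are made up for illustration.
nodes = [
    Node("node-a", "tcp:192.0.2.1:6641"),
    Node("node-b", "tcp:192.0.2.2:6641"),
]

# Normal operation: node-a is active and node-b replicates from it.
print(ovsdb_extra_args(nodes, "node-a"))

# After node-a crashes and node-b is promoted, a restarted node-a comes
# back as the new passive server and syncs from node-b.
print(ovsdb_extra_args(nodes, "node-b"))
```

The second call covers the recovery case from question 2: the recovered server does not switch back, it simply becomes the passive side of the pair.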
>>>>>>
>>>>>> We really need to prototype this and document it.  I'm guessing too
>>>>>> much.  Pacemaker is frequently used to manage active/passive HA, though.
>>>>>>
>>>>> Sounds reasonable.  I will work on ovsdb internal changes to support
>>>>> manual switching using appctl commands, then look into prototyping
>>>>> with HA systems.  I have not used Pacemaker in the past, so it may take
>>>>> some time to ramp up.
>>>>>
>>>>
>>>> I should be able to help.  We need to do this work anyway for
>>>> integration into OpenStack deployment tools.  Let me see if I can get some
>>>> helpful examples to follow.
>>>>
>>>
>>> Thanks for helping out.
>>>
>>> Given that, I now plan to work from the bottom up, initially focusing on
>>> ovsdb-server changes.
>>>
>>> 1. Add a state in ovsdb-server so that it knows whether it is the active
>>> server.  A backup server will not accept any connections.  A server started
>>> with the --sync-from argument will be put in the backup state by default.
>>>
>>> 2. Add appctl commands to allow switching the state manually.
>>>
>>> 3. Add a new table for the backup server to register its address and ports.
>>> OVSDB clients can learn about them at run time. The backup server should
>>> issue a transaction to register its address before issuing the monitoring
>>> request.  This feature is not strictly necessary and can be pushed to the
>>> HA manager, but having it built into ovsdb-server may make integration
>>> simpler.
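Items 1 and 2 of the plan above amount to a small state machine, which could be modeled along these lines. This is a Python sketch for discussion only; ovsdb-server itself is written in C, and the `Role` and `ServerState` names and their methods are hypothetical, not existing ovsdb-server or appctl interfaces. Item 3 (the registration table) is left out.

```python
from enum import Enum
from typing import Optional


class Role(Enum):
    ACTIVE = "active"
    BACKUP = "backup"


class ServerState:
    """Toy model of the proposed active/backup state (hypothetical API)."""

    def __init__(self, sync_from: Optional[str] = None) -> None:
        # Item 1: a server started with a sync-from source defaults to backup.
        self.sync_from = sync_from
        self.role = Role.BACKUP if sync_from else Role.ACTIVE

    def accepts_connections(self) -> bool:
        # Item 1: a backup server does not accept any connections.
        return self.role is Role.ACTIVE

    def set_active(self) -> None:
        # Item 2: manual switch to active; stop following the old active.
        self.role = Role.ACTIVE
        self.sync_from = None

    def set_backup(self, sync_from: str) -> None:
        # Item 2: manual switch to backup; a source to sync from is required.
        if not sync_from:
            raise ValueError("a backup server needs a server to sync from")
        self.sync_from = sync_from
        self.role = Role.BACKUP


# Example address only; made up for illustration.
backup = ServerState(sync_from="tcp:192.0.2.1:6641")
print(backup.role, backup.accepts_connections())

# An operator or HA manager promotes the backup after the active server fails.
backup.set_active()
print(backup.role, backup.accepts_connections())
```

In this model the switch is always explicit, so avoiding two active servers at once is left to whoever issues the commands, such as the HA manager.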
>>>
>>> What do you think?
>>>
>>>
>>>
>> Russell, would the HA manager also manage ovn-controller switchover?
>>
>
> Yes, indirectly.  The way this is typically handled is by using a virtual
> IP that moves to whatever host is currently the master.
>
Cool, then ovn-controller does not have to be HA aware.

>
>

>
> --
> Russell Bryant
>
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
