Re: [akka-user] quorum-based split brain resolution

Eric Pederson Thu, 08 May 2014 08:14:37 -0700

+1 for adding it to the docs. It's essential info.

On Wednesday, May 7, 2014 4:13:25 AM UTC-4, drewhk wrote:
>
> That is a really good summary, Roland. This should go into docs or at 
> least a blog post?
>
> -Endre
>
>
> On Tue, May 6, 2014 at 8:28 PM, Akka Team <akka.o...@gmail.com> wrote:
>
>> Hi Shikhar,
>>
>> thanks for sharing!
>>
>> There are many possible ways of dealing with partitions in a distributed
>> system, and the best solution depends very much on the use-case. One
>> fundamental choice you need to make is whether you can live with split
>> brain scenarios or whether you can tolerate unavailability; you cannot
>> exclude both, no matter what you do. To demonstrate this, consider a rule
>> that a still-connected subset of the cluster must have more than N members
>> in order to continue, which means that it will shut itself down if it has
>> fewer than this quorum. Then you can configure your cluster with M nodes
>> such that N>0.5•M and you can be sure that split-brain scenarios are
>> excluded, because two disjoint subsets cannot both contain more than half
>> of the nodes. The price is that a three-way split can kill all three
>> parts, shutting the whole cluster down. The same reasoning applies to
>> referee schemes, i.e. a subset continues only if it contains a designated
>> node, in which case the death of this node will kill the whole cluster;
>> there are countless variations on this scheme.
>>
>> OTOH you could have a market place whose foremost requirement is 
>> availability; in this case you will have to tolerate the (temporary) split 
>> into multiple disconnected market places in order to “guarantee” 
>> availability (well, there is no such thing, really).
>>
>> So, when it comes to Akka Cluster in particular, you have a few choices 
>> (and endless refinements):
>>
>>    - implement an auto-downing scheme (using voting or quorum or whatever
>>    you like) that prevents split brain
>>    - implement an auto-downing scheme that limits split brain but aims
>>    at availability (e.g. downing only while at most N nodes are
>>    unreachable, with N=1 to allow individual failures)
>>    - implement aggressive auto-downing for high availability while
>>    tolerating split brain (needs human oversight to guard against edge
>>    cases)
>>    - do not implement auto-downing and have a 24/7 ops team run the 
>>    system manually (the cost—while significant—may well be justified in some 
>>    cases)
>>    - do not use downing (apart from manually removing failed nodes) and 
>>    rely on CRDTs (or equivalent) for synchronizing state between nodes, so 
>>    that after the partition everything heals back together 
>>
>> I probably forgot some possibilities, but this can serve as a starting 
>> point. And it should explain why Akka does not ship with “sophisticated” 
>> schemes at this point: the use-cases are too diverse. So, for now we cover 
>> only the most primitive ways of handling partitions, and we will eventually 
>> add algorithms which have proven worth their salt (or SLOC) in 
>> production—hopefully with the help of our wonderfully smart community ;-)
>>
>> Regards,
>>
>> Roland
>>
>>
>>
>> On Mon, May 5, 2014 at 8:07 AM, shikhar <shi...@schmizz.net> wrote:
>>
>>> I have been hacking on a discovery plugin for elasticsearch
>>> <https://github.com/shikhar/eskka> using Akka Cluster, and I wanted to
>>> add some automated downing. The auto-down-unreachable-after setting is
>>> not really an option, since it can lead to split brain.
>>>
>>> So I went with the approach of using a quorum of members to determine 
>>> whether the unreachable node should be downed. I'm curious to hear what you 
>>> think of this.
>>>
>>> see 
>>> https://github.com/shikhar/eskka/blob/master/src/main/scala/eskka/QuorumBasedPartitionMonitor.scala
>>>  
>>>
>>> 1. The VotingMembers
>>> <https://github.com/shikhar/eskka/blob/release-0.1/src/main/scala/eskka/VotingMembers.scala>
>>> passed in the constructor are the seed nodes. Using seed nodes was just an
>>> easy choice since they are specified beforehand. So ideally there should
>>> be 3 or more seed nodes.
>>>
>>> 2. I am using an app-level ping layer
>>> <https://github.com/shikhar/eskka/blob/master/src/main/scala/eskka/Pinger.scala>
>>> on top of the UNREACHABLE events. When ping requests to an unreachable
>>> node, made via the seed nodes, "affirmatively time out" (i.e. the seed
>>> nodes must explicitly return a timeout response rather than the ping
>>> request itself timing out, so that we don't count an unreachable seed
>>> node as a voter!), we DOWN that unreachable node. Instead of these
>>> app-level pings, maybe it makes sense to use the Akka private[cluster]
>>> metadata like Reachability.isReachable(observer, node), but I'm not
>>> entirely sure of the semantics.
>>>
>>> 3. Currently this QuorumBasedPartitionMonitor actor gets started on 
>>> every seed node. So in case a member becomes unreachable, they'd all end up 
>>> trying to arrange for a distributed ping to the unreachable node via one 
>>> another, and possibly downing it. This seems a bit like a thundering herd 
>>> so not ideal. But on the other hand I don't want to use a cluster-singleton 
>>> because this partition resolver is trying to be the layer that allows for 
>>> singleton failover to happen smoothly. I'd love to hear ideas on how to 
>>> handle this better.
>>>
>>> 4. Maybe a generic solution for quorum-based partition resolution should 
>>> be a part of Akka proper/contrib? It seems AutoDown is rarely a good answer.
>>>
>>> -- 
>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>> >>>>>>>>>> Check the FAQ: 
>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>> >>>>>>>>>> Search the archives: 
>>> https://groups.google.com/group/akka-user
>>> --- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Akka User List" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to akka-user+...@googlegroups.com.
>>> To post to this group, send email to akka...@googlegroups.com.
>>> Visit this group at http://groups.google.com/group/akka-user.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> -- 
>> Akka Team
>> Typesafe - The software stack for applications that scale
>> Blog: letitcrash.com
>> Twitter: @akkateam
>>  
>>
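The quorum rule in Roland's message above can be sketched in a few lines of Scala. This is only an illustration under stated assumptions, not anything Akka ships: the object QuorumRule and its methods are hypothetical names, and the threshold used here is the smallest strict majority (M / 2 + 1), which is one way of satisfying the "more than half of M" condition from the message.

```scala
// Hypothetical helper, not part of Akka: decides which sides of a
// partition may keep running under a strict-majority quorum rule.
object QuorumRule {
  /** Smallest member count that is a strict majority of `clusterSize`. */
  def majority(clusterSize: Int): Int = clusterSize / 2 + 1

  /** True if a connected subset with `reachable` members may continue. */
  def mayContinue(reachable: Int, clusterSize: Int): Boolean =
    reachable >= majority(clusterSize)

  /** Given the sizes of the disconnected parts, return the sizes of the
    * parts that survive. At most one part can ever be returned. */
  def survivors(partSizes: Seq[Int]): Seq[Int] = {
    val total = partSizes.sum
    partSizes.filter(mayContinue(_, total))
  }
}
```

With five nodes, a 3/2 split leaves the three-node side running, while a 2/2/1 split leaves no survivors at all, which is the unavailability price mentioned in the message.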
>
>
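The "affirmative timeout" idea from point 2 of shikhar's message can be sketched as a vote tally. This is a sketch under assumptions and is not taken from the eskka sources: the names PingVotes, Vote and shouldDown are hypothetical, and the thread does not state the exact threshold eskka uses, so a strict majority of the voting members is assumed here.

```scala
// Hypothetical sketch, not the eskka implementation: tally the ping
// results reported by the voting members (the seed nodes).
object PingVotes {
  sealed trait Vote
  case object Reachable  extends Vote // voter pinged the node successfully
  case object TimedOut   extends Vote // voter explicitly reported a ping timeout
  case object NoResponse extends Vote // voter itself did not answer in time

  /** Down the node only if a strict majority of all voting members
    * explicitly reported a timeout. A voter that never answered counts
    * toward `votingMembers` but contributes no affirmative vote
    * (assumed threshold, see lead-in). */
  def shouldDown(votes: Seq[Vote], votingMembers: Int): Boolean = {
    val affirmativeTimeouts = votes.count(_ == TimedOut)
    affirmativeTimeouts >= votingMembers / 2 + 1
  }
}
```

With three seed nodes, two explicit timeouts are enough to down the node, whereas one explicit timeout plus two silent voters is not, because silence is never treated as a vote.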


