>>> On 9/9/2008 at 4:27 AM, in message <[EMAIL PROTECTED]>, "Arne Eriksson R"
<[EMAIL PROTECTED]> wrote:
> Hi,
> We have a cluster with 6 processors using openais stable version 0.80.3.
>
> For some reason our cluster splits up into two rings.
> Scenario is:
> node1(n1) n2 n3 n4 n5 n6 are in the ring.
>
> Suddenly the ring splits into two rings:
> n1 n2 n3 got leave msg from n4 n5 n6
> n4 n5 n6 got leave msg from n1 n2 n3
>
> After a few milliseconds the two rings join again:
> n1 n2 n3 got join msg from n4 n5 n6
> n4 n5 n6 got join msg from n1 n2 n3
>
> The two rings are joined into one ring again:
> node1(n1) n2 n3 n4 n5 n6 are in the ring.
>
> The question is whether this is a normal scenario from EVS in the openais
> implementation?
>
> The problem is that the application needs to detect the difference
> between two kinds of joins: the "normal" join, where the two rings/nodes
> join for the first time, and the "abnormal" join, where a ring has split
> and re-joined (without any nodes being restarted). The first case
> typically requires only a sync of some nodes (bringing the history up to
> date). The second case requires a merger, i.e. selection of a losing
> side and the loser discarding its history.
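
One way an application could tell the two kinds of join apart, as a rough sketch only: have every node pick an "incarnation" value when its process starts and announce it in an application-level message after each join. The incarnation value, the JoinClassifier class and its method names are all hypothetical and not part of the openais API; the sketch assumes the application exchanges the incarnation itself on top of the join/leave notifications.

```python
from dataclasses import dataclass, field


@dataclass
class JoinClassifier:
    """Remembers the last incarnation seen for each peer (hypothetical helper)."""

    # node id -> incarnation announced by that node the last time it joined
    known: dict = field(default_factory=dict)

    def on_join(self, node: str, incarnation: int) -> str:
        """Classify a join using the incarnation the joining node announces."""
        previous = self.known.get(node)
        self.known[node] = incarnation
        if previous is None:
            # Never seen before: ordinary join, a history sync is enough.
            return "first_join"
        if previous == incarnation:
            # Same process as before the leave: the ring split and re-merged.
            return "rejoin_after_split"
        # Seen before, but the process restarted in between: sync again.
        return "restart_join"

    def on_leave(self, node: str) -> None:
        """Keep the entry on purpose; it is what makes a later re-join detectable."""
        # Nothing to do: forgetting the node here would hide a split/re-merge.
        return None
```

With this, "first_join" and "restart_join" would both map to the simple sync case, and only "rejoin_after_split" would trigger the merger with a losing side.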

Sidebar: assuming a shared disk is available somewhere, it can be used as a
second, independent communication channel for detecting split-brain
conditions:

http://wiki.linux-ha.org/SBD_Fencing

The idea is for the partitions to share membership information, so that each
side can detect that another partition exists.

Just a thought - hopefully nothing bad happened while the partitions were
split - in the second case ;-)

Hth,
Robert
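
To make the shared-disk idea a bit more concrete, here is a minimal sketch. It is not the SBD on-disk format or its tooling; it only illustrates partitions sharing membership information through storage. It assumes a directory that every node can read and write, one JSON file per node, and hypothetical function names (publish_view, detect_partition). The caller passes the current time in explicitly.

```python
import json
from pathlib import Path


def publish_view(shared_dir: Path, node: str, members: set, now: float) -> None:
    """Write this node's current membership view into its own slot file."""
    slot = shared_dir / f"{node}.json"
    slot.write_text(json.dumps({"members": sorted(members), "stamp": now}))


def detect_partition(shared_dir: Path, node: str, members: set,
                     now: float, max_age: float = 5.0) -> set:
    """Return peers that recently wrote a slot but are not in our membership."""
    missing = set()
    for slot in shared_dir.glob("*.json"):
        peer = slot.stem
        if peer == node:
            continue
        record = json.loads(slot.read_text())
        # A recent timestamp means the peer is alive even if the ring lost it.
        fresh = (now - record["stamp"]) <= max_age
        if fresh and peer not in members:
            missing.add(peer)
    return missing
```

A non-empty result means some node is still writing to the disk while being absent from the local ring, which is the split-brain signal. A real setup would also need atomic slot updates and reasonably synchronized clocks, both of which are left out here.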