Hi

One comment from Quyen about the readme. The revised paragraph is below; please see [Gary].

On 23/01/18 19:06, Gary Lee wrote:
---
  00-README.conf       | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++
  Makefile.am          |  4 +++-
  src/osaf/Makefile.am |  8 ++++++--
  3 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/00-README.conf b/00-README.conf
index a8848e632..6c3cff1dd 100644
--- a/00-README.conf
+++ b/00-README.conf
@@ -662,3 +662,59 @@ on each node, except on the active node. This file indicates that a cluster
  reboot is in progress and all nodes needs to delay their start, this to give
  the active a lead.
+Split-Brain Prevention with Consensus Service
+=============================================
+
+OpenSAF implements split-brain prevention by utilizing a consensus service that
+implements a replicated state machine. The consensus service uses quorum to
+prevent state changes in network partitions that don't include more than half
+of the nodes in the cluster. In network partitions containing
+half of the nodes or less, the state is either read-only or unavailable.
+Thus, it is important to keep in mind that the consensus service by itself
+does not prevent the presence of multiple active system
+controller nodes. In the case when the network has been split up into partitions
+and the current active system controller no longer has write access to the
+state machine, OpenSAF relies on some additional mechanism like fencing to
+ensure that the current active system controller disappears before a new
+active system controller can be chosen among the nodes that do have write
+access to the replicated state machine. If fencing is not available, the old
+active system controller can detect that it has lost write
+access and step down from its active role.

[Gary] The paragraph above should be changed as we don't currently check write access or partition sizes when fencing.

In the case where the network has been split up into partitions,
OpenSAF relies on some additional mechanism like fencing to
ensure that only one active controller exists among the network partitions.
If fencing is not available, the old active system controller can detect that it has
lost write access and step down from its active role.

+
+The consensus service can be implemented, for example, using the RAFT algorithm.
+When using RAFT, there are mainly three possibilities:
+
+1. The RAFT servers run on the same nodes as OpenSAF
+2. The RAFT servers run on a subset of the OpenSAF nodes
+3. The RAFT servers run on an external set of nodes, outside of the
+   OpenSAF cluster
+
+The consensus service relies on a plugin to communicate with a distributed
+key-value store database. This plugin must still function according to the
+API when the network has split up into partitions.
+The plugin interface is defined in src/osaf/consensus/plugins/sample.plugin
+
+An implementation for etcdv2 is provided. It assumes etcd is installed
+and configured on all system controllers. In clusters where
+there are only two system controllers, it is highly recommended to
+configure etcd so it runs on at least three nodes to facilitate
+a majority vote with failure tolerance.
+
+Other implementations of a distributed key-value store service
+can be used, provided that it implements the interface documented in sample.plugin
+
+To enable split-brain prevention, edit fmd.conf and update accordingly:
+
+export FMS_SPLIT_BRAIN_PREVENTION=1
+export FMS_KEYVALUE_STORE_PLUGIN_CMD=/usr/local/lib/opensaf/etcd.plugin
+
+As discussed, the key-value store does not need to reside on the same nodes
+as OpenSAF. In such a configuration, an appropriate plugin that handles
+the communication with a remotely located key-value store must be provided.
+
+If remote fencing is enabled, then it will be used to fence a node that the
+consensus service believes should not be active. Otherwise, rded/amfd will
+initiate a 'self-fencing' by rebooting the node, if it determines the node
+should no longer be active according to the consensus service, to prevent
+a split-brain situation.
+
diff --git a/Makefile.am b/Makefile.am
index bcfd844cd..57c2585a8 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -159,7 +159,9 @@ dist_osaf_execbin_SCRIPTS += \
        $(top_srcdir)/scripts/opensaf_reboot \
        $(top_srcdir)/scripts/opensaf_sc_active \
        $(top_srcdir)/scripts/opensaf_scale_out \
-       $(top_srcdir)/scripts/plm_scale_out
+       $(top_srcdir)/scripts/plm_scale_out \
+       $(top_srcdir)/src/osaf/consensus/plugins/etcd.plugin
+# TODO remove above line before pushing
include $(top_srcdir)/src/ais/Makefile.am
  include $(top_srcdir)/src/base/Makefile.am
diff --git a/src/osaf/Makefile.am b/src/osaf/Makefile.am
index 05b78c988..10bbe427b 100644
--- a/src/osaf/Makefile.am
+++ b/src/osaf/Makefile.am
@@ -16,7 +16,9 @@
noinst_HEADERS += \
        src/osaf/immutil/immutil.h \
-       src/osaf/saflog/saflog.h
+       src/osaf/saflog/saflog.h \
+       src/osaf/consensus/keyvalue.h \
+       src/osaf/consensus/service.h
pkglib_LTLIBRARIES += lib/libosaf_common.la
@@ -33,7 +35,9 @@ lib_libosaf_common_la_LDFLAGS = \
lib_libosaf_common_la_SOURCES = \
        src/osaf/immutil/immutil.c \
-       src/osaf/saflog/saflog.c
+       src/osaf/saflog/saflog.c \
+       src/osaf/consensus/keyvalue.cc \
+       src/osaf/consensus/service.cc
nodist_EXTRA_lib_libosaf_common_la_SOURCES = dummy.cc
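
As a side note on the README text's recommendation to run etcd on at least three nodes: the majority arithmetic can be sketched as below. This is only an illustration of the quorum rule described in the README text, not code from the patch; the function names quorum_size, failure_tolerance and has_quorum are made up for this example.

```shell
#!/bin/sh
# Illustrative only: these helpers are hypothetical and not part of OpenSAF.

# Smallest number of members that is more than half of a cluster of $1 members.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can be lost while a majority is still reachable.
failure_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

# Succeeds (exit status 0) if a partition with $1 members, out of a cluster
# of $2 members, contains more than half of the cluster.
has_quorum() {
    [ "$1" -ge "$(quorum_size "$2")" ]
}

for n in 2 3 5; do
    echo "members=$n quorum=$(quorum_size "$n") tolerated_failures=$(failure_tolerance "$n")"
done
```

With two members the quorum is two, so losing either member makes the store read-only or unavailable; with three members one failure can be tolerated, which is the reason for the three-node recommendation.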

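The last paragraph of the README text (remote fencing if enabled, otherwise self-fencing by reboot) could be summarised roughly as follows. This is a sketch of the decision as the text describes it, not the actual rded/amfd logic; fencing_action and its two arguments are hypothetical names used only here.

```shell
#!/bin/sh
# Illustrative only: fencing_action is a hypothetical helper, not OpenSAF code.
#   $1 = 1 if the consensus service says this node should be active, else 0
#   $2 = 1 if remote fencing is enabled, else 0
fencing_action() {
    if [ "$1" -eq 1 ]; then
        # The node is still the legitimate active: nothing to do.
        echo "none"
    elif [ "$2" -eq 1 ]; then
        # The node should not be active and remote fencing is available.
        echo "remote-fence"
    else
        # No remote fencing: the node reboots itself ('self-fencing').
        echo "self-fence-reboot"
    fi
}

fencing_action 0 1
fencing_action 0 0
```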

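For anyone trying the patch out, a quick sanity check of the two fmd.conf settings might look like the sketch below. It only uses the two variable names from the README text; the helper name check_split_brain_config is made up for this example and nothing like it exists in the patch.

```shell
#!/bin/sh
# Illustrative only: check_split_brain_config is a hypothetical helper.
# It expects the two variables from fmd.conf to be set in the environment.
check_split_brain_config() {
    if [ "${FMS_SPLIT_BRAIN_PREVENTION:-0}" != "1" ]; then
        echo "split-brain prevention is disabled"
        return 0
    fi
    if [ -z "${FMS_KEYVALUE_STORE_PLUGIN_CMD:-}" ]; then
        echo "FMS_KEYVALUE_STORE_PLUGIN_CMD is not set" >&2
        return 1
    fi
    if [ ! -x "$FMS_KEYVALUE_STORE_PLUGIN_CMD" ]; then
        echo "plugin is not executable: $FMS_KEYVALUE_STORE_PLUGIN_CMD" >&2
        return 1
    fi
    echo "split-brain prevention enabled, plugin: $FMS_KEYVALUE_STORE_PLUGIN_CMD"
}
```

It would be used by sourcing fmd.conf first (for example ". /etc/opensaf/fmd.conf", where the path is an assumption and depends on the installation prefix) and then calling check_split_brain_config.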
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
