I think this is a great idea, Franz. The HA and replication components have been a source of issues over the years. I see two problems: 1) there isn't a clean separation between the consensus mechanism and replication, and 2) the consensus algorithm used in Artemis isn't based on any standard algorithm or research paper, which is why so many edge-case issues have surfaced over the years. Integration with ZooKeeper seems like the obvious solution (i.e. push the consensus logic off to a third-party lib). I suspect, though, that this will be a considerable amount of work and is likely to introduce new issues, so I'd proceed with caution.
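To make the separation point a bit more concrete, here is a rough sketch of the kind of boundary I have in mind. Every name in it (QuorumSketch, ElectionService, InMemoryElection, hasMajority) is made up for illustration and is not an existing Artemis, ZooKeeper or Raft API. The in-memory class only stands in for whatever third-party lib would really sit behind the interface.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical names throughout; none of these are existing Artemis APIs.
public class QuorumSketch {

    /** Consensus side only: decides which node is live. Replication would depend on this and nothing else. */
    public interface ElectionService {
        boolean tryBecomeLive(String nodeId);
        void resign(String nodeId);
        Optional<String> currentLive();
    }

    /** Single-JVM stand-in for a real coordination service (assumption: for illustration only). */
    public static final class InMemoryElection implements ElectionService {
        private final AtomicReference<String> live = new AtomicReference<>();

        @Override
        public boolean tryBecomeLive(String nodeId) {
            // Atomically claim the live role if it is free; the current holder keeps it.
            String holder = live.updateAndGet(cur -> cur == null ? nodeId : cur);
            return nodeId.equals(holder);
        }

        @Override
        public void resign(String nodeId) {
            // Only the current holder can give the role up.
            live.updateAndGet(cur -> nodeId.equals(cur) ? null : cur);
        }

        @Override
        public Optional<String> currentLive() {
            return Optional.ofNullable(live.get());
        }
    }

    /** Strict majority: with 2n+1 voters (brokers or witnesses), n+1 votes are needed. */
    public static boolean hasMajority(int votes, int voters) {
        return votes > voters / 2;
    }

    public static void main(String[] args) {
        ElectionService election = new InMemoryElection();
        System.out.println(election.tryBecomeLive("broker-A")); // true
        System.out.println(election.tryBecomeLive("broker-B")); // false
        election.resign("broker-A");
        System.out.println(election.tryBecomeLive("broker-B")); // true
        System.out.println(hasMajority(2, 3)); // true
        System.out.println(hasMajority(1, 2)); // false
    }
}
```

The idea is that the replication code would only ever talk to something like ElectionService, so swapping ZooKeeper for a Raft lib (or the current vote) wouldn't touch it.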
Cheers

On Mon, Mar 2, 2020 at 8:28 AM nigro_franz <nigro....@gmail.com> wrote:

> Hi folks,
>
> especially due to the requirements of the current Artemis quorum vote
> algorithm, we've thought of re-implementing it with a different focus in
> mind:
> 1) to make it pluggable (eg by using the many Raft implementations,
> ZooKeeper or others)
> 2) to cleanly separate the election phase and cluster member states (ie it
> should be the Topology shared between them)
> 3) to simplify most common setups in both amount of configuration and
> requirements (eg "witness" nodes could be implemented to get a minimum
> 2*n + 1 quorum of nodes instead of forcing 2*n + 1 master-backup pairs)
> 4) [OPTIONALLY] to reduce/eliminate implicit "good practices" in terms of
> order of actions to be performed on nodes in "special states" eg proper
> restart sequence after failover or similar cases
> 5) [OPTIONALLY] to make shared-store and replication behaviour more
> similar: journal's presence should be the only difference between the two
>
> A proposal of steps to be followed to get this:
> 1) abstract away the current quorum vote: it requires extra care because
> the logic is melted together with the replication/clustering behaviour
> 2) refactor it in order to separate election phase and cluster member
> states
> 3) implement a RI version of such APIs
>
> Post-actions to help people adopt it, but need to be thought upfront:
> 1) a clean upgrade path for current HA replication users
> 2) deprecate or integrate the current HA replication into the new version
>
> I've opened this here because many of the HA replication users are devs too
> and given that this isn't yet implemented: we're still in the
> design/proposal phase, so anyone who wants to express their
> ideas/opinions/POC on this is invited to talk here ;)
>
> Cheers,
> Franz
>
> --
> Sent from:
> http://activemq.2283324.n4.nabble.com/ActiveMQ-Dev-f2368404.html