Happy to contribute to this effort.

On Mon, 9 Mar 2020 at 20:25, Clebert Suconic <clebert.suco...@gmail.com> wrote:
> I think we should have a management component that runs outside of
> the broker and manages quorum.
>
> That way you can have the quorum running outside of the broker itself,
> which would reduce the need for multiple brokers to manage the quorum.
> You just need Quorum Managers in distinct places.
>
> I recently worked with a piece of software called Ceph... And Ceph has
> the concept of managers working away from their "broker" (it's not a
> broker... it's a DB, but in a sense it's the same concept here). I
> think we should do the same.
>
> I have talked to Franz and other guys in private... and it seems
> everybody these days mentions the Raft consensus algorithm. Perhaps we
> should look into it... and make it pluggable. I believe there's JRaft
> working on top of JGroups already.
>
> If we make this pluggable and make the manager a separate process,
> this will be a big win IMO.
>
> On Tue, Mar 3, 2020 at 9:55 AM Andy Taylor <andy.tayl...@gmail.com> wrote:
> >
> > Personally I wouldn't use ZooKeeper, I think there are better options.
> > Also, it looks like Kafka are replacing it as well. That said, it doesn't
> > really matter what is used; the main thing is that we need to move the
> > burden of providing consensus away from the broker.
> >
> > It would make sense to make it pluggable.
> >
> > A
> >
> > On Mon, 2 Mar 2020 at 19:19, KimmKing <kimmk...@apache.org> wrote:
> >
> > > Can't agree more.
> > > HA and replication are more and more important for a modern messaging
> > > system.
> > > Other Apache open-source MQs (Kafka/RocketMQ/Pulsar) may be referred to.
> > >
> > > And in my experience this also brings some extra complexity around
> > > performance/split-brain issues when rebalancing data.
> > >
> > > At 2020-03-02 20:33:31, "Martyn Taylor" <m...@martyntaylor.me> wrote:
> > > >I think this is a great idea Franz. The HA and replication components
> > > >have been a source of issues over the years. Two problems I see are that
> > > >1) there isn't a clean separation between the consensus mechanism and the
> > > >replication, and 2) the consensus algorithm used in Artemis isn't based on
> > > >any standard algorithm or a research paper. Hence all the issues that
> > > >were caught over the years due to various edge cases. Integration with
> > > >ZooKeeper seems like the obvious solution (i.e. push the consensus logic
> > > >off to a third party lib). I suspect though, this will be a considerable
> > > >amount of work and is likely to introduce new issues, so I'd proceed with
> > > >caution.
> > > >
> > > >Cheers
> > > >
> > > >On Mon, Mar 2, 2020 at 8:28 AM nigro_franz <nigro....@gmail.com> wrote:
> > > >
> > > >> Hi folks,
> > > >>
> > > >> especially due to the requirements of the current Artemis quorum vote
> > > >> algorithm, we've thought of re-implementing it with a different focus in
> > > >> mind:
> > > >> 1) to make it pluggable (eg by using the many Raft implementations,
> > > >> ZooKeeper or others)
> > > >> 2) to cleanly separate the election phase and cluster member states (ie
> > > >> it should be the Topology shared between them)
> > > >> 3) to simplify most common setups in both amount of configuration and
> > > >> requirements (eg "witness" nodes could be implemented to get a minimum
> > > >> 2*n + 1 quorum of nodes instead of forcing 2*n + 1 master-backup pairs)
> > > >> 4) [OPTIONALLY] to reduce/eliminate implicit "good practices" in terms of
> > > >> order of actions to be performed on nodes in "special states" eg proper
> > > >> restart sequence after failover or similar cases
> > > >> 5) [OPTIONALLY] to make shared-store and replication behaviour more
> > > >> similar: journal's presence should be the only difference between the two
> > > >>
> > > >> A proposal of steps to be followed to get this:
> > > >> 1) abstract away the current quorum vote: it requires extra care because
> > > >> the logic is melded together with the replication/clustering behaviour
> > > >> 2) refactor it in order to separate election phase and cluster member
> > > >> states
> > > >> 3) implement a RI version of such APIs
> > > >>
> > > >> Post-actions to help people adopt it, but which need to be thought about
> > > >> upfront:
> > > >> 1) a clean upgrade path for current HA replication users
> > > >> 2) deprecate or integrate the current HA replication into the new
> > > >> version
> > > >>
> > > >> I've opened this here because many of the HA replication users are devs
> > > >> too and given that this isn't yet implemented: we're still in the
> > > >> design/proposal phase, so anyone who wants to express their
> > > >> ideas/opinions/POC on this is invited to talk here ;)
> > > >>
> > > >> Cheers,
> > > >> Franz
> > > >>
> > > >> --
> > > >> Sent from:
> > > >> http://activemq.2283324.n4.nabble.com/ActiveMQ-Dev-f2368404.html
> > > >>
> > >
> --
> Clebert Suconic
>

--
Regards,
Atri
Apache Concerted
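
PS: to make the "pluggable quorum" and "witness node" ideas from the thread a bit more concrete, here is a minimal sketch. Everything in it is hypothetical: `QuorumManager`, `MajorityQuorum` and their methods are illustrative names only, not existing Artemis APIs, and a real plug-in would presumably delegate to Raft, ZooKeeper or similar. It only shows the majority rule behind the 2*n + 1 sizing: with 2*n + 1 voters, n + 1 must be reachable, so up to n can be lost.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical plug-in point (not an Artemis API): an implementation could
// be backed by Raft, ZooKeeper, or anything else that provides consensus.
interface QuorumManager {
    /** Returns true if the reachable voters form a majority of the membership. */
    boolean hasQuorum(Set<String> reachableVoters);
}

// Simplest possible implementation: static membership, strict majority.
final class MajorityQuorum implements QuorumManager {
    private final Set<String> voters;

    MajorityQuorum(Set<String> voters) {
        if (voters.isEmpty()) {
            throw new IllegalArgumentException("at least one voter is required");
        }
        this.voters = new HashSet<>(voters);
    }

    /** Votes needed for a majority: floor(size / 2) + 1, i.e. n + 1 out of 2*n + 1. */
    int majority() {
        return voters.size() / 2 + 1;
    }

    /** Number of voters that can be lost while a majority is still possible. */
    int tolerableFailures() {
        return voters.size() - majority();
    }

    @Override
    public boolean hasQuorum(Set<String> reachableVoters) {
        // Only known voters count; unknown ids are ignored.
        long count = reachableVoters.stream().filter(voters::contains).count();
        return count >= majority();
    }
}

final class QuorumSketch {
    public static void main(String[] args) {
        // One master-backup pair plus one witness node: 3 voters (n = 1).
        MajorityQuorum quorum =
                new MajorityQuorum(Set.of("master", "backup", "witness"));
        System.out.println("majority = " + quorum.majority());
        System.out.println("tolerable failures = " + quorum.tolerableFailures());
        System.out.println("backup + witness: "
                + quorum.hasQuorum(Set.of("backup", "witness")));
        System.out.println("backup alone: "
                + quorum.hasQuorum(Set.of("backup")));
    }
}
```

The point of the witness in this sketch is that it is just another voter id: it carries no journal, so a single master-backup pair plus a witness already gives an odd-sized membership without a second or third broker pair.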