Thanks, Ivan. Another protocol for group membership worth checking out is RAPID [1] (a recent one). I'm not sure, though, whether any implementations of it are available yet.
[1] https://www.usenix.org/system/files/conference/atc18/atc18-suresh.pdf

Mon, 23 Nov 2020 at 10:46, Ivan Daschinsky <ivanda...@gmail.com>:

> Also, here is some interesting reading about gossip, SWIM etc.
>
> 1 -- http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf
> 2 -- http://www.antonkharenko.com/2015/09/swim-distributed-group-membership.html
> 3 -- https://github.com/hashicorp/memberlist (Foundation library of hashicorp serf)
> 4 -- https://github.com/scalecube/scalecube-cluster -- (Java implementation of SWIM)
>
> Thu, 19 Nov 2020 at 16:35, Ivan Daschinsky <ivanda...@gmail.com>:
>
> > >> Friday, Nov 27th work for you? If ok, let's have an open call then.
> > Yes, great
> > >> As for the protocol port - we will not be dealing with the concurrency...
> > >> Judging by the Rust port, it seems fairly straightforward.
> > Yes, they chose split transport and logic. But original Go package from
> > etcd (see raft/node.go) contains some heartbeats mechanism etc.
> > I agree with you, this seems not to be a huge deal to port.
> >
> > Thu, 19 Nov 2020 at 16:13, Alexey Goncharuk <alexey.goncha...@gmail.com>:
> >
> >> Ivan,
> >>
> >> Agree, let's have a call to discuss the IEP. I have some more thoughts
> >> regarding how the replication infrastructure works with
> >> atomic/transactional caches, will put this info to the IEP. Does next
> >> Friday, Nov 27th work for you? If ok, let's have an open call then.
> >>
> >> As for the protocol port - we will not be dealing with the concurrency
> >> model if we choose this way, this is what I like about their code
> >> structure. Essentially, the raft module is a single-threaded automaton which
> >> has a callback to process a message, process a tick (timeout) and produces
> >> messages that should be sent and log entries that should be persisted.
> >> Judging by the Rust port, it seems fairly straightforward. Will be happy to
> >> discuss this and other alternatives on the call as well.
> >>
> >> Thu, 19 Nov 2020 at 14:41, Ivan Daschinsky <ivanda...@gmail.com>:
> >>
> >> > > Any existing library that can be used to avoid re-implementing the
> >> > > protocol ourselves? Perhaps, porting the existing implementation to Java
> >> > Personally, I like this idea. Go libraries (either raft module of etcd or
> >> > serf by Hashicorp) are famous for clean code, good design, stability, not
> >> > enormous size.
> >> > But, on other side, Go has different model for concurrency and porting
> >> > probably will not be so straightforward.
> >> >
> >> > Thu, 19 Nov 2020 at 13:48, Ivan Daschinsky <ivanda...@gmail.com>:
> >> >
> >> > > I'd suggest to discuss this IEP and technical details in open ZOOM
> >> > > meeting.
> >> > >
> >> > > Thu, 19 Nov 2020 at 13:47, Ivan Daschinsky <ivanda...@gmail.com>:
> >> > >
> >> > >> ---------- Forwarded message ---------
> >> > >> From: Ivan Daschinsky <ivanda...@gmail.com>
> >> > >> Date: Thu, 19 Nov 2020 at 13:02
> >> > >> Subject: Re: IEP-61 Technical discussion
> >> > >> To: Alexey Goncharuk <alexey.goncha...@gmail.com>
> >> > >>
> >> > >> Alexey, let's raise another question. Specifically, how nodes initially
> >> > >> find each other (discovery) and how they detect failures.
> >> > >>
> >> > >> I suppose, that gossip protocol is an ideal candidate. For example,
> >> > >> consul [1] uses this approach, using serf [2] library to discover members
> >> > >> of cluster.
> >> > >> Then consul forms raft ensemble (server nodes) and client use raft
> >> > >> ensemble only as lock service.
> >> > >>
> >> > >> PacificA suggests internal heartbeats mechanism for failure detection of
> >> > >> replicated group, but it says nothing about initial discovery of nodes.
> >> > >>
> >> > >> WDYT?
> >> > >>
> >> > >> [1] -- https://www.consul.io/docs/architecture/gossip
> >> > >> [2] -- https://www.serf.io/
> >> > >>
> >> > >> Thu, 19 Nov 2020 at 12:46, Alexey Goncharuk <alexey.goncha...@gmail.com>:
> >> > >>
> >> > >>> Following up the Ignite 3.0 scope/development approach threads, this is
> >> > >>> a separate thread to discuss technical aspects of the IEP.
> >> > >>>
> >> > >>> Let's reiterate one more time on the questions raised by Ivan and also
> >> > >>> see if there are any other thoughts on the IEP:
> >> > >>>
> >> > >>>    - *Whether to deploy metastorage on a separate subset of the nodes
> >> > >>>    or allow Ignite to choose these nodes automatically.* I think it is
> >> > >>>    feasible to maintain both modes: by default, Ignite will choose
> >> > >>>    metastorage nodes automatically which essentially will provide the same
> >> > >>>    seamless user experience as TCP discovery SPI - no separate roles,
> >> > >>>    simplistic deployment. For deployments where people want to have more
> >> > >>>    fine-grained control over the nodes' assignments, we will provide a runtime
> >> > >>>    configuration which will allow pinning metastorage group to certain nodes,
> >> > >>>    thus eliminating the latency concerns.
> >> > >>>    - *Whether there are any TLA+ specs for the PacificA protocol.* Not
> >> > >>>    to my knowledge, but it is known to be used in production by Microsoft and
> >> > >>>    other projects, e.g. [1]
> >> > >>>
> >> > >>> I would like to collect general feedback on the IEP, as well as feedback
> >> > >>> on specific parts of it, such as:
> >> > >>>
> >> > >>>    - Metastorage API
> >> > >>>    - Any existing library that can be used to avoid re-implementing the
> >> > >>>    protocol ourselves? Perhaps, porting the existing implementation to Java
> >> > >>>    (the way TiKV did with etcd-raft [2] [3]? This is a very neat way btw in my
> >> > >>>    opinion because I like the finite automata-like approach of the replication
> >> > >>>    module, and, additionally, we could sync bug fixes and improvements from
> >> > >>>    the upstream project)
> >> > >>>
> >> > >>> Thanks,
> >> > >>> --AG
> >> > >>>
> >> > >>> [1] https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
> >> > >>> [2] https://github.com/etcd-io/etcd/tree/master/raft
> >> > >>> [3] https://github.com/tikv/raft-rs
> >> > >>>
> >> > >>
> >> > >> --
> >> > >> Sincerely yours, Ivan Daschinskiy
> >> > >
> >> > > --
> >> > > Sincerely yours, Ivan Daschinskiy
> >> >
> >> > --
> >> > Sincerely yours, Ivan Daschinskiy
> >
> > --
> > Sincerely yours, Ivan Daschinskiy
>
> --
> Sincerely yours, Ivan Daschinskiy
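
For reference on the call, the code structure Alexey describes in the quoted message above (a single-threaded automaton with a callback to process a message, a callback to process a tick, and outputs consisting of messages to send and log entries to persist) might look roughly like the Java sketch below. Everything in it is an assumption made for illustration: the names ReplicationAutomaton, Message, Ready, step, tick and ready, and the PROPOSE/APPEND/HEARTBEAT message types are hypothetical, and this is neither the etcd-raft API nor a Raft implementation. It only shows the shape: no threads, timers, or I/O inside the automaton.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a single-threaded, tick-driven automaton (not a Raft implementation). */
final class ReplicationAutomaton {
    /** A message exchanged with peers; the caller owns the transport. */
    static final class Message {
        final String to;
        final String type;
        final String payload;

        Message(String to, String type, String payload) {
            this.to = to;
            this.type = type;
            this.payload = payload;
        }
    }

    /** Side effects the caller must perform: persist the entries, then send the messages. */
    static final class Ready {
        final List<String> entriesToPersist;
        final List<Message> messagesToSend;

        Ready(List<String> entriesToPersist, List<Message> messagesToSend) {
            this.entriesToPersist = entriesToPersist;
            this.messagesToSend = messagesToSend;
        }
    }

    private final List<String> peers;
    private final int heartbeatTicks;
    private int elapsedTicks;
    private List<String> pendingEntries = new ArrayList<>();
    private List<Message> pendingMessages = new ArrayList<>();

    ReplicationAutomaton(List<String> peers, int heartbeatTicks) {
        this.peers = new ArrayList<>(peers);
        this.heartbeatTicks = heartbeatTicks;
    }

    /** Advances logical time by one tick; the caller drives this from its own timer. */
    void tick() {
        elapsedTicks++;
        if (elapsedTicks >= heartbeatTicks) {
            elapsedTicks = 0;
            for (String peer : peers) {
                pendingMessages.add(new Message(peer, "HEARTBEAT", ""));
            }
        }
    }

    /** Processes one incoming message; only a made-up PROPOSE type is handled here. */
    void step(Message incoming) {
        if ("PROPOSE".equals(incoming.type)) {
            pendingEntries.add(incoming.payload);
            for (String peer : peers) {
                pendingMessages.add(new Message(peer, "APPEND", incoming.payload));
            }
        }
    }

    /** Hands the accumulated side effects to the caller and resets the buffers. */
    Ready ready() {
        Ready ready = new Ready(pendingEntries, pendingMessages);
        pendingEntries = new ArrayList<>();
        pendingMessages = new ArrayList<>();
        return ready;
    }

    public static void main(String[] args) {
        ReplicationAutomaton node = new ReplicationAutomaton(Arrays.asList("node-b", "node-c"), 2);
        node.tick();
        node.tick(); // second tick reaches the heartbeat interval
        node.step(new Message("node-a", "PROPOSE", "x=1"));
        Ready ready = node.ready();
        System.out.println("persist=" + ready.entriesToPersist.size()
            + " send=" + ready.messagesToSend.size());
    }
}
```

The caller would run a loop on one thread that feeds ticks and incoming messages in, then drains ready() and performs the persistence and network sends itself. Under that assumption the concurrency model stays outside the ported module, which is the property discussed above.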