> Apache ZooKeeper is used for a number of different things in Mesos, with
> only leader election being customisable with modules. Your existing modular
> functionality is insufficient for decoupling from Apache ZooKeeper.

Can you clarify which other functionality you're referring to? Mesos only
relies on ZK for leader election and detection. We do have some libraries
available in the code for storing the registry in ZK but we do not support
that currently.

On Thu, Jun 11, 2020 at 11:02 PM Samuel Marks <sam...@offscale.io> wrote:

> Apache ZooKeeper is used for a number of different things in Mesos, with
> only leader election being customisable with modules. Your existing modular
> functionality is insufficient for decoupling from Apache ZooKeeper.
>
> We are ready and waiting to develop here.
>
> As mentioned over our off-mailing-list communiqué:
>
> The main advantages—and reasoning—for my investment into Mesos have been
> [the prospect of]:
>
> - Making it performant and low-resource utilising on a very small number
>   of nodes… potentially even down to 1 node so that it can 'compete' with
>   Docker Compose.
> - Reducing the number of distributed systems that all do the same thing
>   in a datacentre environment.
>   - Postgres has its own consensus, Docker—e.g., via Kubernetes or
>     Compose—has its own consensus, ZooKeeper has its own consensus, other
>     things like distributed filesystems… they too have their own
>     consensus.
> - The big sell from that first point is actually showing people how to
>   run Mesos and use it for their regular day-to-day development, e.g.:
>   1. Context switching when the one engineer is on multiple projects
>   2. …then use the same technology at scale.
> - The big sell from that second point is to reduce the network traffic,
>   speed up each system's consensus—through all using the one system—and
>   simplify analytics.
>
>   This would be a big deal for your bigger clients, who can easily
>   quantify what this network traffic costs, and what a reduction in
>   network traffic with a corresponding increase in speed would mean.
>
>   Eventually this will mean that Ops people can trade off guarantees for
>   speed (and vice-versa).
> - Supporting ZooKeeper, Consul, and etcd is just the start.
> - Supporting Mesos is just the start.
> - We plan on adding more consensus-guaranteeing systems—maybe even our
>   own Paxos and Raft—and adding this to systems in the Mesos ecosystem
>   like Chronos, Marathon, and Aurora.
>
> It is my understanding that a big part of Mesosphere's rebranding is
> Kubernetes related.
>
> Recently—well, just before COVID19!—I spoke at the Sydney Kubernetes Meetup
> at Google. They too—including Google—were excited by the prospect of
> removing etcd as a hard-dependency, and supporting all the different ones
> liboffkv supports.
>
> I have the budget, team, and expertise at the ready to invest and
> contribute these changes. If there are certain design patterns and
> refactors you want us to commit to along the way, just say the word.
>
> Excitedly yours,
>
> Samuel Marks
> Charity <https://sydneyscientific.org> | consultancy <https://offscale.io>
> | open-source <https://github.com/offscale> | LinkedIn
> <https://linkedin.com/in/samuelmarks>
>
>
> On Wed, Jun 10, 2020 at 1:42 AM Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> > AndreiS just reminded me that we have module interfaces for the master
> > detector and contender:
> >
> > https://github.com/apache/mesos/blob/1.9.0/include/mesos/module/detector.hpp
> > https://github.com/apache/mesos/blob/1.9.0/include/mesos/module/contender.hpp
> >
> > https://github.com/apache/mesos/blob/1.9.0/include/mesos/master/detector.hpp
> > https://github.com/apache/mesos/blob/1.9.0/include/mesos/master/contender.hpp
> >
> > These should allow you to implement the integration with your library, we
> > may need to adjust the interfaces a little, but this will let you get what
> > you need done without the burden on us to shepherd the work.
> > > > On Fri, May 22, 2020 at 8:38 PM Samuel Marks <sam...@offscale.io> wrote: > > > > > Following on from the discussion on GitHub and here on the > mailing-list, > > > here is the proposal from me and my team: > > > ------------------------------ > > > > > > Choice of approach > > > > > > The “mediator” of every interaction with ZooKeeper in Mesos is the > > > ZooKeeper > > > class, declared in include/mesos/zookeeper/zookeeper.hpp. > > > > > > Of note are the following two differences in the *styles* of API > provided > > > by ZooKeeper class and liboffkv: > > > > > > - > > > > > > Push-style mechanism of notifications on changes in “watched” data, > > > versus pull-style one in liboffkv. In Mesos, the notifications are > > > delivered via the Watcher interface, defined in the same file as > > > ZooKeeper. This interface has the process method, which is invoked > by > > an > > > instance of ZooKeeper at most once for each watch. There is also a > > > special event which informs the watcher that the connection has been > > > dropped. An optional instance of Watcher is passed to the > constructor > > of > > > ZooKeeper. > > > - > > > > > > Asynchronous session establishment process in ZooKeeper versus > > > synchronous one (if at all — e.g. for Consul there is no concept of > > > “session” currently defined at all) in liboffkv. > > > > > > The two users of the ZooKeeper are: > > > > > > 1. > > > > > > GroupProcess; > > > 2. > > > > > > ZooKeeperStorageProcess. > > > > > > We will thus evaluate the possible approaches of integrating liboffkv > > into > > > Mesos through the prism of details of their usage. > > > > > > The two possible approaches are: > > > > > > 1. > > > > > > Replace all usages of ZooKeeper with liboffkv-specific code under > > #ifdef > > > guards. > > > > > > This approach would scale badly, as alternative liboffkv-specific > > > implementations will be needed for both of the users. 
> > > > > > Moreover, we think that conditional compilation results in > maintenance > > > nightmare; see, e.g.: > > > - > > > > > > RealWaitForChar() in vim <https://geoff.greer.fm/vim/>; > > > - > > > > > > “#ifdef Considered Harmful, or Portability Experience With C > News” > > > paper by Henry Spencer and Geoff Collyer > > > <http://doc.cat-v.org/henry_spencer/ifdef_considered_harmful.pdf > >. > > > > > > The creators of the C programming language, which introduced the > > concept > > > in the first place, have also spoken against conditional > compilation: > > > - > > > > > > In “The Practice of Programming” by Brian W. Kernighan and Rob > > Pike, > > > the following advice is given: “Avoid conditional compilation. > > > Conditional > > > compilation with #ifdef and similar preprocessor directives is > hard > > > to manage, because information tends to get sprinkled throughout > > the > > > source.” > > > - > > > > > > In “Plan 9 from Bell Labs” paper by Rob Pike, Ken Thompson et al. > > > <https://pdos.csail.mit.edu/archive/6.824-2012/papers/plan9.pdf > >, > > > the > > > following is said: “Conditional compilation, even with #ifdef, is > > > used sparingly in Plan 9. The only architecture-dependent #ifdefs > > in > > > the system are in low-level routines in the graphics library. > > > Instead, we > > > avoid such dependencies or, when necessary, isolate them in > > > separate source > > > files or libraries. Besides making code hard to read, #ifdefs > make > > it > > > impossible to know what source is compiled into the binary or > > whether > > > source protected by them will compile or work properly. They > > > make it harder > > > to maintain software.” > > > 2. 
> > > > > > Modify the *implementation* of the ZooKeeper class to use liboffkv, > > > possibly renaming the class to something akin to KvClient to reflect > > the > > > fact that would no longer be ZooKeeper-specific (this also includes > > the > > > renaming of error codes and other similar nomenclature). The old > > > version of > > > the implementation would be put under an #ifdef guard, thus > minimising > > > the number — and maintenance impact — of #ifdefs. > > > > > > Naturally there are some advantages to taking the ifdef approach, > namely > > > that we can guarantee no difference in builds between before offscale's > > > contribution and after, unless a compiler flag is provided. > > > > > > However to avoid polluting the code, we are recommending the second > > > approach. > > > Incompatibilities > > > > > > The following is the list of incompatibilities between the interfaces > of > > > ZooKeeper class and liboffkv. Some of those features should be > > implemented > > > in liboffkv; others should be emulated inside the ZooKeeper/KvClient > > class; > > > and for others still, the change of the interface of ZooKeeper/KvClient > > is > > > the preferred solution. > > > > > > - > > > > > > Asynchronous session establishment. We propose to emulate this > through > > > spawning a new thread in the constructor of ZooKeeper/KvClient. > > > - > > > > > > Push-style watch notification API. We propose to emulate this > through > > > spawning a new thread for each watch; such a thread would then do > the > > > wait > > > and then invoke watcher->process() under a mutex. The number of > > threads > > > should not be a concern here, as the only user that uses watches at > > all > > > ( > > > GroupProcess) only registers at most one watch. > > > - > > > > > > Multiple servers in URL string. We propose to implement this in > > > liboffkv. > > > - > > > > > > Authentication. We propose to implement this in liboffkv. > > > - > > > > > > ACLs (access control lists). 
> > >   The following ACLs are in fact used for everything:
> > >
> > >       _auth.isSome()
> > >         ? zookeeper::EVERYONE_READ_CREATOR_ALL
> > >         : ZOO_OPEN_ACL_UNSAFE
> > >
> > >   We thus propose to:
> > >
> > >   1. implement rudimentary support for ACLs in liboffkv in the form of an
> > >      optional parameter to create(),
> > >
> > >          bool protect_modify = false
> > >
> > >   2. change the interface of ZooKeeper/KvClient so that protect_modify
> > >      flag is used instead of ACLs.
> > >
> > > - Configurable session timeout. We propose to implement this in liboffkv.
> > >
> > > - Getting the actual session timeout, which might differ from the
> > >   user-provided as a result of timeout negotiation with server. We
> > >   propose to implement this in liboffkv.
> > >
> > > - Getting the session ID. We propose to implement this in liboffkv, with
> > >   session ID being std::string; and to modify the interface accordingly.
> > >   It is possible to hash a string into a 64-bit number, but in the
> > >   circumstances given, we think it is just not worth it.
> > >
> > > - Getting the status of the connection to the server. We propose to
> > >   implement this in liboffkv.
> > >
> > > - Sequenced nodes. We propose to emulate this in the class. Here is the
> > >   pseudo-code of our solution:
> > >
> > >       while (true) {
> > >           [counter, version] = get("/counter")
> > >           seqnum = counter + 1
> > >           name = "label" + seqnum
> > >           try {
> > >               commit {
> > >                   check "/counter" version,
> > >                   set "/counter" seqnum,
> > >                   create name value
> > >               }
> > >               break
> > >           } catch (TxnAborted) {}
> > >       }
> > >
> > > - “Recursive” creation of each parent in create(), akin to mkdir -p. This
> > >   is already emulated in the class, as ZooKeeper does not natively support
> > >   it; we propose to extend this emulation to work with liboffkv.
> > > - > > > > > > The semantics of the “set” operation if the entry does not exist: > > > ZooKeeper fails with ZNONODE in this case, while liboffkv creates a > > new > > > node. We propose to emulate this in-class with a transaction. > > > - > > > > > > The semantics of the “erase” operation: ZooKeeper fails with > ZNOTEMPTY > > > if node has children, while liboffkv removes the subtree > recursively. > > As > > > neither of users ever attempts to remove node with children, we > > propose > > > to > > > change the interface so that it declares (and actually implements) > the > > > liboffkv-compatible semantics. > > > - > > > > > > Return of ZooKeeper-specific Stat structures instead of just > versions. > > > As both users only use the version field of this structure, we > propose > > > to > > > simply alter the interface so that only the version is returned. > > > - > > > > > > Explicit “session drop” operation that also immediately erases all > the > > > “leased” nodes. We propose to implement this in liboffkv. > > > - > > > > > > Check if the node being created has leased parent. Currently, > liboffkv > > > declares this to be unspecified behavior: it may either throw (if > > > ZooKeeper > > > is used as the back-end) or successfully create the node > (otherwise). > > As > > > neither of users ever attempts to create such a node, we propose to > > > leave > > > this as is. > > > > > > Estimates > > > We estimate that—including tests—this will be ready by the end of next > > > month. > > > ------------------------------ > > > > > > Open to alternative suggestions, otherwise we'll begin. 
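The sequenced-node emulation in the proposal above can be sketched in C++17 as follows. This is only an illustration under stated assumptions: `ToyStore`, `commitSequenced` and `createSequenced` are hypothetical names, and the in-memory store stands in for a real back-end. It is not liboffkv's or Mesos's API. A failed version check plays the role of `TxnAborted` in the pseudo-code, and the loop simply re-reads the counter and retries.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical in-memory stand-in for a versioned key-value back-end.
// It is NOT liboffkv's API; it only models "get with version" and one
// transaction that commits when a version check passes.
class ToyStore {
public:
    struct Entry {
        std::string value;
        std::int64_t version;
    };

    std::optional<Entry> get(const std::string& key) const {
        auto it = data_.find(key);
        if (it == data_.end()) return std::nullopt;
        return it->second;
    }

    // Atomically: check that counterKey is at expectedVersion (0 means
    // "does not exist yet"), set it to newCounter, and create nodeKey.
    // Returns false if the check fails or nodeKey already exists.
    bool commitSequenced(const std::string& counterKey,
                         std::int64_t expectedVersion,
                         const std::string& newCounter,
                         const std::string& nodeKey,
                         const std::string& nodeValue) {
        auto it = data_.find(counterKey);
        const std::int64_t current = (it == data_.end()) ? 0 : it->second.version;
        if (current != expectedVersion || data_.count(nodeKey) != 0) return false;
        data_[counterKey] = Entry{newCounter, current + 1};
        data_[nodeKey] = Entry{nodeValue, 1};
        return true;
    }

private:
    std::map<std::string, Entry> data_;
};

// Creates "<label><seqnum>" with an increasing sequence number and
// returns the name of the node that was created.
std::string createSequenced(ToyStore& store,
                            const std::string& label,
                            const std::string& value) {
    while (true) {
        const auto counter = store.get("/counter");
        const std::int64_t version = counter ? counter->version : 0;
        const std::int64_t seqnum = (counter ? std::stoll(counter->value) : 0) + 1;
        const std::string name = label + std::to_string(seqnum);
        if (store.commitSequenced("/counter", version,
                                  std::to_string(seqnum), name, value)) {
            return name;
        }
        // Another writer changed the counter first: re-read and retry.
    }
}
```

In this sketch two consecutive calls with the label "/label" create "/label1" and then "/label2". With a real back-end the check, set and create would have to be one transaction on the server side, which is the property the proposal relies on.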
> > > Samuel Marks > > > Charity <https://sydneyscientific.org> | consultancy < > > https://offscale.io> > > > | open-source <https://github.com/offscale> | LinkedIn > > > <https://linkedin.com/in/samuelmarks> > > > > > > > > > On Sat, May 2, 2020 at 4:04 AM Benjamin Mahler <bmah...@apache.org> > > wrote: > > > > > > > So it sounds like: > > > > > > > > Zookeeper: Official C library has an async API. Are we gaining a lot > > with > > > > the third party C++ wrapper you pointed to? Maybe it "just works", > but > > it > > > > looks very inactive and it's hard to tell how maintained it is. > > > > > > > > Consul: No official C or C++ library. Only some third party C++ ones > > that > > > > look pretty inactive. The ppconsul one you linked to does have an > issue > > > > about an async API, I commented on it: > > > > https://github.com/oliora/ppconsul/issues/26. > > > > > > > > etcd: Can use gRPC c++ client async API. > > > > > > > > Since 2 of 3 provide an async API already, I would lean more towards > an > > > > async API so that we don't have to change anything with the mesos > code > > > when > > > > the last one gets an async implementation. However, we currently use > > the > > > > synchronous ZK API so I realize this would be more work to first > adjust > > > the > > > > mesos code to use the async zookeeper API. I agree that a synchronous > > > > interface is simpler to start with since that will be an easier > > > integration > > > > and we currently do not perform many concurrent operations (and > > probably > > > > won't anytime soon). > > > > > > > > On Sun, Apr 26, 2020 at 11:17 PM Samuel Marks <sam...@offscale.io> > > > wrote: > > > > > > > > > In terms of asynchronous vs synchronous interfacing, when we > started > > > > > liboffkv, it had an asynchronous interface. Then we decided to drop > > it > > > > and > > > > > implemented a synchronous one, due to the dependent libraries which > > > > > liboffkv uses under the hood. 
> > > > > > > > > > Our ZooKeeper implementation uses the zookeeper-cpp library > > > > > <https://github.com/tgockel/zookeeper-cpp>—a well-maintained C++ > > > wrapper > > > > > around common Zookeeper C bindings [which we contributed to vcpkg > > > > > <https://github.com/microsoft/vcpkg/pull/7001>]. It has an > > > asynchronous > > > > > interface based on std::future > > > > > <https://en.cppreference.com/w/cpp/thread/future>. Since > std::future > > > > does > > > > > not provide chaining or any callbacks, a Zookeeper-specific result > > > cannot > > > > > be asynchronously mapped to liboffkv result. In early versions of > > > > liboffkv > > > > > we used thread pool to do the mapping. > > > > > > > > > > Consul implementation is based on the ppconsul > > > > > <https://github.com/oliora/ppconsul> library [which we contributed > > to > > > > > vcpkg > > > > > < > > > > > > > > > > > > > > > https://github.com/microsoft/vcpkg/pulls?q=is%3Apr+author%3ASamuelMarks+ppconsul > > > > > >], > > > > > which in turn utilizes libcurl <https://curl.haxx.se/libcurl>. > > > > > Unfortunately, ppconsul uses libcurl's easy interface, and > > consequently > > > > it > > > > > is synchronous by design. Again, in the early version of the > library > > we > > > > > used a thread pool to overcome this limitation. > > > > > > > > > > As for etcd, we autogenerated the gRPC C++ client > > > > > <https://github.com/offscale/etcd-client-cpp> [which we > contributed > > to > > > > > vcpkg > > > > > <https://github.com/microsoft/vcpkg/pull/6999>]. gRPC provides an > > > > > asynchronous interface, so a "fair" async client can be implemented > > on > > > > top > > > > > of it. > > > > > > > > > > To sum up, the chosen toolkit provided two of three implementations > > > > require > > > > > thread pool. After careful consideration, we have preferred to give > > the > > > > > user control over threading and opted out of the asynchrony. 
> > > > >
> > > > > Nevertheless, there are some options. zookeeper-cpp allows building with
> > > > > custom futures/promises, so we can create a custom build to use in
> > > > > liboffkv/Mesos. Another variant is to use plain C ZK bindings
> > > > > <https://gitbox.apache.org/repos/asf?p=zookeeper.git;a=tree;f=zookeeper-client/zookeeper-client-c;h=c72b57355c977366edfe11304067ff35f5cf215d;hb=HEAD>
> > > > > instead of the C++ library.
> > > > > As for the Consul client, the only meaningful option is to opt out of
> > > > > using ppconsul and operate through libcurl's multi interface.
> > > > >
> > > > > At this point implementing asynchronous interfaces will require rewriting
> > > > > liboffkv from the ground up. I can allocate the budget for doing this,
> > > > > as I have done to date. However, it would be good to have some more
> > > > > back-and-forth before reengaging.
> > > > >
> > > > > Design Doc:
> > > > > https://docs.google.com/document/d/1NOfyt7NzpMxxatdFs3f9ixKUS81DHHDVEKBbtVfVi_0
> > > > > [feel free to add it to
> > > > > http://mesos.apache.org/documentation/latest/design-docs/]
> > > > >
> > > > > Thanks,
> > > > >
> > > > > *SAMUEL MARKS*
> > > > > Sydney Medical School | Westmead Institute for Medical Research |
> > > > > https://linkedin.com/in/samuelmarks
> > > > > Director | Sydney Scientific Foundation Ltd <https://sydneyscientific.org>
> > > > > | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io>
> > > > >
> > > > > PS: Damien - not against contributing to FoundationDB, but priorities are
> > > > > Mesos and the Mesos ecosystem, followed by Kubernetes and its ecosystem.
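The thread-based workaround mentioned in this message (running a synchronous client on other threads to obtain an asynchronous interface) can be sketched as below. This is a minimal C++17 illustration under stated assumptions: `SyncKv` and `AsyncKv` are hypothetical names, the back-end is an in-memory map, and `std::async` stands in for a thread pool. It shows neither liboffkv's interface nor the futures that Mesos itself uses.

```cpp
#include <cassert>
#include <future>
#include <map>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical synchronous client (a stand-in, not liboffkv's interface).
class SyncKv {
public:
    void set(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        data_[key] = value;
    }

    std::optional<std::string> get(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = data_.find(key);
        if (it == data_.end()) return std::nullopt;
        return it->second;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, std::string> data_;
};

// Adapter: every blocking call runs on another thread and the caller
// receives a std::future for the result.
class AsyncKv {
public:
    explicit AsyncKv(SyncKv& backend) : backend_(backend) {}

    std::future<void> set(std::string key, std::string value) {
        return std::async(std::launch::async,
                          [this, key = std::move(key), value = std::move(value)] {
                              backend_.set(key, value);
                          });
    }

    std::future<std::optional<std::string>> get(std::string key) {
        return std::async(std::launch::async,
                          [this, key = std::move(key)] { return backend_.get(key); });
    }

private:
    SyncKv& backend_;  // must outlive every pending future
};
```

A caller would write `client.set("/leader", "master-1").get();` and later `client.get("/leader").get()`. The cost noted in the message still applies: each pending operation occupies a thread, whereas a client built on a natively asynchronous transport such as gRPC would not need one.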
> > > > > > > > > > On Tue, Apr 21, 2020 at 3:19 AM Benjamin Mahler < > bmah...@apache.org> > > > > > wrote: > > > > > > > > > > > Samuel: One more thing I forgot to mention, we would prefer to > use > > an > > > > > > asynchronous client interface rather than a synchronous one. Is > > that > > > > > > something you have considered? > > > > > > > > > > > > On Fri, Apr 17, 2020 at 6:11 PM Vinod Kone <vinodk...@apache.org > > > > > > wrote: > > > > > > > > > > > > > Hi Samuel, > > > > > > > > > > > > > > Thanks for showing interest in contributing to the project. > > Having > > > > > > > optionality between ZooKeeper and Etcd would be great for the > > > project > > > > > and > > > > > > > something that has been brought up a few times before, as you > > > noted. > > > > > > > > > > > > > > I echo everything that BenM said. As part of the design it > would > > be > > > > > great > > > > > > > to see the migration path for users currently using Mesos with > > > > > ZooKeeper > > > > > > to > > > > > > > Etcd. Ideally, the migration can happen without much user > > > > intervention. > > > > > > > > > > > > > > Additionally, from our past experience, efforts like these are > > more > > > > > > > successful if the people writing the code have experience with > > how > > > > > things > > > > > > > work in Mesos code base. So I would recommend starting small, > > maybe > > > > > have > > > > > > a > > > > > > > few engineers work on a couple "newbie" tickets and do some > small > > > > > > projects > > > > > > > and have those committed to the project. That gives the > > committers > > > > some > > > > > > > level of confidence about quality of the code and be more open > to > > > > > bigger > > > > > > > changes like etcd integration. It would also help contributors > > get > > > a > > > > > > better > > > > > > > feeling for the lay of the land and see if they are truly > > > interested > > > > in > > > > > > > maintaining this piece of integration for the long haul. 
This > is > > a > > > > bit > > > > > > of a > > > > > > > longer path but I think it would be more a fruitful one. > > > > > > > > > > > > > > Looking forward to seeing new contributions to Mesos including > > the > > > > > above > > > > > > > design! > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > On Fri, Apr 17, 2020 at 4:52 PM Samuel Marks < > sam...@offscale.io > > > > > > > > wrote: > > > > > > > > > > > > > > > Happy to build a design doc, > > > > > > > > > > > > > > > > To answer your question on what Offscale.io is, it's my > > software > > > > and > > > > > > > > biomedical engineering consultancy. Currently it's still > rather > > > > > small, > > > > > > > with > > > > > > > > only 8 engineers, but I'm expecting & preparing to grow > > rapidly. > > > > > > > > > > > > > > > > My philosophy is always open-source and patent-free, so > that's > > > what > > > > > my > > > > > > > > consultancy—and for that matter, the charitable research > that I > > > > fund > > > > > > > > through it <https://sydneyscientific.org>—follows. > > > > > > > > > > > > > > > > The goal of everything we create is: interoperable > > > (cross-platform, > > > > > > > > cross-technology, cross-language, multi-cloud); open-source > > > > > (Apache-2.0 > > > > > > > OR > > > > > > > > MIT); with a view towards scaling: > > > > > > > > > > > > > > > > - teams; > > > > > > > > - software-development <https://compilers.com.au>; > > > > > > > > - infrastructure [this proposed Mesos contribution + our > > > DevOps > > > > > > > > tooling]; > > > > > > > > - [in the charity's case] facilitating very large-scale > > > medical > > > > > > > > diagnostic screening. 
> > > > > > > > > > > > > > > > Technologies like Mesos we expect to both optimise resource > > > > > > > > allocation—reducing costs and increasing data locality—and > > award > > > us > > > > > > > > 'bragging rights' with which we can gain clients that are > > already > > > > > using > > > > > > > > Mesos (which, from my experience, is always big corporates… > > > though > > > > > > > > hopefully contributions like these will make it attractive to > > > small > > > > > > > > companies also). > > > > > > > > > > > > > > > > So no, we're not going anywhere, and are planning to maintain > > > this > > > > > > > library > > > > > > > > into the future > > > > > > > > > > > > > > > > PS: Once accepted by Mesos, we'll be making similar > > contributions > > > > to > > > > > > > other > > > > > > > > Mesos ecosystem projects like Chronos < > > > > > https://mesos.github.io/chronos > > > > > > >, > > > > > > > > Marathon <https://github.com/mesosphere/marathon>, and > Aurora > > > > > > > > <https://github.com/aurora-scheduler/aurora> as well as to > > > > unrelated > > > > > > > > projects (e.g., removing etcd as a hard-dependency from > > > Kubernetes > > > > > > > > <https://kubernetes.io>… enabling them to choose between > > > > ZooKeeper, > > > > > > > etcd, > > > > > > > > and Consul). 
> > > > > > > > > > > > > > > > Thanks for your continual feedback, > > > > > > > > > > > > > > > > *SAMUEL MARKS* > > > > > > > > Sydney Medical School | Westmead Institute for Medical > > Research | > > > > > > > > https://linkedin.com/in/samuelmarks > > > > > > > > Director | Sydney Scientific Foundation Ltd < > > > > > > > https://sydneyscientific.org> > > > > > > > > | Offscale.io of Sydney Scientific Pty Ltd < > > https://offscale.io> > > > > > > > > > > > > > > > > > > > > > > > > On Sat, Apr 18, 2020 at 6:58 AM Benjamin Mahler < > > > > bmah...@apache.org> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Oh ok, could you tell us a little more about how you're > using > > > > > Mesos? > > > > > > > And > > > > > > > > > what offscale.io is? > > > > > > > > > > > > > > > > > > Strictly speaking, we don't really need packaging and > > releases > > > as > > > > > we > > > > > > > can > > > > > > > > > bundle the dependency in our repo and that's what we do for > > > many > > > > of > > > > > > our > > > > > > > > > dependencies. > > > > > > > > > To me, the most important thing is the commitment to > maintain > > > the > > > > > > > library > > > > > > > > > and address issues that come up. > > > > > > > > > I also would lean more towards a run-time flag rather than > a > > > > build > > > > > > > level > > > > > > > > > flag, if possible. > > > > > > > > > > > > > > > > > > I think the best place to start would be to put together a > > > design > > > > > > doc. > > > > > > > > The > > > > > > > > > act of writing that will force the author to think through > > the > > > > > > details > > > > > > > > (and > > > > > > > > > there are a lot of them!), and we'll then get a chance to > > give > > > > > > > feedback. > > > > > > > > > You can look through the mailing list for past examples of > > > design > > > > > > docs > > > > > > > > (in > > > > > > > > > terms of which sections to include, etc). > > > > > > > > > > > > > > > > > > How does that sound? 
> > > > > > > > > > > > > > > > > > On Tue, Apr 14, 2020 at 8:44 PM Samuel Marks < > > > sam...@offscale.io > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Dear Benjamin Mahler [and *Developers mailing-list for > > Apache > > > > > > > Mesos*], > > > > > > > > > > > > > > > > > > > > Thanks for responding so quickly. > > > > > > > > > > > > > > > > > > > > Actually this entire project I invested—time & money, > > > > including a > > > > > > > > > > development team—explicitly in order to contribute this > to > > > > Apache > > > > > > > > Mesos. > > > > > > > > > So > > > > > > > > > > no releases yet, because I wanted to ensure it was up to > > the > > > > > > > > > specification > > > > > > > > > > requirements referenced in dev@mesos.apache.org before > > > > > proceeding > > > > > > > with > > > > > > > > > > packaging and releases. > > > > > > > > > > > > > > > > > > > > Tests have been setup in Travis CI for Linux (Ubuntu > 18.04) > > > and > > > > > > > macOS, > > > > > > > > > > happy to set them up elsewhere also. There are also some > > > > Windows > > > > > > > builds > > > > > > > > > > that need a bit of tweaking, then they will be pushed > into > > CI > > > > > also. > > > > > > > We > > > > > > > > > are > > > > > > > > > > just starting to do some work on reducing build & test > > times. > > > > > > > > > > > > > > > > > > > > Would be great to build a checklist of things you want to > > see > > > > > > before > > > > > > > we > > > > > > > > > > send the PR, e.g., > > > > > > > > > > > > > > > > > > > > - ☐ hosted docs; > > > > > > > > > > - ☐ CI/CD—including packaging—for Windows, Linux, and > > > macOS; > > > > > > > > > > - ☐ releases on GitHub; > > > > > > > > > > - ☐ consistent session and auth interface > > > > > > > > > > - ☐ different tests [can you expand here?] 
> > > > > > > > > >
> > > > > > > > > > This is just an example checklist, would be best if you and others can
> > > > > > > > > > flesh it out, so when we do send the PR it's in an immediately
> > > > > > > > > > mergeable state.
> > > > > > > > > >
> > > > > > > > > > BTW: Originally had a debate with my team about whether to send a PR out
> > > > > > > > > > of the blue—like Microsoft famously did for Node.js
> > > > > > > > > > <https://github.com/nodejs/node/pull/4765>—or start an *offer thread* on
> > > > > > > > > > the developers mailing-list.
> > > > > > > > > >
> > > > > > > > > > Looking forward to contributing 🦀
> > > > > > > > > >
> > > > > > > > > > *SAMUEL MARKS*
> > > > > > > > > > Sydney Medical School | Westmead Institute for Medical Research |
> > > > > > > > > > https://linkedin.com/in/samuelmarks
> > > > > > > > > > Director | Sydney Scientific Foundation Ltd <https://sydneyscientific.org>
> > > > > > > > > > | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 15, 2020 at 2:38 AM Benjamin Mahler <bmah...@apache.org>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks for reaching out, a well maintained and well written wrapper
> > > > > > > > > > > interface to the three backends would certainly make this easier for
> > > > > > > > > > > us vs implementing such an interface ourselves.
> > > > > > > > > > >
> > > > > > > > > > > Is this the client interface?
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/offscale/liboffkv/blob/d31181a1e74c5faa0b7f5d7001879640b4d9f111/liboffkv/client.hpp#L115-L142 > > > > > > > > > > > > > > > > > > > > > > At a quick glance, three ZK things that we rely on but > > seem > > > > to > > > > > be > > > > > > > > > absent > > > > > > > > > > > from the common interface is the ZK session, > > > authentication, > > > > > and > > > > > > > > > > > authorization. How will these be provided via the > common > > > > > > interface? > > > > > > > > > > > > > > > > > > > > > > Here is our ZK interface wrapper if you want to see > what > > > > kinds > > > > > of > > > > > > > > > things > > > > > > > > > > we > > > > > > > > > > > use: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/mesos/blob/1.9.0/include/mesos/zookeeper/zookeeper.hpp#L72-L339 > > > > > > > > > > > > > > > > > > > > > > The project has 0 releases and 0 issues, what kind of > > usage > > > > has > > > > > > it > > > > > > > > > seen? > > > > > > > > > > > Has there been any testing yet? Would Offscale.io be > > doing > > > > some > > > > > > of > > > > > > > > the > > > > > > > > > > > testing? > > > > > > > > > > > > > > > > > > > > > > On Mon, Apr 13, 2020 at 7:54 PM Samuel Marks < > > > > > sam...@offscale.io > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Apache ZooKeeper <https://zookeeper.apache.org> is a > > > large > > > > > > > > > dependency. 
> > > > > > > > > > > > Enabling developers and operations to use etcd < > > > > > > https://etcd.io > > > > > > > >, > > > > > > > > > > Consul > > > > > > > > > > > > <https://consul.io>, or ZooKeeper should reduce > > resource > > > > > > > > utilisation > > > > > > > > > > and > > > > > > > > > > > > enable new use cases. > > > > > > > > > > > > > > > > > > > > > > > > There have already been a number of suggestions to > get > > > rid > > > > of > > > > > > > hard > > > > > > > > > > > > dependency on ZooKeeper. For example, see: MESOS-1806 > > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-1806>, > > > > > MESOS-3574 > > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-3574>, > > > > > MESOS-3797 > > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-3797>, > > > > > MESOS-5828 > > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-5828>, > > > > > MESOS-5829 > > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-5829>. > > > > However, > > > > > > > there > > > > > > > > > are > > > > > > > > > > > > difficulties in supporting a few implementations for > > > > > different > > > > > > > > > services > > > > > > > > > > > > with quite distinct data models. > > > > > > > > > > > > > > > > > > > > > > > > A few months ago offscale.io invested in a solution > to > > > > this > > > > > > > > problem > > > > > > > > > - > > > > > > > > > > > > liboffkv <https://github.com/offscale/liboffkv> – a > > > *C++* > > > > > > > library > > > > > > > > > > which > > > > > > > > > > > > provides a *uniform interface over ZooKeeper, Consul > KV > > > and > > > > > > > etcd*. > > > > > > > > It > > > > > > > > > > > > abstracts common features of these services into its > > own > > > > data > > > > > > > model > > > > > > > > > > which > > > > > > > > > > > > is very similar to ZooKeeper’s one. 
> > > > > > > > > > > > Careful attention was paid to keep methods both efficient and consistent. It is cross-platform, open-source (*Apache-2.0 OR MIT*), and is written in C++, with vcpkg packaging, *C library output <https://github.com/offscale/liboffkv/blob/d3d549e/CMakeLists.txt#L29-L35>*, and additional interfaces in *Go <https://github.com/offscale?q=goffkv>*, *Java <https://github.com/offscale/liboffkv-java>*, and *Rust <https://github.com/offscale/rsoffkv>*.
> > > > > > > > > > > >
> > > > > > > > > > > > Offscale.io proposes to replace all ZooKeeper usages in Mesos with usages of liboffkv. Since all interactions which require ZooKeeper in Mesos are conducted through the class Group (and GroupProcess) with a clear interface, the obvious way to introduce changes is to provide another implementation of the class which uses liboffkv instead of ZooKeeper. In this case the original implementation may be left unchanged in the codebase and build flags to select from ZK-only and liboffkv variants may be introduced.
> > > > > > > > > > > > Once the community is confident, you can decide to remove the ZK-only option, and instead only support liboffkv [which internally has build flags for each service].
> > > > > > > > > > > >
> > > > > > > > > > > > Removing the hard dependency on ZooKeeper will simplify local deployment for testing purposes as well as enable using Mesos in clusters without ZooKeeper, e.g. where etcd or Consul is used for coordination. We expect this to greatly reduce the amount of resource (network, CPU, disk, memory) usage in a datacenter environment.
> > > > > > > > > > > >
> > > > > > > > > > > > If the community accepts the initiative, we will integrate liboffkv into Mesos. We are also ready to develop the library and consider any suggested improvements.
> > > > > > > > > > > >
> > > > > > > > > > > > *SAMUEL MARKS*
> > > > > > > > > > > > Sydney Medical School | Westmead Institute for Medical Research | https://linkedin.com/in/samuelmarks
> > > > > > > > > > > > Director | Sydney Scientific Foundation Ltd <https://sydneyscientific.org> | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io>
> > > > > > > > > > > > *SYDNEY SCIENTIFIC FOUNDATION and THE UNIVERSITY OF SYDNEY*
> > > > > > > > > > > >
> > > > > > > > > > > > PS: We will be offering similar contributions to Chronos <https://mesos.github.io/chronos>, Marathon <https://github.com/mesosphere/marathon>, Aurora <https://github.com/aurora-scheduler/aurora>, and related projects.