Hi Samuel,

Thanks for your interest in contributing to the project. Being able to choose between ZooKeeper and etcd would be great for the project and, as you noted, it has been brought up a few times before.
I echo everything that BenM said. As part of the design, it would be great to see the migration path from ZooKeeper to etcd for users currently running Mesos with ZooKeeper. Ideally, the migration can happen without much user intervention.

Additionally, from our past experience, efforts like these are more successful when the people writing the code have experience with how things work in the Mesos code base. So I would recommend starting small: maybe have a few engineers work on a couple of "newbie" tickets and some small projects, and get those committed to the project. That gives the committers some confidence in the quality of the code and makes them more open to bigger changes like etcd integration. It would also help contributors get a better feel for the lay of the land and see whether they are truly interested in maintaining this piece of integration for the long haul.

This is a bit of a longer path, but I think it would be a more fruitful one.

Looking forward to seeing new contributions to Mesos, including the above design!

Thanks,

On Fri, Apr 17, 2020 at 4:52 PM Samuel Marks <sam...@offscale.io> wrote:

> Happy to build a design doc,
>
> To answer your question on what Offscale.io is, it's my software and
> biomedical engineering consultancy. Currently it's still rather small, with
> only 8 engineers, but I'm expecting & preparing to grow rapidly.
>
> My philosophy is always open-source and patent-free, so that's what my
> consultancy (and, for that matter, the charitable research that I fund
> through it <https://sydneyscientific.org>) follows.
>
> The goal of everything we create is: interoperable (cross-platform,
> cross-technology, cross-language, multi-cloud); open-source (Apache-2.0 OR
> MIT); with a view towards scaling:
>
> - teams;
> - software-development <https://compilers.com.au>;
> - infrastructure [this proposed Mesos contribution + our DevOps
>   tooling];
> - [in the charity's case] facilitating very large-scale medical
>   diagnostic screening.
>
> Technologies like Mesos we expect to both optimise resource
> allocation (reducing costs and increasing data locality) and award us
> 'bragging rights' with which we can gain clients that are already using
> Mesos (which, from my experience, is always big corporates… though
> hopefully contributions like these will make it attractive to small
> companies also).
>
> So no, we're not going anywhere, and are planning to maintain this library
> into the future.
>
> PS: Once accepted by Mesos, we'll be making similar contributions to other
> Mesos ecosystem projects like Chronos <https://mesos.github.io/chronos>,
> Marathon <https://github.com/mesosphere/marathon>, and Aurora
> <https://github.com/aurora-scheduler/aurora> as well as to unrelated
> projects (e.g., removing etcd as a hard-dependency from Kubernetes
> <https://kubernetes.io>… enabling them to choose between ZooKeeper, etcd,
> and Consul).
>
> Thanks for your continual feedback,
>
> *SAMUEL MARKS*
> Sydney Medical School | Westmead Institute for Medical Research |
> https://linkedin.com/in/samuelmarks
> Director | Sydney Scientific Foundation Ltd <https://sydneyscientific.org>
> | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io>
>
>
> On Sat, Apr 18, 2020 at 6:58 AM Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> > Oh ok, could you tell us a little more about how you're using Mesos? And
> > what offscale.io is?
> >
> > Strictly speaking, we don't really need packaging and releases as we can
> > bundle the dependency in our repo and that's what we do for many of our
> > dependencies.
> > To me, the most important thing is the commitment to maintain the library
> > and address issues that come up.
> > I also would lean more towards a run-time flag rather than a build level
> > flag, if possible.
> >
> > I think the best place to start would be to put together a design doc.
> > The act of writing that will force the author to think through the
> > details (and there are a lot of them!), and we'll then get a chance to
> > give feedback.
> > You can look through the mailing list for past examples of design docs
> > (in terms of which sections to include, etc).
> >
> > How does that sound?
> >
> > On Tue, Apr 14, 2020 at 8:44 PM Samuel Marks <sam...@offscale.io> wrote:
> >
> > > Dear Benjamin Mahler [and *Developers mailing-list for Apache Mesos*],
> > >
> > > Thanks for responding so quickly.
> > >
> > > Actually, I invested in this entire project (time & money, including a
> > > development team) explicitly in order to contribute it to Apache
> > > Mesos. So no releases yet, because I wanted to ensure it was up to the
> > > specification requirements referenced in dev@mesos.apache.org before
> > > proceeding with packaging and releases.
> > >
> > > Tests have been set up in Travis CI for Linux (Ubuntu 18.04) and macOS,
> > > happy to set them up elsewhere also. There are also some Windows builds
> > > that need a bit of tweaking, then they will be pushed into CI also. We
> > > are just starting to do some work on reducing build & test times.
> > >
> > > Would be great to build a checklist of things you want to see before we
> > > send the PR, e.g.,
> > >
> > > - ☐ hosted docs;
> > > - ☐ CI/CD (including packaging) for Windows, Linux, and macOS;
> > > - ☐ releases on GitHub;
> > > - ☐ consistent session and auth interface;
> > > - ☐ different tests [can you expand here?]
> > >
> > > This is just an example checklist, would be best if you and others can
> > > flesh it out, so when we do send the PR it's in an immediately mergeable
> > > state.
> > >
> > > BTW: Originally had a debate with my team about whether to send a PR out
> > > of the blue (like Microsoft famously did for Node.js
> > > <https://github.com/nodejs/node/pull/4765>) or start an *offer thread* on
> > > the developers mailing-list.
> > >
> > > Looking forward to contributing 🦀
> > >
> > > *SAMUEL MARKS*
> > > Sydney Medical School | Westmead Institute for Medical Research |
> > > https://linkedin.com/in/samuelmarks
> > > Director | Sydney Scientific Foundation Ltd <https://sydneyscientific.org>
> > > | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io>
> > >
> > >
> > > On Wed, Apr 15, 2020 at 2:38 AM Benjamin Mahler <bmah...@apache.org>
> > > wrote:
> > >
> > > > Thanks for reaching out, a well maintained and well written wrapper
> > > > interface to the three backends would certainly make this easier for us
> > > > vs implementing such an interface ourselves.
> > > >
> > > > Is this the client interface?
> > > >
> > > > https://github.com/offscale/liboffkv/blob/d31181a1e74c5faa0b7f5d7001879640b4d9f111/liboffkv/client.hpp#L115-L142
> > > >
> > > > At a quick glance, three ZK things that we rely on but that seem to be
> > > > absent from the common interface are the ZK session, authentication,
> > > > and authorization. How will these be provided via the common interface?
> > > >
> > > > Here is our ZK interface wrapper if you want to see what kinds of
> > > > things we use:
> > > >
> > > > https://github.com/apache/mesos/blob/1.9.0/include/mesos/zookeeper/zookeeper.hpp#L72-L339
> > > >
> > > > The project has 0 releases and 0 issues, what kind of usage has it
> > > > seen?
> > > > Has there been any testing yet? Would Offscale.io be doing some of the
> > > > testing?
> > > >
> > > > On Mon, Apr 13, 2020 at 7:54 PM Samuel Marks <sam...@offscale.io>
> > > > wrote:
> > > >
> > > > > Apache ZooKeeper <https://zookeeper.apache.org> is a large
> > > > > dependency. Enabling developers and operations to use etcd
> > > > > <https://etcd.io>, Consul <https://consul.io>, or ZooKeeper should
> > > > > reduce resource utilisation and enable new use cases.
> > > > >
> > > > > There have already been a number of suggestions to get rid of the
> > > > > hard dependency on ZooKeeper. For example, see: MESOS-1806
> > > > > <https://issues.apache.org/jira/browse/MESOS-1806>, MESOS-3574
> > > > > <https://issues.apache.org/jira/browse/MESOS-3574>, MESOS-3797
> > > > > <https://issues.apache.org/jira/browse/MESOS-3797>, MESOS-5828
> > > > > <https://issues.apache.org/jira/browse/MESOS-5828>, MESOS-5829
> > > > > <https://issues.apache.org/jira/browse/MESOS-5829>. However, there
> > > > > are difficulties in supporting a few implementations for different
> > > > > services with quite distinct data models.
> > > > >
> > > > > A few months ago offscale.io invested in a solution to this problem:
> > > > > liboffkv <https://github.com/offscale/liboffkv>, a *C++* library
> > > > > which provides a *uniform interface over ZooKeeper, Consul KV and
> > > > > etcd*. It abstracts common features of these services into its own
> > > > > data model, which is very similar to ZooKeeper's. Careful attention
> > > > > was paid to keep methods both efficient and consistent. It is
> > > > > cross-platform, open-source (*Apache-2.0 OR MIT*), and is written in
> > > > > C++, with vcpkg packaging, *C library output
> > > > > <https://github.com/offscale/liboffkv/blob/d3d549e/CMakeLists.txt#L29-L35>*,
> > > > > and additional interfaces in *Go
> > > > > <https://github.com/offscale?q=goffkv>*, *Java
> > > > > <https://github.com/offscale/liboffkv-java>*, and *Rust
> > > > > <https://github.com/offscale/rsoffkv>*.
> > > > >
> > > > > Offscale.io proposes to replace all ZooKeeper usages in Mesos with
> > > > > usages of liboffkv.
> > > > > Since all interactions which require ZooKeeper in Mesos are
> > > > > conducted through the class Group (and GroupProcess) with a clear
> > > > > interface, the obvious way to introduce changes is to provide another
> > > > > implementation of the class which uses liboffkv instead of ZooKeeper.
> > > > > In this case the original implementation may be left unchanged in the
> > > > > codebase and build flags to select from ZK-only and liboffkv variants
> > > > > may be introduced. Once the community is confident, you can decide to
> > > > > remove the ZK-only option, and instead only support liboffkv [which
> > > > > internally has build flags for each service].
> > > > >
> > > > > Removing the hard dependency on ZooKeeper will simplify local
> > > > > deployment for testing purposes as well as enable using Mesos in
> > > > > clusters without ZooKeeper, e.g. where etcd or Consul is used for
> > > > > coordination. We expect this to greatly reduce resource usage
> > > > > (network, CPU, disk, memory) in a datacenter environment.
> > > > >
> > > > > If the community accepts the initiative, we will integrate liboffkv
> > > > > into Mesos. We are also ready to develop the library and consider any
> > > > > suggested improvements.
> > > > > *SAMUEL MARKS*
> > > > > Sydney Medical School | Westmead Institute for Medical Research |
> > > > > https://linkedin.com/in/samuelmarks
> > > > > Director | Sydney Scientific Foundation Ltd
> > > > > <https://sydneyscientific.org>
> > > > > | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io>
> > > > > *SYDNEY SCIENTIFIC FOUNDATION and THE UNIVERSITY OF SYDNEY*
> > > > >
> > > > > PS: We will be offering similar contributions to Chronos
> > > > > <https://mesos.github.io/chronos>, Marathon
> > > > > <https://github.com/mesosphere/marathon>, Aurora
> > > > > <https://github.com/aurora-scheduler/aurora>, and related projects.
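To make the run-time flag idea from the thread a little more concrete, here is a minimal sketch of picking a coordination backend from the scheme of a URL, in the style of the zk:// URLs Mesos already accepts. Everything in it is an assumption for illustration: the `Backend` enum, the `backendFromUrl` function, and the `etcd://` and `consul://` schemes are hypothetical names, not existing Mesos or liboffkv API, and the actual design doc may well land somewhere different.

```cpp
#include <optional>
#include <string_view>

// Hypothetical names throughout: none of this is Mesos or liboffkv API.
enum class Backend { ZooKeeper, Etcd, Consul };

// Map the scheme of a coordination URL (e.g. "etcd://host:2379/mesos")
// to a backend. Returns std::nullopt for a missing or unknown scheme,
// so the caller can reject the flag value at startup.
std::optional<Backend> backendFromUrl(std::string_view url)
{
  const auto pos = url.find("://");
  if (pos == std::string_view::npos) {
    return std::nullopt;
  }

  const std::string_view scheme = url.substr(0, pos);
  if (scheme == "zk") return Backend::ZooKeeper;
  if (scheme == "etcd") return Backend::Etcd;
  if (scheme == "consul") return Backend::Consul;

  return std::nullopt;
}
```

With something like this, a single flag value decides at startup which Group implementation to construct, so one binary could serve ZooKeeper, etcd, and Consul deployments without a rebuild. It needs C++17 for `std::optional` and `std::string_view`.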