>
> To step out of the weeds a bit - other than the Zookeeper / Curator
> example, does anyone know of any other apache projects that have either
> subprojects or complementary sideprojects they're interdependent upon in
> their ecosystems?


Every Apache project is different, so it's quite possible that the
experience I have in this area doesn't apply much here, but I'll offer some
words on the matter in the event that some of it is helpful.

For many years even prior to joining Apache, TinkerPop was quite against
bringing in driver-style sub-projects. Our main concern was one that I
think was voiced here in this thread in some fashion, where core developers
would have to be knowledgeable of the incoming body of work and maintain
that going forward. For core contributors who were primarily Java
developers it was difficult to think that we'd suddenly be responsible for
reviews/VOTEs on Python code, for example.  It was with a bit of
trepidation that we eventually decided it was a good idea and opened the
project to them. For our purposes we brought all such projects directly
into our core repository as the thinking was that we wanted to keep all
aspects of the project unified (testing, release, etc) to ensure that for a
particular release tag you could be sure that everything worked together.
We initially started with just Python and developed that as our model for
how new drivers would arrive (there were already other disparate projects
out there in other languages).

We wanted a model that ensured a reasonably high bar for acceptance and
created a rough set of minimum criteria we wanted to have for adding a new
driver to our release lines. The core of those criteria was a common,
language-agnostic test suite that needed to pass for us to consider a driver
"ready" in any sense, and the project needed to build, test and release
using Maven (which is our build tool for the project). The former ensured
that we had a reasonable level of common tested functionality among drivers
and the latter ensured an easy and consistent way to manage build/release
practices (which fed nicely into our Docker infrastructure for both full
builds and for giving non-JVM developers a nice way to develop drivers
against the latest code without having to be Java experts). Once we
established this approach with Python, we successfully brought in .NET and
JavaScript.

I think there were a number of nice upsides to deciding to bring in drivers
in the first place and then in the model for acceptance that we chose:

+ We saw a greater diversity of folks contributing in general as the
ecosystem opened up beyond just the JVM.
+ We saw that the general community coalesced around the "official"
drivers, contributing as one to them, rather than going off and creating
one-off projects. I'm not really aware of any third-party drivers right now
for the languages we support, but if you look at something like Go, there
are three or more choices. I suppose Go would be our next target for
official inclusion.
+ Release day was pretty simple despite the complexity of the environment
with that mixed ecosystem because of our unified build model using Maven
and there wasn't a lot of disparate tooling exposed to the release manager
directly.
+ I can't say that we really saw problems with core project developers (who
mostly knew Java) having to review Python/C#/JavaScript. For the most part,
the contribution quality was high and we managed and became more
knowledgeable as we went.
+ As we released drivers and core together, we no longer had situations
where some third-party driver lagged behind some feature in core - if you
wanted to use the latest core functionality you just used the latest
release of core and driver and you could be assured they worked together
and we felt confident saying so.

Doing it over again, I think I would still consider going single repo for
this situation but I think I might not place the requirement that the
projects build with Maven. I think Maven has turned off some contributors
from those language ecosystems who don't know the JVM. They would have been
much more comfortable just working more directly with the tool systems that
they were familiar with. Of course, to get rid of local Maven builds
completely we would have to build "latest" Docker images so that folks
didn't need to do that themselves like they do now (also with Maven).

Aside from TinkerPop experiences I will offer that, while I'm not
completely sure, I think that for a contribution like this one where the
bulk of the code has been developed outside of the ASF, the DS drivers
would need to go through an IP Clearance process:

https://incubator.apache.org/ip-clearance/



On Mon, Apr 27, 2020 at 12:57 PM Joshua McKenzie <jmcken...@apache.org>
wrote:

> To step out of the weeds a bit - other than the Zookeeper / Curator
> example, does anyone know of any other apache projects that have either
> subprojects or complementary sideprojects they're interdependent upon in
> their ecosystems? I'd like to reach out to some other pmc's for advice and
> feedback on this topic since there's no sense in reinventing the wheel if
> other projects have wisdom to share on this.
>
> On Mon, Apr 27, 2020 at 12:42 PM Joshua McKenzie <jmcken...@apache.org>
> wrote:
>
> > re: ML noise, how hard would it be to filter out JIRA updates w/component
> > "Drivers"? Or from JIRA queries?
> >
> > For governance, I see it cutting both ways. If we have two separate
> > projects and ML's for drivers and C*, how do we keep a coherent view of
> > new features and roadmap stuff? Do we have CEP's for both projects and tie
> > them together? Do we drive changes in the driver feature ecosystem via
> > CEP's in C*?
> >
> > In the Venn diagram of overlap vs. non between the two projects, I see
> > there being more overlap than not.
> >
> > On Mon, Apr 27, 2020 at 12:34 PM Dinesh Joshi <djo...@apache.org> wrote:
> >
> >>
> >>
> >> > On Apr 27, 2020, at 2:50 AM, Sylvain Lebresne <lebre...@gmail.com>
> >> wrote:
> >> >
> >> > Fwiw, I agree with the concerns raised by Benedict, and think we should
> >> > carefully think about how this is handled. Which isn't a rejection of
> >> > the donation in any way.
> >> >
> >> > Drivers are not small projects, and the majority of their day to day
> >> > maintenance is unrelated to the server (and the reverse is true).
> >> >
> >> > From the user point of view, I think it would be fabulous that
> >> > Cassandra appears like one project with a server and some official
> >> > drivers, with one coherent website and documentation for all. I'm all
> >> > for striving for that.
> >>
> >> +1
> >>
> >> > Behind the scenes however, I feel things should be set up so that some
> >> > amount of separation remains between server and whichever drivers are
> >> > donated and accepted, or I'm fairly sure things would get messy very
> >> > quickly[1]. In my
> >>
> >> Can you say more about what "getting messy very quickly" means here?
> >>
> >> > mind that means *at a minimum*:
> >> > - separate JIRA projects.
> >> > - dedicated _dev_ (and commits) mailing lists.
> >>
> >> If we're thinking through how this would be setup, initially we had the
> >> same Jira project for sidecar but now there is a separate one to track
> >> sidecar specific jiras. At the moment we do not have a separate mailing
> >> list. I think Cassandra dev mailing list's volume is low enough to keep
> >> using the same ML. There is an added value that it gives visibility and
> >> developers don't need to go between multiple mailing lists.
> >>
> >> > But it's also worth thinking whether a single pool of committers/PMC
> >> > members is desirable.
> >> >
> >> > Tbc, I'm not sure what is the best way to achieve this within the
> >> > constraint of the Apache foundation, and maybe I'm just stating the
> >> > obvious here.
> >> >
> >> >
> >> > [1] fwiw, I say this as someone that at some points in time was
> >> > simultaneously somewhat actively involved in both Cassandra and the
> >> > DataStax Java driver.
> >> >
> >> > --
> >> > Sylvain
> >> >
> >> >
> >> > On Fri, Apr 24, 2020 at 12:54 AM Benedict Elliott Smith <
> >> bened...@apache.org>
> >> > wrote:
> >> >
> >> >> Do you have some examples of issues?
> >> >>
> >> >> So, to explain my thinking: I believe there is value in most
> >> >> contributors being able to know and understand a majority of what the
> >> >> project undertakes.  Many people track a wide variety of activity on
> >> >> the project, and whether they express an opinion they probably form
> >> >> one and will involve themselves if they consider it important to do
> >> >> so.  I worry that importing several distinct and only loosely related
> >> >> projects to the same governance and communication structures has a
> >> >> strong potential to undermine that capability, as people begin to
> >> >> assume that activity and decision-making is unrelated to them - and if
> >> >> that happens I think something important is lost.
> >> >>
> >> >> The sidecar challenges this already but seems hopefully manageable: it
> >> >> is a logical extension of Cassandra, existing primarily to plug gaps
> >> >> in Cassandra's own functionality, and features may migrate to
> >> >> Cassandra over time.  It is likely to have releases closely tied to
> >> >> Cassandra itself. Other subprojects are so far exclusively for
> >> >> consumption by the Cassandra project itself, and are all naturally
> >> >> coupled.
> >> >>
> >> >> Drivers however are inherently arms-length endeavours: we publish a
> >> >> protocol specification, and driver maintainers implement it.  They are
> >> >> otherwise fairly independent, and while a dialogue is helpful it does
> >> >> not need to be controlled by a single entity.  Many drivers will
> >> >> continue to be controlled by others, as they have been until now.
> >> >> We're of course able to ensure there's a strong overlap of governance,
> >> >> which I think would be very helpful, and something Curator and
> >> >> Zookeeper seem not to have managed.
> >> >>
> >> >> Looking at the Curator website, it also seems to pitch itself as a
> >> >> relatively opinionated product, and much more than a driver.  I hope
> >> >> the recipe for conflict in our case is much more limited given the
> >> >> functional scope of a driver - and anyway better avoided with more
> >> >> integrated, but still distinct governance.
> >> >>
> >> >> That's not to say I don't see some value in the project controlling
> >> >> the driver directly, I just worry about the above.
> >> >>
> >> >>
> >> >>
> >> >> On 22/04/2020, 21:25, "Nate McCall" <zznat...@gmail.com> wrote:
> >> >>
> >> >>    On Thu, Apr 23, 2020 at 5:37 AM Benedict Elliott Smith <
> >> >> bened...@apache.org>
> >> >>    wrote:
> >> >>
> >> >>> I welcome the donation, and hope we are able to accept all of the
> >> >>> drivers.  This is really great news IMO.
> >> >>>
> >> >>> I do however wonder if the project may be accumulating too many
> >> >>> sub-projects?  I wonder if it's time to think about splitting, and
> >> >>> perhaps incubating a project for the drivers?
> >> >>>
> >> >>
> >> >>    This is a legit concern and good question, but I think this is more
> >> >>    a natural evolution of growing a project. There is precedent for
> >> >>    this in Spark, Beam, Hadoop and others who have a number of
> >> >>    different repositories under the general project umbrella.
> >> >>
> >> >>    What I would like to avoid is a situation like with Apache Curator
> >> >>    and Apache Zookeeper. The former being a zookeeper client donation
> >> >>    from Netflix that came in as a top level project. From the peanut
> >> >>    gallery, it seems like that has been less than ideal a couple of
> >> >>    times in the past coordinating releases, trademarks and such with
> >> >>    separate project management.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >> >>
> >> >>
>
