Thanks, Stephen, this is really helpful!

On Tue, Apr 28, 2020 at 6:24 AM Stephen Mallette <spmalle...@gmail.com>
wrote:

> >
> > To step out of the weeds a bit - other than the Zookeeper / Curator
> > example, does anyone know of any other apache projects that have either
> > subprojects or complementary sideprojects they're interdependent upon in
> > their ecosystems?
>
>
> Every Apache project is different, so it's quite possible that the
> experience I have in this area doesn't apply much here, but I'll offer some
> words on the matter in the event that some of it is helpful.
>
> For many years even prior to joining Apache, TinkerPop was quite against
> bringing in driver-style sub-projects. Our main concern was one that I
> think was voiced here in this thread in some fashion, where core developers
> would have to be knowledgeable of the incoming body of work and maintain
> that going forward. For core contributors who were primarily Java
> developers it was difficult to think that we'd suddenly be responsible for
> reviews/VOTEs on Python code, for example.  It was with a bit of
> trepidation that we eventually decided it a good idea and opened the
> project to them. For our purposes we brought all such projects directly
> into our core repository as the thinking was that we wanted to keep all
> aspects of the project unified (testing, release, etc) to ensure that for a
> particular release tag you could be sure that everything worked together.
> We initially started with just Python and developed that as our model for
> how new drivers would arrive (there were already other disparate projects
> out there in other languages).
>
> We wanted a model that ensured a reasonably high bar for acceptance and
> created a rough set of minimum criteria we wanted to have for adding a new
> driver to our release lines. The core of those criteria was a common
> language-agnostic test suite that needed to pass for us to consider it
> "ready" in any sense and the project needed to build, test and release
> using Maven (which is our build tool for the project). The former ensured
> that we had a reasonable level of common tested functionality among drivers
> and the latter ensured an easy and consistent way to manage build/release
> practices (which fed nicely into our Docker infrastructure for both full
> builds and for giving non-JVM developers a nice way to develop drivers
> against the latest code without having to be Java experts). Once we
> established this approach with Python, we successfully brought in .NET and
> JavaScript.
>
> I think there were a number of nice upsides to deciding to bring in drivers
> in the first place and then in the model for acceptance that we chose:
>
> + We saw a greater diversity of folks contributing in general as the
> ecosystem opened up beyond just the JVM.
> + We saw that the general community coalesced around the "official"
> drivers, contributing as one to them, rather than going off and creating
> one-off projects. I'm not really aware of any third-party drivers right now
> for the languages we support, but if you look at something like Go, there
> are three or more choices. I suppose Go would be our next target for
> official inclusion.
> + Release day was pretty simple despite the complexity of the environment
> with that mixed ecosystem because of our unified build model using Maven
> and there wasn't a lot of disparate tooling exposed to the release manager
> directly.
> + I can't say that we really saw problems with core project developers (who
> mostly knew Java) having to review Python/C#/JavaScript. For the most part,
> the contribution quality was high and we managed and became more
> knowledgeable as we went.
> + As we released drivers and core together, we no longer had situations
> where some third-party driver lagged behind some feature in core - if you
> wanted to use the latest core functionality you just used the latest
> release of core and driver and you could be assured they worked together
> and we felt confident saying so.
>
> Doing it over again, I think I would still consider going single repo for
> this situation but I think I might not place the requirement that the
> projects build with Maven. I think Maven has turned off some contributors
> from those language ecosystems who don't know the JVM. They would have been
> much more comfortable just working more directly with the tool systems that
> they were familiar with. Of course, to get rid of local Maven builds
> completely we would have to build "latest" Docker images so that folks
> didn't need to do that themselves like they do now (also with Maven).
>
> Aside from TinkerPop experiences I will offer that, while I'm not
> completely sure, I think that for a contribution like this one where the
> bulk of the code has been developed outside of the ASF, the DS drivers
> would need to go through an IP Clearance process:
>
> https://incubator.apache.org/ip-clearance/
>
>
>
> On Mon, Apr 27, 2020 at 12:57 PM Joshua McKenzie <jmcken...@apache.org>
> wrote:
>
> > To step out of the weeds a bit - other than the Zookeeper / Curator
> > example, does anyone know of any other apache projects that have either
> > subprojects or complementary sideprojects they're interdependent upon in
> > their ecosystems? I'd like to reach out to some other pmc's for advice
> and
> > feedback on this topic since there's no sense in reinventing the wheel if
> > other projects have wisdom to share on this.
> >
> > On Mon, Apr 27, 2020 at 12:42 PM Joshua McKenzie <jmcken...@apache.org>
> > wrote:
> >
> > > re: ML noise, how hard would it be to filter out JIRA updates
> w/component
> > > "Drivers"? Or from JIRA queries?
> > >
> > > For governance, I see it cutting both ways. If we have two separate
> > > projects and ML's for drivers and C*, how do we keep a coherent view of
> > new
> > > features and roadmap stuff? Do we have CEP's for both projects and tie
> > them
> > > together? Do we drive changes in the driver feature ecosystem via CEP's
> > in
> > > C*?
> > >
> > > In the Venn diagram of overlap vs. non between the two projects, I see
> > > there being more overlap than not.
> > >
> > > On Mon, Apr 27, 2020 at 12:34 PM Dinesh Joshi <djo...@apache.org>
> wrote:
> > >
> > >>
> > >>
> > >> > On Apr 27, 2020, at 2:50 AM, Sylvain Lebresne <lebre...@gmail.com>
> > >> wrote:
> > >> >
> > >> > Fwiw, I agree with the concerns raised by Benedict, and think we
> > should
> > >> > carefully think about how this is handled. Which isn't a
> rejection
> > >> of
> > >> > the donation in any way.
> > >> >
> > >> > Drivers are not small projects, and the majority of their day to day
> > >> > maintenance is unrelated to the server (and the reverse is true).
> > >> >
> > >> > From the user point of view, I think it would be fabulous that
> > Cassandra
> > >> > appears like one project with a server and some official drivers,
> with
> > >> one
> > >> > coherent website and documentation for all. I'm all for striving for
> > >> that.
> > >>
> > >> +1
> > >>
> > >> > Behind the scenes however, I feel things should be set up so that some
> > >> amount
> > >> > of
> > >> > separation remains between server and whichever drivers are donated
> > and
> > >> > accepted, or I'm fairly sure things would get messy very
> quickly[1].
> > >> In my
> > >>
> > >> Can you say more about what "getting messy very quickly" means here?
> > >>
> > >> > mind that means *at a minimum*:
> > >> > - separate JIRA projects.
> > >> > - dedicated _dev_ (and commits) mailing lists.
> > >>
> > >> If we're thinking through how this would be set up, initially we had
> the
> > >> same Jira project for sidecar but now there is a separate one to track
> > >> sidecar specific jiras. At the moment we do not have a separate
> mailing
> > >> list. I think Cassandra dev mailing list's volume is low enough to
> keep
> > >> using the same ML. There is an added value that it gives visibility
> and
> > >> developers don't need to go between multiple mailing lists.
> > >>
> > >> > But it's also worth thinking whether a single pool of committers/PMC
> > >> > members is
> > >> > desirable.
> > >> >
> > >> > Tbc, I'm not sure what is the best way to achieve this within the
> > >> > constraint of
> > >> > the Apache foundation, and maybe I'm just stating the obvious here.
> > >> >
> > >> >
> > >> > [1] fwiw, I say this as someone that at some points in time was
> > >> > simultaneously
> > >> > somewhat actively involved in both Cassandra and the DataStax Java
> > >> driver.
> > >> >
> > >> > --
> > >> > Sylvain
> > >> >
> > >> >
> > >> > On Fri, Apr 24, 2020 at 12:54 AM Benedict Elliott Smith <
> > >> bened...@apache.org>
> > >> > wrote:
> > >> >
> > >> >> Do you have some examples of issues?
> > >> >>
> > >> >> So, to explain my thinking: I believe there is value in most
> > >> contributors
> > >> >> being able to know and understand a majority of what the project
> > >> >> undertakes.  Many people track a wide variety of activity on the
> > >> project,
> > >> >> and whether they express an opinion they probably form one and will
> > >> involve
> > >> >> themselves if they consider it important to do so.  I worry that
> > >> importing
> > >> >> several distinct and only loosely related projects to the same
> > >> governance
> > >> >> and communication structures has a strong potential to undermine
> that
> > >> >> capability, as people begin to assume that activity and
> > >> decision-making is
> > >> >> unrelated to them - and if that happens I think something important
> > is
> > >> lost.
> > >> >>
> > >> >> The sidecar challenges this already but seems hopefully manageable:
> > it
> > >> is
> > >> >> a logical extension of Cassandra, existing primarily to plug gaps
> in
> > >> >> Cassandra's own functionality, and features may migrate to
> Cassandra
> > >> over
> > >> >> time.  It is likely to have releases closely tied to Cassandra
> > itself.
> > >> >> Other subprojects are so far exclusively for consumption by the
> > >> Cassandra
> > >> >> project itself, and are all naturally coupled.
> > >> >>
> > >> >> Drivers however are inherently arms-length endeavours: we publish a
> > >> >> protocol specification, and driver maintainers implement it.  They
> > are
> > >> >> otherwise fairly independent, and while a dialogue is helpful it
> does
> > >> not
> > >> >> need to be controlled by a single entity.  Many drivers will
> continue
> > >> to be
> > >> >> controlled by others, as they have been until now.  We're of course
> > >> able to
> > >> >> ensure there's a strong overlap of governance, which I think would
> be
> > >> very
> > >> >> helpful, and something Curator and Zookeeper seem not to have
> > managed.
> > >> >>
> > >> >> Looking at the Curator website, it also seems to pitch itself as a
> > >> >> relatively opinionated product, and much more than a driver.  I
> hope
> > >> the
> > >> >> recipe for conflict in our case is much more limited given the
> > >> functional
> > >> >> scope of a driver - and anyway better avoided with more integrated,
> > but
> > >> >> still distinct governance.
> > >> >>
> > >> >> That's not to say I don't see some value in the project controlling
> > the
> > >> >> driver directly, I just worry about the above.
> > >> >>
> > >> >>
> > >> >>
> > >> >> On 22/04/2020, 21:25, "Nate McCall" <zznat...@gmail.com> wrote:
> > >> >>
> > >> >>    On Thu, Apr 23, 2020 at 5:37 AM Benedict Elliott Smith <
> > >> >> bened...@apache.org>
> > >> >>    wrote:
> > >> >>
> > >> >>> I welcome the donation, and hope we are able to accept all of the
> > >> >>> drivers.  This is really great news IMO.
> > >> >>>
> > >> >>> I do however wonder if the project may be accumulating too many
> > >> >>> sub-projects?  I wonder if it's time to think about splitting, and
> > >> >> perhaps
> > >> >>> incubating a project for the drivers?
> > >> >>>
> > >> >>
> > >> >>    This is a legit concern and good question, but I think this is
> > more
> > >> a
> > >> >>    natural evolution of growing a project. There is precedent for
> > this
> > >> in
> > >> >>    Spark, Beam, Hadoop and others who have a number of different
> > >> >> repositories
> > >> >>    under the general project umbrella.
> > >> >>
> > >> >>    What I would like to avoid is a situation like with Apache
> Curator
> > >> and
> > >> >>    Apache Zookeeper. The former being a zookeeper client donation
> > from
> > >> >> Netflix
> > >> >>    that came in as a top level project. From the peanut gallery, it
> > >> seems
> > >> >> like
> > >> >>    that has been less than ideal a couple of times in the past
> > >> >> coordinating
> > >> >>    releases, trademarks and such with separate project management.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> ---------------------------------------------------------------------
> > >> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >> >>
> > >> >>
> > >>
> > >>
> > >>
> > >>
> >
>
