> To step out of the weeds a bit - other than the Zookeeper / Curator example, does anyone know of any other apache projects that have either subprojects or complementary sideprojects they're interdependent upon in their ecosystems?

Every Apache project is different, so it's quite possible that the experience I have in this area doesn't apply much here, but I'll offer some words on the matter in the event that some of it is helpful.

For many years, even prior to joining Apache, TinkerPop was quite against bringing in driver-style sub-projects. Our main concern was one that I think was voiced here in this thread in some fashion: core developers would have to be knowledgeable of the incoming body of work and maintain it going forward. For core contributors who were primarily Java developers, it was difficult to think that we'd suddenly be responsible for reviews/VOTEs on Python code, for example. It was with a bit of trepidation that we eventually decided it was a good idea and opened the project to them.

For our purposes we brought all such projects directly into our core repository, as the thinking was that we wanted to keep all aspects of the project unified (testing, release, etc.) to ensure that for a particular release tag you could be sure that everything worked together. We initially started with just Python and developed that as our model for how new drivers would arrive (there were already other disparate projects out there in other languages). We wanted a model that ensured a reasonably high bar for acceptance, so we created a rough set of minimum criteria for adding a new driver to our release lines. The core of those criteria was a common, language-agnostic test suite that needed to pass for us to consider a driver "ready" in any sense, and the project needed to build, test and release using Maven (which is our build tool for the project).
The former ensured that we had a reasonable level of common tested functionality among drivers, and the latter ensured an easy and consistent way to manage build/release practices (which fed nicely into our Docker infrastructure, both for full builds and for giving non-JVM developers a nice way to develop drivers against the latest code without having to be Java experts). Once we established this approach with Python, we successfully brought in .NET and JavaScript.

I think there were a number of nice upsides to deciding to bring in drivers in the first place and then in the model for acceptance that we chose:

+ We saw a greater diversity of folks contributing in general as the ecosystem opened up beyond just the JVM.
+ We saw that the general community coalesced around the "official" drivers, contributing as one to them, rather than going off and creating one-off projects. I'm not really aware of any third-party drivers right now for the languages we support, but if you look at something like Go, there are three or more choices. I suppose Go would be our next target for official inclusion.
+ Release day was pretty simple despite the complexity of that mixed ecosystem, because our unified Maven build model meant there wasn't a lot of disparate tooling exposed to the release manager directly.
+ I can't say that we really saw problems with core project developers (who mostly knew Java) having to review Python/C#/JavaScript. For the most part, the contribution quality was high, and we managed and became more knowledgeable as we went.
+ As we released drivers and core together, we no longer had situations where some third-party driver lagged behind some feature in core - if you wanted to use the latest core functionality you just used the latest release of core and driver, and you could be assured they worked together and we felt confident saying so.

Doing it over again, I think I would still consider going single repo for this situation, but I might not require that the projects build with Maven. I think Maven has turned off some contributors from those language ecosystems who don't know the JVM. They would have been much more comfortable working more directly with the tool systems they were familiar with. Of course, to get rid of local Maven builds completely we would have to build "latest" Docker images so that folks didn't need to do that themselves like they do now (also with Maven).

Aside from TinkerPop experiences I will offer that, while I'm not completely sure, I think that for a contribution like this one, where the bulk of the code has been developed outside of the ASF, the DS drivers would need to go through an IP Clearance process:

https://incubator.apache.org/ip-clearance/

On Mon, Apr 27, 2020 at 12:57 PM Joshua McKenzie <jmcken...@apache.org> wrote:

> To step out of the weeds a bit - other than the Zookeeper / Curator example, does anyone know of any other apache projects that have either subprojects or complementary sideprojects they're interdependent upon in their ecosystems? I'd like to reach out to some other pmc's for advice and feedback on this topic since there's no sense in reinventing the wheel if other projects have wisdom to share on this.
>
> On Mon, Apr 27, 2020 at 12:42 PM Joshua McKenzie <jmcken...@apache.org> wrote:
>
> > re: ML noise, how hard would it be to filter out JIRA updates w/component "Drivers"? Or from JIRA queries?
> >
> > For governance, I see it cutting both ways. If we have two separate projects and ML's for drivers and C*, how do we keep a coherent view of new features and roadmap stuff? Do we have CEP's for both projects and tie them together? Do we drive changes in the driver feature ecosystem via CEP's in C*?
> >
> > In the Venn diagram of overlap vs.
> > non between the two projects, I see there being more overlap than not.
> >
> > On Mon, Apr 27, 2020 at 12:34 PM Dinesh Joshi <djo...@apache.org> wrote:
> >
> >> > On Apr 27, 2020, at 2:50 AM, Sylvain Lebresne <lebre...@gmail.com> wrote:
> >> >
> >> > Fwiw, I agree with the concerns raised by Benedict, and think we should carefully think about how this is handled. Which isn't a rejection of the donation in any way.
> >> >
> >> > Drivers are not small projects, and the majority of their day to day maintenance is unrelated to the server (and the reverse is true).
> >> >
> >> > From the user point of view, I think it would be fabulous that Cassandra appears like one project with a server and some official drivers, with one coherent website and documentation for all. I'm all for striving for that.
> >>
> >> +1
> >>
> >> > Behind the scenes however, I feel things should be setup so that some amount of separation remains between server and whichever drivers are donated and accepted, or I'm fairly sure things would get messy very quickly[1]. In my
> >>
> >> Can you say more about what "getting messy very quickly" means here?
> >>
> >> > mind that means *at a minimum*:
> >> > - separate JIRA projects.
> >> > - dedicated _dev_ (and commits) mailing lists.
> >>
> >> If we're thinking through how this would be setup, initially we had the same Jira project for sidecar but now there is a separate one to track sidecar specific jiras. At the moment we do not have a separate mailing list. I think Cassandra dev mailing list's volume is low enough to keep using the same ML. There is an added value that it gives visibility and developers don't need to go between multiple mailing lists.
> >>
> >> > But it's also worth thinking whether a single pool of committers/PMC members is desirable.
> >> >
> >> > Tbc, I'm not sure what is the best way to achieve this within the constraint of the Apache foundation, and maybe I'm just stating the obvious here.
> >> >
> >> > [1] fwiw, I say this as someone that at some points in time was simultaneously somewhat actively involved in both Cassandra and the DataStax Java driver.
> >> >
> >> > --
> >> > Sylvain
> >> >
> >> > On Fri, Apr 24, 2020 at 12:54 AM Benedict Elliott Smith <bened...@apache.org> wrote:
> >> >
> >> >> Do you have some examples of issues?
> >> >>
> >> >> So, to explain my thinking: I believe there is value in most contributors being able to know and understand a majority of what the project undertakes. Many people track a wide variety of activity on the project, and whether or not they express an opinion they probably form one and will involve themselves if they consider it important to do so. I worry that importing several distinct and only loosely related projects to the same governance and communication structures has a strong potential to undermine that capability, as people begin to assume that activity and decision-making is unrelated to them - and if that happens I think something important is lost.
> >> >>
> >> >> The sidecar challenges this already but seems hopefully manageable: it is a logical extension of Cassandra, existing primarily to plug gaps in Cassandra's own functionality, and features may migrate to Cassandra over time. It is likely to have releases closely tied to Cassandra itself. Other subprojects are so far exclusively for consumption by the Cassandra project itself, and are all naturally coupled.
> >> >>
> >> >> Drivers however are inherently arms-length endeavours: we publish a protocol specification, and driver maintainers implement it.
> >> >> They are otherwise fairly independent, and while a dialogue is helpful it does not need to be controlled by a single entity. Many drivers will continue to be controlled by others, as they have been until now. We're of course able to ensure there's a strong overlap of governance, which I think would be very helpful, and something Curator and Zookeeper seem not to have managed.
> >> >>
> >> >> Looking at the Curator website, it also seems to pitch itself as a relatively opinionated product, and much more than a driver. I hope the recipe for conflict in our case is much more limited given the functional scope of a driver - and anyway better avoided with more integrated, but still distinct governance.
> >> >>
> >> >> That's not to say I don't see some value in the project controlling the driver directly, I just worry about the above.
> >> >>
> >> >> On 22/04/2020, 21:25, "Nate McCall" <zznat...@gmail.com> wrote:
> >> >>
> >> >> On Thu, Apr 23, 2020 at 5:37 AM Benedict Elliott Smith <bened...@apache.org> wrote:
> >> >>
> >> >>> I welcome the donation, and hope we are able to accept all of the drivers. This is really great news IMO.
> >> >>>
> >> >>> I do however wonder if the project may be accumulating too many sub-projects? I wonder if it's time to think about splitting, and perhaps incubating a project for the drivers?
> >> >>
> >> >> This is a legit concern and good question, but I think this is more a natural evolution of growing a project. There is precedent for this in Spark, Beam, Hadoop and others who have a number of different repositories under the general project umbrella.
> >> >>
> >> >> What I would like to avoid is a situation like with Apache Curator and Apache Zookeeper.
> >> >> The former being a zookeeper client donation from Netflix that came in as a top level project. From the peanut gallery, it seems like that has been less than ideal a couple of times in the past coordinating releases, trademarks and such with separate project management.
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> >> For additional commands, e-mail: dev-h...@cassandra.apache.org