Hey John,

I'm not sure if Jim (proposed PMC chair) has surveyed all the PMC members recently, but I'm in touch with a good number of the PMC so I think I have a good sense, and I'll take a whack at answering this:
It looks like, of the proposed 24 PMC members, 15 work at Cloudera, the original organization which developed Impala prior to its contribution to the incubator. The other 9 are at a mix of employers but, as best I know, are not currently "sponsored" by their employers to contribute to the Impala project. As is natural with any project, it is easier for people to be active contributors and committers if someone is employing them to do so, and the PMC membership reflects that.

However, I've seen that the Impala community has also made great efforts in a few areas:

- Jim has been making a bunch of "tutorial" style posts [1] to the dev mailing list with instructions for new contributors on how to navigate the Impala code base, write tests, and fix example bugs. I believe he has also added these to various listings like the "help wanted" page at the ASF as well as "yourfirstpr" and "up-for-grabs" on GitHub. Unfortunately it seems like

- Similarly, the Impala community maintains a long list of "newbie" issues which a new contributor can use to ramp up [2]. In other communities I've seen that these "newbie" JIRAs are a nice way for people to take on a small patch before getting started with something large, especially in large code bases.

- Questions on the dev list from new contributors are always answered promptly and with good amounts of encouragement, e.g. [3] and [4] from earlier this week.

- In addition to trying to recruit new developers through the ways described above, it's clear that design discussion, project decisions, and code review are all being done in the open based on Apache principles. A quick glance at recent dev@ archives shows various discussions about project scope, implementation choices, etc. They've also been consistently adding new committers and PMC members throughout incubation.

- Lastly, I'll note that in the last year the Impala community has started to build closer ties with other ASF communities.
For example, Impala and Apache Kudu are now sharing common code for RPC, and representatives from the Impala community are now contributing regularly to Apache Parquet (two were recently voted in as committers). These cross-project collaborations are, IMO, one of the things that make the ASF more than a "Switzerland for intellectual property", as some have disparagingly described it, and it's great to see Impala taking part in that.

That said, I don't want to paint a completely rosy picture: even with the above efforts, it seems the majority of day-to-day contributions are still coming from contributors affiliated with a single entity. The Impala community has recognized that and, based on the above efforts, I imagine they plan to continue trying to expand the community even after graduation.

As for whether this should block graduation, I'll quote Roy here from a 2012 discussion:

> There is no diversity requirement at the ASF. There is a behavior
> requirement for graduation and a behavior requirement for TLPs.
> We must not confuse the two. If the Incubator says that there is a
> diversity requirement for graduation, ignore it (or at least figure
> out what the docs were supposed to say and then do that).

... and other proponents of the same philosophy can be found in other graduation proposals.

It's clear to me that Impala is *behaving* like a TLP, and it's *despite* their best efforts that the diversity hasn't improved as much as one might hope. The unfortunate fact of life here is that the number of engineers out there who are interested in contributing code to query engines is relatively low (and most of those few are kept very busy by their employers), so it's not terribly surprising that we haven't seen hordes come out of the woodwork.

As for risk of abandonment, I believe that it's quite low. Cloudera has historically contributed to many other ASF projects, and with rare exceptions the level of contribution has grown over time rather than diminished.
This is certainly the case with Impala as well, if you compare the number of active contributors over time. Some of Cloudera's core products are powered by Impala and run at a large number of enterprise customers with multi-year support contracts, so it would be a pretty long shot to imagine it being abandoned.

(Lest anyone shout "bias!", I'll disclose that Cloudera also pays my salary, but on a different internal team.)

-Todd

[1] https://lists.apache.org/list.html?dev@impala.apache.org:lte=1y:%22New%20Impala%20Contributors%22
[2] https://issues.apache.org/jira/browse/IMPALA-6096?filter=12341668
[3] https://lists.apache.org/thread.html/02e19a37be25f3db07b874a7602b3c4ac66d2e1499b66bda53b561f6@%3Cdev.impala.apache.org%3E
[4] https://lists.apache.org/thread.html/4e77e2f13a9a69fa7c55c413bfc529a439c63aa60458649d5ece072d@%3Cdev.impala.apache.org%3E

On Mon, Nov 6, 2017 at 2:00 PM, John D. Ament <johndam...@apache.org> wrote:

> The only question I have is the typical distribution of proposed PMC
> members to companies.
>
> John