Re: A renewed plea for help [was Re: Recruiting more maintainers for Apache Arrow]

Micah Kornfield Tue, 22 Jan 2019 19:27:58 -0800

Hi Wes,
I can take a stab at going through the older C++/Python PRs as a first pass
triage (I also appreciate that I'm just getting back to the project, so if you
prefer I hold off on this I understand).


Is there a good mechanism to have a committer/PMC member look at PRs after
a first pass by a contributor?  (e.g. round-robin or by area of focus)

Thanks,
Micah

On Tue, Jan 22, 2019 at 2:58 PM Wes McKinney <wesmck...@gmail.com> wrote:

> hi folks,
>
> It's been 3 months since I sent this e-mail so I thought I would
> follow up about where things stand. The project continues to grow very
> fast and so there is a ton of pull request and JIRA gardening to do.
> For the 0.12.0 release, my personal burden of patch-merging stood at
> about 50%.
>
> http://arrow.apache.org/release/0.12.0.html#patch-committers
>
> We currently have 93 pull requests open. There are some stale PRs but
> for the most part these are good / mostly-non-stale PRs in occasional
> need of rebasing or following up with contributors about making
> changes.
>
> There were 1540 patches merged into the project in 2018 (excluding the
> Parquet merge) -- that's more than 4 patches per day. Evidence
> suggests that the overall patch count for 2019 will be even higher; if
> I had to guess somewhere well over 2000. Out of last year's patches, I
> merged 1028, i.e. 2 out of every 3. If we are to be able to take on
> 2000 or more patches this year, we'll need more help. If you are
> neither a committer nor a PMC member, you can still help with code
> review and discussions to help contributors get their work into
> merge-ready state.
>
> While I would like the share of patch maintenance to be more distributed,
> I'll do what I have to in order to keep the patches flowing as fast as
> possible into master, but contributors and other maintainers can help
> with the Always Be Closing mindset -- the 80/20 rule or 90/10 rule
> frequently applies. In many cases it is better to merge a patch and
> open up a JIRA for follow up improvements if there is uncertainty
> about whether something is "done". As they say "Done [Merged] is
> better than Perfect" (as long as the build isn't broken).
>
> Additionally, please be proactive about opening JIRA issues. Out of
> the last 1000 issues created (ARROW-4326 to ARROW-3327), 501 of them
> were created by just 5 people (Wes, Antoine, Uwe, Krisztian, Kou). In
> accordance with the "Always Be Closing" mindset, I frequently create
> issues as a way of closing a discussion where there is no urgent next
> step, but some work needs to be done in the future. We need to capture
> information and file it away as efficiently as possible so we can move
> on to other work.
>
> Thank you,
> Wes
>
> On Mon, Oct 15, 2018 at 9:13 AM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > hi folks,
> >
> > It's been a few months, but as Apache Arrow is rapidly becoming a
> > critical dependency of next-generation data applications (see, for
> > example, RAPIDS just launched by NVIDIA http://rapids.ai/), we are
> > quite seriously in need of more project maintainers, or in lieu of new
> > individual contributors, additional direct funding. We are especially
> > in need of corporations dependent on this software to help carry the
> > load of JIRA gardening, code review, build and CI tooling, packaging
> > automation, developer workflow tools, and so on.
> >
> > One of the casualties of the growing maintenance burden of this
> > project is that it's increasingly difficult for people like me who
> > know the project internals very well to allocate time to working on
> > new functionality. When I talk to people about the project they often
> > ask me things like "When will X, Y, or Z functionality be ready?" and
> > my answer is often "I don't know, it depends on whether more people
> > show up to help with the maintenance workload so people can spend more
> > time building new things". This is coupled with the frustration that
> > newcomers can experience where the learning curve is very steep to be
> > able to contribute significantly to new functionality. The only way
> > out is to recruit more people to help keep things orderly, take out
> > the proverbial garbage, and keep the project healthy.
> >
> > If anyone reading has the bandwidth to help with maintaining the
> > project, or to contribute funds to support maintenance, please let us
> > know.
> >
> > Special thanks to Antoine, Kou, Krisztian, Phillip, and Uwe for their
> > work on tooling, packaging, and other development processes for the
> > 0.10 and 0.11 releases.
> >
> > Thanks,
> > Wes
> >
> > On Mon, Jul 2, 2018 at 11:40 AM Antoine Pitrou <anto...@python.org> wrote:
> > >
> > >
> > > Hi,
> > >
> > > On 02/07/2018 at 15:58, Wes McKinney wrote:
> > > > * http://ivory.idyll.org/blog/2018-how-open-is-too-open.html
> > > > * http://ivory.idyll.org/blog/2018-oss-framework-cpr.html
> > >
> > > Very good articles, but I would stress that some of the mechanisms
> > > proposed lack metrics in their favour.  Two particular examples that I
> > > know about:
> > >
> > > 1)
> > >
> > > """ I seem to recall Martin van Loewis offering to review one externally
> > > contributed patch for every ten other patches reviewed by the submitter.
> > > (I can’t find the link, sorry!) This imposes work requirements on
> > > would-be contributors that obligate them to contribute substantively to
> > > the project maintenance, before their pet feature gets implemented. """
> > >
> > > Martin's offer was almost never taken up, although he expressed it many
> > > times during many years.  I think there are two factors to it:
> > >
> > > a) Cost.  As an occasional contributor, I could understand having to do
> > > a review before contributing a patch of mine, but not having to do 5 or
> > > more reviews for each patch I contribute.  The effort asked is much too
> > > high, and you're probably discouraging people who are discovering the
> > > project, even before they could get hooked on it.
> > >
> > > b) Difficulty.  It's much more difficult and intimidating to review
> > > someone else's PR than to propose your own changes knowing that they will
> > > be reviewed by (you are assuming) competent people.  So this mechanism
> > > excludes first-time contributors, which is probably *not* what you want.
> > >
> > > 2)
> > >
> > > """ Some projects have excellent incubators, like the Python Core
> > > Mentorship Program, where people who are interested in applying their
> > > effort to recruiting new contributors can do so. """
> > >
> > > Actually, it doesn't seem to me that a significant proportion of
> > > frequent Python contributors have gone through the core mentorship
> > > process.  It probably got us a handful of one-time contributions.
> > > Pointing to the Python core mentorship program as an "excellent
> > > incubator" sounds rather far-fetched to me.
> > >
> > > Generally speaking, there's a limit to the usefulness of hand-holding
> > > contributors, especially if your project is rather complex (as Python
> > > is), because the blocking point for contributors is *not* that the
> > > development mailing-list is a bit intimidating (as was claimed by the
> > > people who founded the Python core mentorship program).
> > >
> > >
> > > PS : as a matter of fact, the general rate of contributions to Python
> > > has been *decreasing* for years.
> > >
> > > Regards
> > >
> > > Antoine.
>
