This is certainly a really important thread and a really important
topic.  I decided to wait a while to jump in because I have more
concerns than answers at this point.  I think everyone here raises
good points.  I especially like LoPrestos email that had the following
key things

"
1. Balance of committers to submissions. As the project continues to
grow, we have far more people providing code than can review it.
2. Quality of PR. Not that the code is necessarily bad, but the PR
doesn’t clearly explain the problem and how they are solving it,
provide test cases, provide templates or a Docker container if
interacting with an external service, etc. All of these things add up
to make the cost of reviewing higher.
3. What PRs cover. Previously, there was still a lot of low-hanging
fruit, and less complexity. While the project is still fairly cleanly
organized, many PRs now are less “add this simple functionality” and
more “improve this complicated feature.”
"

I'd add a new 4th one to that which is 'security relevance' of the
submission.  We're increasingly spending a ton of time on dependency
version maintenance, improving security, locking down things that make
it too easy to be used insecurely, and providing utilities to make it
easier than ever to secure.  When new contribs come in that don't
share that approach it is like 'do not want'.  But then again, that
isn't necessarily clear to a new contributor either.

For all these examples the struggle is very real and the best approach
to me isn't clear.  I do think something along the lines of 'lack of
activity/response for some time interval' results in closure is the
most likely option.  However, that should only be when the submitter
of the PR isn't responsive - not just that reviewers aren't making
progress.  Also, we should avoid anything involving ASF infra having
to do our dirty work.  I'd rather we did empty commits to force the PR
closed.  A couple good examples that are hanging out there right now
are the Apache Pulsar (incubating) PR and the Marklogic one.  These
are potentially really valuable contributions.  They're also very hard
to test.  Hard to ensure security considerations are met or have
aligned interests, and in the case of the Marklogic one have had very
time consuming L&N implications.

I'll add another dimension to this conversation.  We need, in my
opinion, to move all extension nars into a separate
repo/branch/something and have them start to live with independent
release cycles.  Many details of how to best do that are TBD and of
course having an extension registry for that would be also very
valuable.  But, we can make that move without it.  I think this will
help focus PRs, review processes, etc.. to a more reasonable point.
It will dramatically speed up build times, make the CI/Travis process
perform better, etc..  It will also mean though we'd probably deal
with more votes.  But it is a tradeoff i'm ready for personally.

Thanks
Joe

On Wed, Sep 19, 2018 at 9:13 AM Bryan Bende <bbe...@gmail.com> wrote:
>
> I'm in favor of voting to formalize the process.
>
> As part of voting we would need a draft of whatever our
> guidelines/rules are, even if it is just a modified version of Beam or
> Metron's, but basically I think it needs to be clear what we are
> voting on.
> On Tue, Sep 18, 2018 at 12:52 PM Andy LoPresto <alopre...@apache.org> wrote:
> >
> > Pierre,
> >
> > I’m a +0 [1].
> >
> > [1] 
> > https://www.apache.org/foundation/voting.html#expressing-votes-1-0-1-and-fractions
> >
> > Andy LoPresto
> > alopre...@apache.org
> > alopresto.apa...@gmail.com
> > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >
> > On Sep 18, 2018, at 7:57 AM, Otto Fowler <ottobackwa...@gmail.com> wrote:
> >
> > I didn’t mean to say that the Metron way is better, I’m sorry if it came
> > off as that.
> > My main point was rather that we side-stepped a lot of things by limiting
> > the case to ‘contributor non-responsiveness’.
> >
> > Had we known about the github integration possibilities  at the time, we
> > certainly would have considered that implementation
> > as opposed to the more manual process we have now.
> >
> >
> >
> > On September 18, 2018 at 09:39:56, Pierre Villard (
> > pierre.villard...@gmail.com) wrote:
> >
> > Otto, I think using the embedded option offered by Github is better as I
> > don't think submitting requests to Apache Infra for this is great.
> >
> > What would be the best way forward on this subject?
> > Andy, do you still have concerns and/or comments based on the feedback?
> > Should it go through a formal VOTE thread?
> >
> > Happy to take care of it if there is a consensus.
> >
> > Thanks,
> > Pierre
> >
> >
> > Le dim. 16 sept. 2018 à 15:38, Otto Fowler <ottobackwa...@gmail.com> a
> > écrit :
> >
> > In Metron we just put something like this in place, but not using a bot.
> > We limit it to PR’s where the contributor has gone inactive.
> > https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
> >
> > 2.6.1 Inactive Pull Requests
> >
> > Contributions can often take a significant amount of time to complete the
> > code review process. This process requires active participation from the
> > contributor. If the contributor is unable to actively participate, the
> >
> > pull
> >
> > request is unlikely to successfully complete this process.
> >
> > Pull Requests that have failed to receive active participation from the
> > contributor for an extended period of time risk being abandoned. Any
> > committer can submit a request for Apache Infra to close a pull request
> > that has been abandoned according to the following guidelines.
> >
> > - A pull request is 'inactive' if no comments or updates have been made
> > by the contributor in the previous 4 weeks.
> >
> >
> > - For any 'inactive' pull request, a committer can request from the
> > contributor justification for keeping the pull request open.
> >
> >
> > - The committer's request should be made as a public comment on the pull
> > request. The committer should refer the contributor to these development
> > guidelines for inactive pull requests.
> >
> >
> > - If the contributor publically responds to the request, the pull
> > request is no longer consider 'inactive'.
> >
> >
> > - If the contributor does not respond to the request within 2 weeks, the
> > pull request is considered 'abandoned'.
> >
> >
> > - A committer can cast a -1 vote on any 'abandoned' pull request using
> > these development guidelines as justification.
> >
> >
> > - A committer can submit a request to Apache Infra to close the
> > 'abandoned' pull request based on this -1 vote.
> >
> >
> >
> > On September 15, 2018 at 22:22:17, Mike Thomsen (mikerthom...@gmail.com)
> > wrote:
> >
> > I am in favor of these changes. One thing we should consider is looking
> >
> > for
> >
> > old PRs that are close enough to being done that we could rebase them,
> > clean them up a little and add them to master. Just a thought.
> >
> > Thanks,
> >
> > Mike
> >
> > On Sat, Sep 15, 2018 at 1:34 PM Mark Payne <marka...@hotmail.com> wrote:
> >
> > I'm 100% on-board here. I brought up this same topic a couple of months
> > ago,
> > but the thread kind of digressed (as these things tend to do on large
> > mailing lists).
> > I am in favor of a 30 day period with a reminder that gives the
> > contributor an extra
> > week before closing the PR. If the contributor is simply busy and not
> >
> > able
> >
> > to finish up
> > for a while, a simple comment to that effect would allow the PR to stay
> > open. At least
> > this way we know whether a PR is in-progress or just lingering and will
> > never get
> > any progress.
> >
> > While there are times that the committers are at fault, I think that's
> >
> > a
> >
> > separate discussion
> > that we can have. Both sides of the equation have to be addressed, and
> > that should not
> > prevent us from closing out stale PR's.
> >
> > That being said, I think closing out stale PR's will actually improve
> >
> > the
> >
> > committers' review
> > rate. I sometimes start looking through the list of PR's to review and
> > then get overwhelmed
> > because there are so many of them right now, and almost all of them
> >
> > have
> >
> > a
> >
> > comment of
> > some sort on them. Often I have no idea if the PR is still being worked
> >
> > on
> >
> > or not. If we reduce
> > the number of open PR's down to only those that are still being worked,
> > it's a lot less overwhelming
> > for the committers to look through and see what needs to be reviewed.
> >
> > It's also worth nothing that this is not just something that Beam does.
> >
> > I
> >
> > know Kubernetes has a similar
> > mechanism in place, and I'm guessing that this is a pretty common
> >
> > practice
> >
> > in general. And one that
> > I think will definitely help out both committers and contributors and
> >
> > make
> >
> > the project more approachable
> > from those who are interested in getting involved.
> >
> > Thanks
> > -Mark
> >
> >
> > On Sep 15, 2018, at 12:08 PM, Jeremy Dyer <jdy...@gmail.com> wrote:
> >
> > Andy - That’s a good point. What I had in my mind was sort of like
> >
> > this
> >
> > ....
> >
> >
> > - After 30 days alert is sent to author if no activity/comment not
> >
> > just
> >
> > commits.They would still have something like a week
> >
> > - If code is complicated they don’t have to finish it just simply
> >
> > comment on PR to stop it from being closed
> >
> > - I like this because when they comment they can say things like.
> >
> > “Sorry
> >
> > this is taking longer because of problem XYZ I’m having” others in the
> > community can see this and offer input so it helps on collaboration
> >
> > - This also helps people watching the PRs and interested in using
> >
> > them
> >
> > have a much more clear and accurate understanding of where the PR
> >
> > actually
> >
> > stands progress wise so they can more accurately determine when it will
> >
> > be
> >
> > available to them
> >
> > - To your last point, which is a good one, they would simply comment
> >
> > with a gentle reminder that they would like for someone to review.
> >
> >
> > Thoughts? I’m sure I’m missing something there but that’s sort of how
> >
> > I
> >
> > imagine it working
> >
> >
> > - Jeremy Dyer
> >
> > Thanks - Jeremy Dyer
> >
> > ________________________________
> > From: Andy LoPresto <alopresto.apa...@gmail.com>
> > Sent: Saturday, September 15, 2018 11:57 AM
> > To: dev@nifi.apache.org
> > Subject: Re: [DISCUSS] Stale PRs
> >
> > Jeremy,
> >
> > What about in the scenario where the submitter does everything and we
> >
> > (the committers) are slow? I’m not saying we shouldn’t try to fix all
> >
> > the
> >
> > problems, just that I see a lot more of the latter happening.
> >
> >
> > Andy LoPresto
> > alopre...@apache.org
> > alopresto.apa...@gmail.com
> > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
> >
> > On Sep 15, 2018, at 08:51, Pierre Villard <
> >
> > pierre.villard...@gmail.com>
> >
> > wrote:
> >
> >
> > Andy,
> >
> > Totally get your points. I imagine that introducing this approach
> >
> > would
> >
> > help keeping dynamic exchanges on pull requests.
> >
> > In case a PR needs complex/time consuming work (or in case the
> >
> > author
> >
> > is
> >
> > just not in a position to process comments), I think we could have
> >
> > two
> >
> > approaches:
> > - the PR is considered stale after 60 days but is actually closed
> >
> > one
> >
> > week
> >
> > later. I think it leaves time for someone (ideally the author) to
> >
> > comment
> >
> > and give an update so that the PR is not considered stale anymore,
> >
> > no?
> >
> > - for important PRs, it's possible to "remove" this mechanism using
> > specific labels but I guess we would have to ask ASF infra if we
> >
> > want
> >
> > to
> >
> > have rights to add labels on PRs (?)
> >
> > Pierre
> >
> >
> >
> >
> > Le sam. 15 sept. 2018 à 17:44, Andy LoPresto <
> >
> > alopresto.apa...@gmail.com> a
> >
> > écrit :
> >
> > Pierre,
> >
> > I’m going to delay my response on that proposal while I ask for
> >
> > (aka
> >
> > should gather on my own) some information. Is that really our
> >
> > problem?
> >
> > By
> >
> > that, I mean are stale PRs where we are getting bogged down? I am
> >
> > sure
> >
> > there are some old ones that should be closed out. My larger
> >
> > concern
> >
> > is
> >
> > that even new PRs don’t get reviewed immediately for a number of
> >
> > reasons.
> >
> >
> > 1. Balance of committers to submissions. As the project continues
> >
> > to
> >
> > grow,
> >
> > we have far more people providing code than can review it.
> > 2. Quality of PR. Not that the code is necessarily bad, but the PR
> >
> > doesn’t
> >
> > clearly explain the problem and how they are solving it, provide
> >
> > test
> >
> > cases, provide templates or a Docker container if interacting with
> >
> > an
> >
> > external service, etc. All of these things add up to make the cost
> >
> > of
> >
> > reviewing higher.
> > 3. What PRs cover. Previously, there was still a lot of low-hanging
> >
> > fruit,
> >
> > and less complexity. While the project is still fairly cleanly
> >
> > organized,
> >
> > many PRs now are less “add this simple functionality” and more
> >
> > “improve
> >
> > this complicated feature.”
> >
> > I guess I would not have a problem with your proposal, but I do
> >
> > wonder
> >
> > if
> >
> > there are more effective ways to reduce the backlog by identifying
> >
> > other
> >
> > areas of improvement.
> >
> > Andy LoPresto
> > alopre...@apache.org
> > alopresto.apa...@gmail.com
> > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
> >
> > On Sep 15, 2018, at 08:33, Pierre Villard <
> >
> > pierre.villard...@gmail.com>
> >
> > wrote:
> >
> >
> > Hi,
> >
> > The number of open PRs is still growing and it could make think
> >
> > people
> >
> > that
> >
> > the project is not healthy/active (even though a quick look at the
> >
> > last
> >
> > commit date or the commits rate is a clear indication that the
> >
> > project is
> >
> > healthy).
> >
> > In order to encourage people to review code and keep active
> >
> > discussions
> >
> > on
> >
> > the PRs, I suggest to find a way to keep this number as small as
> >
> > possible.
> >
> > To do so, I'd like to ask the wider community if the approach
> >
> > taken
> >
> > by a
> >
> > project like Apache Beam would be a good idea:
> >
> > "A pull request becomes stale after its author fails to respond to
> > actionable comments for 60 days. Author of a closed pull request
> >
> > is
> >
> > welcome
> >
> > to reopen the same pull request again in the future."
> >
> > This approach is managed by a file [1] in the .github directory of
> >
> > the
> >
> > repository.
> >
> > What do you think about this approach?
> >
> > [1] https://github.com/apache/beam/blob/master/.github/stale.yml
> >
> > Pierre
> >
> >
> >
> >
> >
> >

Reply via email to