Hi Josep,

Thanks for looking into this. This is clearly one aspect where we need
to improve.

We had a similar initiative last year
(https://lists.apache.org/thread/66yj9m6tcyz8zqb3lqlbnr386bqwsopt) and
we closed many PRs. Unfortunately we did not follow up with a process
or automation and we are back to the same situation.

Manually reviewing all these PRs is a huge task, so I think we should
at least partially automate it. I'm not sure if we should manually
review the oldest PRs (pre 2020). There's surely many interesting
things but I wonder if we should instead focus on the more recent ones
as they have a higher chance of 1) still making sense, 2) getting
updates from their authors, 3) needing less rebasing. If something has
been broken since 2016 but we never bothered to fix the PR it means it
can't be anything critical!

Finally as Colin mentioned, it looks like a non negligible chunk of
stale PRs comes from committers and regular contributors. So I'd
suggest we each try to clean our own backlog too.

I wonder if we also need to do something in JIRA. Querying for
unresolved tickets returns over 4000 items. Considering we're not
quite at KAFKA-15000 yet, that's a lot.

Thanks,
Mickael


On Tue, Jun 6, 2023 at 11:35 AM Josep Prat <josep.p...@aiven.io.invalid> wrote:
>
> Hi Devs,
> I would say we can split the problem in 2.
>
> Waiting for Author feedback:
> We could have a bot that would ping authors for the cases where we have PRs
> that are stalled and have either:
> - Merge conflict
> - Unaddressed reviews
>
> Waiting for reviewers:
> For the PRs where we have no reviewers and there are no conflicts, I think
> we would need some human interaction to determine modules (maybe this can
> be automated) and ping people who could review.
>
> What do you think?
>
> Best,
>
> On Tue, Jun 6, 2023 at 11:30 AM Josep Prat <josep.p...@aiven.io> wrote:
>
> > Hi Nikolay,
> >
> > With a bot it will be complicated to determine what to do when the PR
> > author is waiting for a reviewer. If a person goes over them, can check if
> > they are waiting for reviews and tag the PR accordingly and maybe ping a
> > maintainer.
> > If you look at my last email I described a flow (but AFAIU it will work
> > only if a human executes it) where the situation you point out would be
> > covered.
> >
> > ———
> > Josep Prat
> >
> > Aiven Deutschland GmbH
> >
> > Alexanderufer 3-7, 10117 Berlin
> >
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> >
> > m: +491715557497
> >
> > w: aiven.io
> >
> > e: josep.p...@aiven.io
> >
> > On Tue, Jun 6, 2023, 11:20 Николай Ижиков <nizhi...@apache.org> wrote:
> >
> >> Hello.
> >>
> >> What is actions of contributor if no feedback given? [1], [2]
> >>
> >> [1] https://github.com/apache/kafka/pull/13278
> >> [2] https://github.com/apache/kafka/pull/13247
> >>
> >> > 2 июня 2023 г., в 23:38, David Arthur <mum...@gmail.com> написал(а):
> >> >
> >> > I think this is a great idea. If we don’t want the auto-close
> >> > functionality, we can set it to -1
> >> >
> >> > I realize this isn’t a vote, but I’m +1 on this
> >> >
> >> > -David
> >> >
> >> > On Fri, Jun 2, 2023 at 15:34 Colin McCabe <cmcc...@apache.org> wrote:
> >> >
> >> >> That should read "30 days without activity"
> >> >>
> >> >> (I am assuming we have the ability to determine when a PR was last
> >> updated
> >> >> on GH)
> >> >>
> >> >> best,
> >> >> Colin
> >> >>
> >> >> On Fri, Jun 2, 2023, at 12:32, Colin McCabe wrote:
> >> >>> Hi all,
> >> >>>
> >> >>> Looking at GitHub, I have a bunch of Kafka PRs of my own that I've
> >> >>> allowed to become stale, and I guess are pushing up these numbers!
> >> >>> Overall I support the goal of putting a time limit on PRs, just so
> >> that
> >> >>> we can focus our review bandwidth.
> >> >>>
> >> >>> It may be best to start with a simple approach where we mark PRs as
> >> >>> stale after 30 days and email the submitter at that time. And then
> >> >>> delete after 60 days. (Of course the exact time periods might be
> >> >>> something gother than 30/60 but this is just an initial suggestion)
> >> >>>
> >> >>> best,
> >> >>> Colin
> >> >>>
> >> >>>
> >> >>> On Fri, Jun 2, 2023, at 00:37, Josep Prat wrote:
> >> >>>> Hi all,
> >> >>>>
> >> >>>> I want to say that in my experience, I always felt better as a
> >> >> contributor
> >> >>>> when a person told me something than when a bot did. That being said,
> >> >> I'm
> >> >>>> not against bots, and I think they might be a great solution once we
> >> >> have a
> >> >>>> manageable number of open PRs.
> >> >>>>
> >> >>>> Another great question that adding a bot poses is types of staleness
> >> >>>> detection. How do we distinguish between staleness from the author's
> >> >> side
> >> >>>> or from the lack of reviewers/maintainers' side? That's why I started
> >> >> with
> >> >>>> a human approach to be able to distinguish between these 2 cases.
> >> Both
> >> >>>> situations should have different messages and actually different
> >> >> intended
> >> >>>> receivers. In case of staleness because of author inactivity, the
> >> >> message
> >> >>>> should encourage the author to update the PR with the requested
> >> changes
> >> >> or
> >> >>>> resolve the conflicts. But In many cases, PRs are stale because of
> >> lack
> >> >> of
> >> >>>> reviewers. This would need a different message, targeting
> >> maintainers.
> >> >>>>
> >> >>>> Ideally (with bot or not) I believe the process should be as follows:
> >> >>>> - Check PRs that are stale.
> >> >>>> - See if they have labels, if not add proper labels (connect, core,
> >> >>>> clients...)
> >> >>>> -  PR has merge conflicts
> >> >>>> -- Merge conflicts exist and target files that still exist, ping the
> >> >> author
> >> >>>> asking for conflict resolution and add some additional label like
> >> >> `stale`.
> >> >>>> -- Merge conflicts exist and target files that do not exist anymore,
> >> let
> >> >>>> the author know that this PR is obsolete, label the PR as 'obsolete'
> >> and
> >> >>>> close the PR.
> >> >>>> - PR is mergeable, check whose action is needed (author or reviewers)
> >> >>>> -- Author: let the author know that there are pending comments to
> >> >> address.
> >> >>>> Add some additional label, maybe `stale` again
> >> >>>> -- Reviewer: ping some reviewers that have experience or lately
> >> touched
> >> >>>> this piece of the codebase, add a label `reviewer needed` or
> >> something
> >> >>>> similar
> >> >>>> - PRs that have `stale` label after X days, will be closed.
> >> >>>>
> >> >>>> Regarding the comments about only committers and collaborators being
> >> >> able
> >> >>>> to label PRs, this is true, not everyone can do this. However, this
> >> >> could
> >> >>>> be a great opportunity for the newly appointed contributors to
> >> exercise
> >> >>>> their new permissions :)
> >> >>>>
> >> >>>> Let me know if it makes sense to you all.
> >> >>>>
> >> >>>> Best,
> >> >>
> >> > --
> >> > David Arthur
> >>
> >>
>
> --
> [image: Aiven] <https://www.aiven.io>
>
> *Josep Prat*
> Open Source Engineering Director, *Aiven*
> josep.p...@aiven.io   |   +491715557497
> aiven.io <https://www.aiven.io>   |   <https://www.facebook.com/aivencloud>
>   <https://www.linkedin.com/company/aiven/>   <https://twitter.com/aiven_io>
> *Aiven Deutschland GmbH*
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> Amtsgericht Charlottenburg, HRB 209739 B

Reply via email to