Re: [DISCUSS] PR Backlog reduction

2019-05-31 Thread Neal Richardson
There are also some "dashboards" (mostly just saved filters, in list view)
here: https://cwiki.apache.org/confluence/display/ARROW/Dashboards

They're not great either, but at least they can give us some common views
to pay attention to.

Neal


On Fri, May 31, 2019 at 5:56 AM Wes McKinney  wrote:

> On Fri, May 31, 2019 at 3:21 AM Antoine Pitrou  wrote:
> >
> >
> > On 31/05/2019 at 10:04, Fan Liya wrote:
> > > @Antoine Pitrou, you mean the titles of JIRA/PR should be chosen
> carefully?
> >
> > That helps a bit for visual filtering, but visual filtering quickly
> > becomes inefficient if there are too many issues.
> >
> > No, I mean having views that only display certain kinds of issues (for
> > example I'm interested in C++ and Python issues, not so much the other
> > kinds).
> >
> > We also currently don't record any interesting information wrt.
> > priority.  So if you take a look at the "Issues to do" list for 0.14.0,
> > for example, you get an (unordered?) list of 252 issues, which is
> > *truncated* by the stupid JIRA software:
> > https://issues.apache.org/jira/projects/ARROW/versions/12344925
> >
> > ... and when you click on "View in Issue Navigator" to get the
> > untruncated list, you get an unhelpful text box with a SQL-query and no
> > easy way to refine your search.  Not to mention the annoying pagination
> > with tiny links at the bottom middle-left, and the general UI slowness.
> >
>
> That is annoying. I always start from the unfiltered Basic issue view with
>
> Project = ARROW
> Components = C++, Python
>
> (maybe with Fix Version = 0.14.0)
>
> You can then click "Save As" in the dropdown toward the top of the
> page and save this search so you can navigate to it easily next time.
>
> > Regards
> >
> > Antoine.
>


Re: [DISCUSS] PR Backlog reduction

2019-05-31 Thread Wes McKinney
On Fri, May 31, 2019 at 3:21 AM Antoine Pitrou  wrote:
>
>
> On 31/05/2019 at 10:04, Fan Liya wrote:
> > @Antoine Pitrou, you mean the titles of JIRA/PR should be chosen carefully?
>
> That helps a bit for visual filtering, but visual filtering quickly
> becomes inefficient if there are too many issues.
>
> No, I mean having views that only display certain kinds of issues (for
> example I'm interested in C++ and Python issues, not so much the other
> kinds).
>
> We also currently don't record any interesting information wrt.
> priority.  So if you take a look at the "Issues to do" list for 0.14.0,
> for example, you get an (unordered?) list of 252 issues, which is
> *truncated* by the stupid JIRA software:
> https://issues.apache.org/jira/projects/ARROW/versions/12344925
>
> ... and when you click on "View in Issue Navigator" to get the
> untruncated list, you get an unhelpful text box with a SQL-query and no
> easy way to refine your search.  Not to mention the annoying pagination
> with tiny links at the bottom middle-left, and the general UI slowness.
>

That is annoying. I always start from the unfiltered Basic issue view with

Project = ARROW
Components = C++, Python

(maybe with Fix Version = 0.14.0)

You can then click "Save As" in the dropdown toward the top of the
page and save this search so you can navigate to it easily next time.
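
For anyone who ends up in the Advanced search box instead, the Basic filter
above corresponds roughly to a JQL query. Below is a minimal Python sketch
that assembles such a query. The helper name build_jql and the ORDER BY
default are illustrative assumptions, not something used in this thread; only
the project, component and fix version values come from the filter above.

```python
def build_jql(project, components=(), fix_version=None, order_by="priority DESC"):
    """Assemble a JQL string for JIRA's Advanced search box (hypothetical helper)."""
    clauses = [f"project = {project}"]
    if components:
        # Quote each component so names such as C++ are taken literally.
        quoted = ", ".join(f'"{name}"' for name in components)
        clauses.append(f"component in ({quoted})")
    if fix_version is not None:
        clauses.append(f'fixVersion = "{fix_version}"')
    query = " AND ".join(clauses)
    if order_by:
        # Sorting is optional; pass order_by=None to leave the result unordered.
        query += f" ORDER BY {order_by}"
    return query


print(build_jql("ARROW", ["C++", "Python"], "0.14.0"))
```

The resulting string can be pasted into the Advanced search box and saved
with "Save As" in the same way as a Basic search.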

> Regards
>
> Antoine.


Re: [DISCUSS] PR Backlog reduction

2019-05-31 Thread Antoine Pitrou


On 31/05/2019 at 10:04, Fan Liya wrote:
> @Antoine Pitrou, you mean the titles of JIRA/PR should be chosen carefully?

That helps a bit for visual filtering, but visual filtering quickly
becomes inefficient if there are too many issues.

No, I mean having views that only display certain kinds of issues (for
example I'm interested in C++ and Python issues, not so much the other
kinds).

We also currently don't record any interesting information wrt.
priority.  So if you take a look at the "Issues to do" list for 0.14.0,
for example, you get an (unordered?) list of 252 issues, which is
*truncated* by the stupid JIRA software:
https://issues.apache.org/jira/projects/ARROW/versions/12344925

... and when you click on "View in Issue Navigator" to get the
untruncated list, you get an unhelpful text box with a SQL-query and no
easy way to refine your search.  Not to mention the annoying pagination
with tiny links at the bottom middle-left, and the general UI slowness.

Regards

Antoine.


Re: [DISCUSS] PR Backlog reduction

2019-05-31 Thread Fan Liya
@Antoine Pitrou, you mean the titles of JIRA/PR should be chosen carefully?

Best,
Liya Fan

On Fri, May 31, 2019 at 12:03 AM Antoine Pitrou  wrote:

>
> One of the aspects of the problem is that our tools (Github, JIRA) don't
> allow us to work with categories easily.
>
> Regards
>
> Antoine.
>
>
> On 30/05/2019 at 15:59, Wes McKinney wrote:
> > They're complementary. At least in the short term the spreadsheet can
> > help us get our current backlog under control. I'd like to at least be
> > thinking about tools that can help us when patch volume inevitably
> > grows to 2-3 times the current level.
> >
> > On Thu, May 30, 2019 at 12:28 AM Micah Kornfield wrote:
> >>
> >> That sounds great Wes, thank you and your team for taking it on.
> >>
> >> Can you clarify whether you would prefer this approach to the one I
> >> proposed above (i.e. should I delete the spreadsheet), or are they
> >> complementary?
> >>
> >> Thanks,
> >> Micah
> >>
> >> On Wed, May 29, 2019 at 12:07 PM Wes McKinney wrote:
> >>>
> >>> On the call today we discussed possibly repurposing the Spark PR
> >>> dashboard application for our use
> >>>
> >>> * https://github.com/databricks/spark-pr-dashboard
> >>> * https://spark-prs.appspot.com/
> >>>
> >>> This is a project that my team could take on this year sometime
> >>>
> >>> On Wed, May 29, 2019 at 4:12 AM Fan Liya  wrote:
> >>>>
> >>>> Sounds like a great idea. I am interested in Java PRs.
> >>>>
> >>>> Best,
> >>>> Liya Fan
> >>>>
> >>>> On Wed, May 29, 2019 at 1:28 PM Micah Kornfield <emkornfi...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Sorry for the delay.  I created
> >>>>> https://docs.google.com/spreadsheets/d/146lDg11c5ohgVkrOglrb42a1JB0Gm1qBRbnoDlvB8QY/edit#gid=0
> >>>>> as a simple way to distribute old PRs. If you are interested in
> >>>>> helping, please add a comment under the language and I'll add you.
> >>>>>
> >>>>> PMC/Committers, I can share edit access if you let me know which
> >>>>> e-mail account I should grant access to.
> >>>>>
> >>>>> Thanks,
> >>>>> Micah
> >>>>>
> >>>>> On Tue, May 21, 2019 at 9:22 PM Micah Kornfield <emkornfi...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> I agree on hand curation for now.
> >>>>>>
> >>>>>> I'll try to set up a sign-up spreadsheet for shepherding old PRs and,
> >>>>>> once that's done, assign reviewers/ping old PRs.  I expect to have
> >>>>>> something to share by the weekend.
> >>>>>>
> >>>>>> On Tuesday, May 21, 2019, Wes McKinney  wrote:
> >>>>>>
> >>>>>>> I think maintainers or contributors should be responsible for
> >>>>>>> closing PRs; it also helps with backlog curation (sometimes when a
> >>>>>>> stale PR is closed the JIRA may also be closed if it's a Won't Fix)
> >>>>>>>
> >>>>>>> On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou wrote:
> >>>>>>>>
> >>>>>>>> On 21/05/2019 at 20:02, Neal Richardson wrote:
> >>>>>>>>> Automatically close stale PRs? https://github.com/probot/stale
> >>>>>>>>
> >>>>>>>> That doesn't sound like a good idea to me.
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>>
> >>>>>>>> Antoine.
> >>>>>>>
> >>>>>>
> >>>>>
>


Re: [DISCUSS] PR Backlog reduction

2019-05-30 Thread Antoine Pitrou


One of the aspects of the problem is that our tools (Github, JIRA) don't
allow us to work with categories easily.

Regards

Antoine.


On 30/05/2019 at 15:59, Wes McKinney wrote:
> They're complementary. At least in the short term the spreadsheet can
> help us get our current backlog under control. I'd like to at least be
> thinking about tools that can help us when patch volume inevitably
> grows to 2-3 times the current level.
> 
> On Thu, May 30, 2019 at 12:28 AM Micah Kornfield wrote:
>>
>> That sounds great Wes, thank you and your team for taking it on.
>>
>> Can you clarify whether you would prefer this approach to the one I proposed
>> above (i.e. should I delete the spreadsheet), or are they complementary?
>>
>> Thanks,
>> Micah
>>
>> On Wed, May 29, 2019 at 12:07 PM Wes McKinney  wrote:
>>>
>>> On the call today we discussed possibly repurposing the Spark PR
>>> dashboard application for our use
>>>
>>> * https://github.com/databricks/spark-pr-dashboard
>>> * https://spark-prs.appspot.com/
>>>
>>> This is a project that my team could take on this year sometime
>>>
>>> On Wed, May 29, 2019 at 4:12 AM Fan Liya  wrote:
>>>>
>>>> Sounds like a great idea. I am interested in Java PRs.
>>>>
>>>> Best,
>>>> Liya Fan
>>>>
>>>> On Wed, May 29, 2019 at 1:28 PM Micah Kornfield wrote:
>>>>
>>>>> Sorry for the delay.  I created
>>>>>
>>>>> https://docs.google.com/spreadsheets/d/146lDg11c5ohgVkrOglrb42a1JB0Gm1qBRbnoDlvB8QY/edit#gid=0
>>>>> as a simple way to distribute old PRs. If you are interested in helping,
>>>>> please add a comment under the language and I'll add you.
>>>>>
>>>>> PMC/Committers, I can share edit access if you let me know which e-mail
>>>>> account I should grant access to.
>>>>>
>>>>> Thanks,
>>>>> Micah
>>>>>
>>>>> On Tue, May 21, 2019 at 9:22 PM Micah Kornfield wrote:
>>>>>
>>>>>> I agree on hand curation for now.
>>>>>>
>>>>>> I'll try to set up a sign-up spreadsheet for shepherding old PRs and,
>>>>>> once that's done, assign reviewers/ping old PRs.  I expect to have
>>>>>> something to share by the weekend.
>>>>>>
>>>>>> On Tuesday, May 21, 2019, Wes McKinney  wrote:
>>>>>>
>>>>>>> I think maintainers or contributors should be responsible for closing
>>>>>>> PRs; it also helps with backlog curation (sometimes when a stale PR is
>>>>>>> closed the JIRA may also be closed if it's a Won't Fix)
>>>>>>>
>>>>>>> On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou wrote:
>>>>>>>>
>>>>>>>> On 21/05/2019 at 20:02, Neal Richardson wrote:
>>>>>>>>> Automatically close stale PRs? https://github.com/probot/stale
>>>>>>>>
>>>>>>>> That doesn't sound like a good idea to me.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> Antoine.
>>>>>>>
>>>>>>
>>>>>


Re: [DISCUSS] PR Backlog reduction

2019-05-30 Thread Wes McKinney
They're complementary. At least in the short term the spreadsheet can
help us get our current backlog under control. I'd like to at least be
thinking about tools that can help us when patch volume inevitably
grows to 2-3 times the current level.

On Thu, May 30, 2019 at 12:28 AM Micah Kornfield  wrote:
>
> That sounds great Wes, thank you and your team for taking it on.
>
> Can you clarify whether you would prefer this approach to the one I proposed
> above (i.e. should I delete the spreadsheet), or are they complementary?
>
> Thanks,
> Micah
>
> On Wed, May 29, 2019 at 12:07 PM Wes McKinney  wrote:
>>
>> On the call today we discussed possibly repurposing the Spark PR
>> dashboard application for our use
>>
>> * https://github.com/databricks/spark-pr-dashboard
>> * https://spark-prs.appspot.com/
>>
>> This is a project that my team could take on this year sometime
>>
>> On Wed, May 29, 2019 at 4:12 AM Fan Liya  wrote:
>> >
>> > Sounds like a great idea. I am interested in Java PRs.
>> >
>> > Best,
>> > Liya Fan
>> >
>> > On Wed, May 29, 2019 at 1:28 PM Micah Kornfield 
>> > wrote:
>> >
>> > > Sorry for the delay.  I created
>> > >
>> > > https://docs.google.com/spreadsheets/d/146lDg11c5ohgVkrOglrb42a1JB0Gm1qBRbnoDlvB8QY/edit#gid=0
>> > > as
>> > > simple way to distribute old PRs if you are interested in helping, please
>> > > add a comment under the language and I'll add you.
>> > >
>> > > PMC/Committers, I can share edit access if you let me know which e-mail
>> > > account I should grant access to.
>> > >
>> > > Thanks,
>> > > Micah
>> > >
>> > > On Tue, May 21, 2019 at 9:22 PM Micah Kornfield 
>> > > wrote:
>> > >
>> > > > I agree on hand curation for now.
>> > > >
>> > > >  I'll try to setup a sign up spreadsheet for shepherding old PRs and 
>> > > > once
>> > > > that done assign reviewers/ping old PRs.  I expect to have something to
>> > > > share by the weekend.
>> > > >
>> > > > On Tuesday, May 21, 2019, Wes McKinney  wrote:
>> > > >
>> > > >> I think maintainers or contributors should be responsible for closing
>> > > >> PRs, it also helps with backlog curation (sometimes when a stale PR is
>> > > >> closed the JIRA may also be closed if it's a Won't Fix)
>> > > >>
>> > > >> On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou 
>> > > >> wrote:
>> > > >> >
>> > > >> >
>> > > >> >
>> > > >> > On 21/05/2019 at 20:02, Neal Richardson wrote:
>> > > >> > > Automatically close stale PRs? https://github.com/probot/stale
>> > > >> >
>> > > >> > That doesn't sound like a good idea to me.
>> > > >> >
>> > > >> > Regards
>> > > >> >
>> > > >> > Antoine.
>> > > >>
>> > > >
>> > >


Re: [DISCUSS] PR Backlog reduction

2019-05-29 Thread Micah Kornfield
That sounds great Wes, thank you and your team for taking it on.

Can you clarify whether you would prefer this approach to the one I proposed
above (i.e. should I delete the spreadsheet), or are they complementary?

Thanks,
Micah

On Wed, May 29, 2019 at 12:07 PM Wes McKinney  wrote:

> On the call today we discussed possibly repurposing the Spark PR
> dashboard application for our use
>
> * https://github.com/databricks/spark-pr-dashboard
> * https://spark-prs.appspot.com/
>
> This is a project that my team could take on this year sometime
>
> On Wed, May 29, 2019 at 4:12 AM Fan Liya  wrote:
> >
> > Sounds like a great idea. I am interested in Java PRs.
> >
> > Best,
> > Liya Fan
> >
> > On Wed, May 29, 2019 at 1:28 PM Micah Kornfield 
> > wrote:
> >
> > > Sorry for the delay.  I created
> > >
> > >
> https://docs.google.com/spreadsheets/d/146lDg11c5ohgVkrOglrb42a1JB0Gm1qBRbnoDlvB8QY/edit#gid=0
> > > as
> > > simple way to distribute old PRs if you are interested in helping,
> please
> > > add a comment under the language and I'll add you.
> > >
> > > PMC/Committers, I can share edit access if you let me know which e-mail
> > > account I should grant access to.
> > >
> > > Thanks,
> > > Micah
> > >
> > > On Tue, May 21, 2019 at 9:22 PM Micah Kornfield  >
> > > wrote:
> > >
> > > > I agree on hand curation for now.
> > > >
> > > >  I'll try to setup a sign up spreadsheet for shepherding old PRs and
> once
> > > > that done assign reviewers/ping old PRs.  I expect to have something
> to
> > > > share by the weekend.
> > > >
> > > > On Tuesday, May 21, 2019, Wes McKinney  wrote:
> > > >
> > > >> I think maintainers or contributors should be responsible for
> closing
> > > >> PRs, it also helps with backlog curation (sometimes when a stale PR
> is
> > > >> closed the JIRA may also be closed if it's a Won't Fix)
> > > >>
> > > >> On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou 
> > > >> wrote:
> > > >> >
> > > >> >
> > > >> >
> > > >> > On 21/05/2019 at 20:02, Neal Richardson wrote:
> > > >> > > Automatically close stale PRs? https://github.com/probot/stale
> > > >> >
> > > >> > That doesn't sound like a good idea to me.
> > > >> >
> > > >> > Regards
> > > >> >
> > > >> > Antoine.
> > > >>
> > > >
> > >
>


Re: [DISCUSS] PR Backlog reduction

2019-05-29 Thread Wes McKinney
On the call today we discussed possibly repurposing the Spark PR
dashboard application for our use:

* https://github.com/databricks/spark-pr-dashboard
* https://spark-prs.appspot.com/

This is a project that my team could take on sometime this year.

On Wed, May 29, 2019 at 4:12 AM Fan Liya  wrote:
>
> Sounds like a great idea. I am interested in Java PRs.
>
> Best,
> Liya Fan
>
> On Wed, May 29, 2019 at 1:28 PM Micah Kornfield 
> wrote:
>
> > Sorry for the delay.  I created
> >
> > https://docs.google.com/spreadsheets/d/146lDg11c5ohgVkrOglrb42a1JB0Gm1qBRbnoDlvB8QY/edit#gid=0
> > as
> > simple way to distribute old PRs if you are interested in helping, please
> > add a comment under the language and I'll add you.
> >
> > PMC/Committers, I can share edit access if you let me know which e-mail
> > account I should grant access to.
> >
> > Thanks,
> > Micah
> >
> > On Tue, May 21, 2019 at 9:22 PM Micah Kornfield 
> > wrote:
> >
> > > I agree on hand curation for now.
> > >
> > >  I'll try to setup a sign up spreadsheet for shepherding old PRs and once
> > > that done assign reviewers/ping old PRs.  I expect to have something to
> > > share by the weekend.
> > >
> > > On Tuesday, May 21, 2019, Wes McKinney  wrote:
> > >
> > >> I think maintainers or contributors should be responsible for closing
> > >> PRs, it also helps with backlog curation (sometimes when a stale PR is
> > >> closed the JIRA may also be closed if it's a Won't Fix)
> > >>
> > >> On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou 
> > >> wrote:
> > >> >
> > >> >
> > >> >
> > >> > On 21/05/2019 at 20:02, Neal Richardson wrote:
> > >> > > Automatically close stale PRs? https://github.com/probot/stale
> > >> >
> > >> > That doesn't sound like a good idea to me.
> > >> >
> > >> > Regards
> > >> >
> > >> > Antoine.
> > >>
> > >
> >


Re: [DISCUSS] PR Backlog reduction

2019-05-29 Thread Fan Liya
Sounds like a great idea. I am interested in Java PRs.

Best,
Liya Fan

On Wed, May 29, 2019 at 1:28 PM Micah Kornfield 
wrote:

> Sorry for the delay.  I created
>
> https://docs.google.com/spreadsheets/d/146lDg11c5ohgVkrOglrb42a1JB0Gm1qBRbnoDlvB8QY/edit#gid=0
> as a simple way to distribute old PRs. If you are interested in helping,
> please add a comment under the language and I'll add you.
>
> PMC/Committers, I can share edit access if you let me know which e-mail
> account I should grant access to.
>
> Thanks,
> Micah
>
> On Tue, May 21, 2019 at 9:22 PM Micah Kornfield 
> wrote:
>
> > I agree on hand curation for now.
> >
> >  I'll try to setup a sign up spreadsheet for shepherding old PRs and once
> > that done assign reviewers/ping old PRs.  I expect to have something to
> > share by the weekend.
> >
> > On Tuesday, May 21, 2019, Wes McKinney  wrote:
> >
> >> I think maintainers or contributors should be responsible for closing
> >> PRs, it also helps with backlog curation (sometimes when a stale PR is
> >> closed the JIRA may also be closed if it's a Won't Fix)
> >>
> >> On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou 
> >> wrote:
> >> >
> >> >
> >> >
> >> > On 21/05/2019 at 20:02, Neal Richardson wrote:
> >> > > Automatically close stale PRs? https://github.com/probot/stale
> >> >
> >> > That doesn't sound like a good idea to me.
> >> >
> >> > Regards
> >> >
> >> > Antoine.
> >>
> >
>


Re: [DISCUSS] PR Backlog reduction

2019-05-28 Thread Micah Kornfield
Sorry for the delay.  I created
https://docs.google.com/spreadsheets/d/146lDg11c5ohgVkrOglrb42a1JB0Gm1qBRbnoDlvB8QY/edit#gid=0
as a simple way to distribute old PRs. If you are interested in helping,
please add a comment under the language and I'll add you.

PMC/Committers, I can share edit access if you let me know which e-mail
account I should grant access to.

Thanks,
Micah

On Tue, May 21, 2019 at 9:22 PM Micah Kornfield 
wrote:

> I agree on hand curation for now.
>
> I'll try to set up a sign-up spreadsheet for shepherding old PRs and, once
> that's done, assign reviewers/ping old PRs.  I expect to have something to
> share by the weekend.
>
> On Tuesday, May 21, 2019, Wes McKinney  wrote:
>
>> I think maintainers or contributors should be responsible for closing
>> PRs, it also helps with backlog curation (sometimes when a stale PR is
>> closed the JIRA may also be closed if it's a Won't Fix)
>>
>> On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou 
>> wrote:
>> >
>> >
>> >
>> > On 21/05/2019 at 20:02, Neal Richardson wrote:
>> > > Automatically close stale PRs? https://github.com/probot/stale
>> >
>> > That doesn't sound like a good idea to me.
>> >
>> > Regards
>> >
>> > Antoine.
>>
>


Re: [DISCUSS] PR Backlog reduction

2019-05-21 Thread Micah Kornfield
I agree on hand curation for now.

I'll try to set up a sign-up spreadsheet for shepherding old PRs and, once
that's done, assign reviewers/ping old PRs.  I expect to have something to
share by the weekend.

On Tuesday, May 21, 2019, Wes McKinney  wrote:

> I think maintainers or contributors should be responsible for closing
> PRs; it also helps with backlog curation (sometimes when a stale PR is
> closed the JIRA may also be closed if it's a Won't Fix).
>
> On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou  wrote:
> >
> >
> >
> > On 21/05/2019 at 20:02, Neal Richardson wrote:
> > > Automatically close stale PRs? https://github.com/probot/stale
> >
> > That doesn't sound like a good idea to me.
> >
> > Regards
> >
> > Antoine.
>


Re: [DISCUSS] PR Backlog reduction

2019-05-21 Thread Wes McKinney
I think maintainers or contributors should be responsible for closing
PRs; it also helps with backlog curation (sometimes when a stale PR is
closed the JIRA may also be closed if it's a Won't Fix).

On Tue, May 21, 2019 at 1:12 PM Antoine Pitrou  wrote:
>
>
>
> On 21/05/2019 at 20:02, Neal Richardson wrote:
> > Automatically close stale PRs? https://github.com/probot/stale
>
> That doesn't sound like a good idea to me.
>
> Regards
>
> Antoine.


Re: [DISCUSS] PR Backlog reduction

2019-05-21 Thread Antoine Pitrou



On 21/05/2019 at 20:02, Neal Richardson wrote:
> Automatically close stale PRs? https://github.com/probot/stale

That doesn't sound like a good idea to me.

Regards

Antoine.


Re: [DISCUSS] PR Backlog reduction

2019-05-21 Thread Neal Richardson
Automatically close stale PRs? https://github.com/probot/stale

On Tue, May 21, 2019 at 11:00 AM Wes McKinney  wrote:

> Any other thoughts about process to manage the backlog?
>
> On Thu, May 16, 2019 at 2:58 PM Wes McKinney  wrote:
> >
> > hi Micah,
> >
> > This sounds like a reasonable proposal, and I agree in particular for
> > regular contributors that it makes sense to close PRs that are not
> > close to being in merge-readiness to thin the noise of the patch queue
> >
> > We have some short-term issues such as various reviewers being busy
> > lately (e.g. I was on vacation in April, then heads down working on
> > ARROW-3144) but I agree that there are some structural issues with how
> > we're organizing code review efforts.
> >
> > Note that Apache Spark, with ~500 open PRs, created this dashboard
> > application to help manage the insanity
> >
> > https://spark-prs.appspot.com/
> >
> > Ultimately (in the next few years as the number of active contributors
> > grows) I expect that we'll have to do something similar.
> >
> > - Wes
> >
> > On Thu, May 16, 2019 at 2:34 PM Micah Kornfield wrote:
> > >
> > > Our backlog of open PRs is slowly creeping up.  This isn't great
> > > because it allows contributions to slip through the cracks (which in
> > > turn possibly turns off new contributors).  Perusing PRs I think
> > > things roughly fall into the following categories.
> > >
> > >
> > > 1.  PR is a work in progress that was never completed but was left
> > > open (mostly by regular Arrow contributors).
> > >
> > > 2.  PR stalled because changes were requested and the PR author never
> > > responded.
> > >
> > > 3.  PR stalled due to lack of consensus on approach/design.
> > >
> > > 4.  PR is blocked on some external dependency (mostly these are PRs by
> > > regular Arrow contributors).
> > >
> > >
> > > A straw-man proposal for handling these:
> > >
> > > 1.  Regular Arrow contributors, please close the PR if it isn't close
> > > to being ready and you aren't actively working on it.
> > >
> > > 2.  I think we should start assigning reviewers who will have the
> > > responsibility of:
> > >
> > >    a.  Pinging the contributor and working through the review with them.
> > >
> > >    b.  Closing out the PR in some form if there hasn't been activity
> > > in a 30-day period (either merging as is, making the necessary changes
> > > or closing the PR, and removing the tag from JIRA).
> > >
> > > 3.  Same as 2, but bring the discussion to the mailing list and try to
> > > have a formal vote if necessary.
> > >
> > > 4.  Same as 2, but tag the PR as blocked and the time window expands.
> > >
> > >
> > > The question then is how to manage assignment of PRs to reviewers.  I
> > > am happy to try to triage any PRs older than a week (assuming some PRs
> > > will be closed quickly with the current ad-hoc process) and load
> > > balance between volunteers (it would be great to have a doc someplace
> > > where people can express their available bandwidth and which languages
> > > they feel comfortable with).
> > >
> > >
> > > Thoughts/other proposals?
> > >
> > >
> > > Thanks,
> > >
> > > Micah
> > >
> > >
> > >
> > > P.S. A very rough analysis of PR tags gives the following counts.
> > >
> > >   29 C++
> > >   17 Python
> > >    8 Rust
> > >    7 WIP
> > >    7 Plasma
> > >    7 Java
> > >    5 R
> > >    4 Go
> > >    4 Flight
>


Re: [DISCUSS] PR Backlog reduction

2019-05-21 Thread Wes McKinney
Any other thoughts about process to manage the backlog?

On Thu, May 16, 2019 at 2:58 PM Wes McKinney  wrote:
>
> hi Micah,
>
> This sounds like a reasonable proposal, and I agree in particular for
> regular contributors that it makes sense to close PRs that are not
> close to being in merge-readiness to thin the noise of the patch queue
>
> We have some short-term issues such as various reviewers being busy
> lately (e.g. I was on vacation in April, then heads down working on
> ARROW-3144) but I agree that there are some structural issues with how
> we're organizing code review efforts.
>
> Note that Apache Spark, with ~500 open PRs, created this dashboard
> application to help manage the insanity
>
> https://spark-prs.appspot.com/
>
> Ultimately (in the next few years as the number of active contributors
> grows) I expect that we'll have to do something similar.
>
> - Wes
>
> On Thu, May 16, 2019 at 2:34 PM Micah Kornfield  wrote:
> >
> > Our backlog of open PRs is slowly creeping up.  This isn't great because it
> > allows contributions to slip through the cracks (which in turn possibly
> > turns off new contributors).  Perusing PRs I think things roughly fall into
> > the following categories.
> >
> >
> > 1.  PR is a work in progress that was never completed but was left open
> > (mostly by regular Arrow contributors).
> >
> > 2.  PR stalled because changes were requested and the PR author never
> > responded.
> >
> > 3.  PR stalled due to lack of consensus on approach/design.
> >
> > 4.  PR is blocked on some external dependency (mostly these are PRs by
> > regular Arrow contributors).
> >
> >
> > A straw-man proposal for handling these:
> >
> > 1.  Regular Arrow contributors, please close the PR if it isn't close to
> > being ready and you aren't actively working on it.
> >
> > 2.  I think we should start assigning reviewers who will have the
> > responsibility of:
> >
> >    a.  Pinging the contributor and working through the review with them.
> >
> >    b.  Closing out the PR in some form if there hasn't been activity in a
> > 30-day period (either merging as is, making the necessary changes or
> > closing the PR, and removing the tag from JIRA).
> >
> > 3.  Same as 2, but bring the discussion to the mailing list and try to have
> > a formal vote if necessary.
> >
> > 4.  Same as 2, but tag the PR as blocked and the time window expands.
> >
> >
> > The question then is how to manage assignment of PRs to reviewers.  I
> > am happy to try to triage any PRs older than a week (assuming some PRs will
> > be closed quickly with the current ad-hoc process) and load balance between
> > volunteers (it would be great to have a doc someplace where people can
> > express their available bandwidth and which languages they feel comfortable
> > with).
> >
> >
> > Thoughts/other proposals?
> >
> >
> > Thanks,
> >
> > Micah
> >
> >
> >
> > P.S. A very rough analysis of PR tags gives the following counts.
> >
> >   29 C++
> >   17 Python
> >    8 Rust
> >    7 WIP
> >    7 Plasma
> >    7 Java
> >    5 R
> >    4 Go
> >    4 Flight


Re: [DISCUSS] PR Backlog reduction

2019-05-16 Thread Wes McKinney
Hi Micah,

This sounds like a reasonable proposal. I agree, in particular, that
regular contributors should close PRs that are not close to merge-ready,
to thin the noise in the patch queue.

We have some short-term issues, such as various reviewers being busy
lately (e.g. I was on vacation in April, then heads down working on
ARROW-3144), but I agree that there are some structural issues with how
we're organizing code review efforts.

Note that Apache Spark, with ~500 open PRs, created this dashboard
application to help manage the insanity:

https://spark-prs.appspot.com/

Ultimately (in the next few years as the number of active contributors
grows) I expect that we'll have to do something similar.

- Wes

On Thu, May 16, 2019 at 2:34 PM Micah Kornfield  wrote:
>
> Our backlog of open PRs is slowly creeping up.  This isn't great because it
> allows contributions to slip through the cracks (which in turn possibly
> turns off new contributors).  Perusing PRs I think things roughly fall into
> the following categories.
>
>
> 1.  PR is a work in progress that was never completed but was left open
> (mostly by regular Arrow contributors).
>
> 2.  PR stalled because changes were requested and the PR author never
> responded.
>
> 3.  PR stalled due to lack of consensus on approach/design.
>
> 4.  PR is blocked on some external dependency (mostly these are PRs by
> regular Arrow contributors).
>
>
> A straw-man proposal for handling these:
>
> 1.  Regular Arrow contributors, please close the PR if it isn't close to
> being ready and you aren't actively working on it.
>
> 2.  I think we should start assigning reviewers who will have the
> responsibility of:
>
>    a.  Pinging the contributor and working through the review with them.
>
>    b.  Closing out the PR in some form if there hasn't been activity in a
> 30-day period (either merging as is, making the necessary changes or
> closing the PR, and removing the tag from JIRA).
>
> 3.  Same as 2, but bring the discussion to the mailing list and try to have
> a formal vote if necessary.
>
> 4.  Same as 2, but tag the PR as blocked and the time window expands.
>
>
> The question then is how to manage assignment of PRs to reviewers.  I
> am happy to try to triage any PRs older than a week (assuming some PRs will
> be closed quickly with the current ad-hoc process) and load balance between
> volunteers (it would be great to have a doc someplace where people can
> express their available bandwidth and which languages they feel comfortable
> with).
>
>
> Thoughts/other proposals?
>
>
> Thanks,
>
> Micah
>
>
>
> P.S. A very rough analysis of PR tags gives the following counts.
>
>   29 C++
>   17 Python
>    8 Rust
>    7 WIP
>    7 Plasma
>    7 Java
>    5 R
>    4 Go
>    4 Flight
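
The triage rules in the proposal quoted above could be sketched in Python
roughly as follows. This is only an illustration under stated assumptions:
the PullRequest record, the function names and the 90-day window for blocked
PRs are hypothetical (the proposal only says the window "expands"), and
round-robin is just one simple way to load balance between volunteers. The
one-week and 30-day thresholds are the ones named in the proposal.

```python
from dataclasses import dataclass
from datetime import date
from itertools import cycle


@dataclass
class PullRequest:
    """Hypothetical minimal record of an open PR."""
    number: int
    language: str          # e.g. the PR tag: "C++", "Python", "Java"
    opened: date
    last_activity: date
    blocked: bool = False  # category 4: waiting on an external dependency


def needs_triage(pr, today, min_age_days=7):
    """PRs older than a week are candidates for assignment to a reviewer."""
    return (today - pr.opened).days > min_age_days


def is_stale(pr, today, window_days=30, blocked_window_days=90):
    """True once a PR has had no activity for the whole window.

    blocked_window_days is an assumed value for the expanded window.
    """
    window = blocked_window_days if pr.blocked else window_days
    return (today - pr.last_activity).days >= window


def assign_reviewers(prs, volunteers):
    """Round-robin PRs over the volunteers signed up for each language.

    volunteers maps a language to a list of reviewer names. PRs in a
    language with no volunteers are mapped to None.
    """
    rotations = {lang: cycle(names) for lang, names in volunteers.items() if names}
    assignments = {}
    for pr in sorted(prs, key=lambda p: p.opened):  # oldest PRs first
        rotation = rotations.get(pr.language)
        assignments[pr.number] = next(rotation) if rotation else None
    return assignments
```

In practice the volunteers mapping would come from the sign-up spreadsheet,
and the stale check would only prompt the assigned reviewer to follow up
rather than close anything automatically.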