Re: Improving governance / committers (split from Spark Improvement Proposals thread)

2016-10-08 Thread Cody Koeninger
It's not about technical design disagreement as to matters of taste,
it's about familiarity with the domain.  To make an analogy, it's as
if a committer in MLlib was firmly intent on, I dunno, treating a
collection of categorical variables as if it were an ordered range of
continuous variables.  It's just wrong.  That kind of thing, to a
greater or lesser degree, has been going on related to the Kafka
modules, for years.

On Sat, Oct 8, 2016 at 4:11 PM, Matei Zaharia  wrote:
> This makes a lot of sense; just to comment on a few things:
>
>> - More committers
>> Just looking at the ratio of committers to open tickets, or committers
>> to contributors, I don't think you have enough human power.
>> I realize this is a touchy issue.  I don't have a dog in this fight,
>> because I'm not on either coast nor in a big company that views
>> committership as a political thing.  I just think you need more people
>> to do the work, and more diversity of viewpoint.
>> It's unfortunate that the Apache governance process involves giving
>> someone all the keys or none of the keys, but until someone really
>> starts screwing up, I think it's better to err on the side of
>> accepting hard-working people.
>
> This is something the PMC is actively discussing. Historically, we've added 
> committers when people contributed a new module or feature, basically to the 
> point where other developers are asking them to review changes in that area 
> (https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-BecomingaCommitter).
>  For example, we added the original authors of GraphX when we merged in 
> GraphX, the authors of new ML algorithms, etc. However, there's a good 
> argument that some areas are simply not covered well now and we should add 
> people there. Also, as the project has grown, there are also more people who 
> focus on smaller fixes and are nonetheless contributing a lot.
>
>> - Each major area of the code needs at least one person who cares
>> about it that is empowered with a vote, otherwise decisions get made
>> that don't make technical sense.
>> I don't know if anyone with a vote is shepherding GraphX (or maybe
>> it's just dead), the Mesos relationship has always been weird, no one
>> with a vote really groks Kafka.
>> marmbrus and zsxwing are getting there quickly on the Kafka side, and
>> I appreciate it, but it's been bad for a while.
>> Because I don't have any political power, my response to seeing things
>> that I know are technically dangerous has been to yell really loud
>> until someone listens, which sucks for everyone involved.
>> I already apologized to Michael privately; Ryan, I'm sorry, it's not about 
>> you.
>> This seems pretty straightforward to fix, if politically awkward:
>> those people exist, just give them a vote.
>> Failing that, listen the first or second time they say something not
>> the third or fourth, and if it doesn't make sense, ask.
>
> Just as a note here -- it's true that some areas are not super well covered, 
> but I also hope to avoid a situation where people have to yell to be listened 
> to. I can't say anything about *all* technical discussions we've ever had, 
> but historically, people have been able to comment on the design of many 
> things without yelling. This is actually important because a culture of 
> having to yell can drive away contributors. So it's awesome that you yelled 
> about the Kafka source stuff, but at the same time, hopefully we make these 
> types of things work without yelling. This would be a problem even if there 
> were committers with more expertise in each area -- what if someone disagrees 
> with the committers?
>
> Matei
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-08 Thread vaquar khan
Matei,

I like your idea of automatically emailing a report; however, it won't solve
the problem.

Over the last few years I have seen many enthusiastic people send mail saying
they want to become Apache Spark contributors. Our response is usually "you are
most welcome, here is JIRA, go and start working on issues."
After a few days or months of struggle they lose interest because they cannot
work out where to start.

1) We should assign a mentor to each enthusiastic newcomer to help them
understand the issues and get started as a contributor.
2) For old, high-priority JIRAs we should create a separate group that writes
up a complete analysis in JIRA so that others can go and fix the issue.



Regards,
Vaquar khan

On Sat, Oct 8, 2016 at 3:49 PM, Matei Zaharia 
wrote:

> I like this idea of asking them. BTW, one other thing we can do *provided
> the JIRAs are eventually under control* is to create a filter for old JIRAs
> that have not received a response in X amount of time and have the system
> automatically email the dev list with this report every month. Then
> everyone can see the list of items and maybe be reminded to take care to
> clean it up. This only works if the list is manageable and you actually
> want to read all of it.
>
> Matei
>
> On Oct 8, 2016, at 9:01 AM, Cody Koeninger  wrote:
>
> Yeah, I've interacted with other projects that used that system and it was
> pleasant.
>
> 1. "this is getting closed cause its stale, let us know if thats a problem"
> 2. "actually that matters to us"
> 3. "ok well leave it open"
>
> I'd be fine with totally automating step 1 as long as a human was involved
> at step 2 and 3
>
>
> On Saturday, October 8, 2016, assaf.mendelson 
> wrote:
>
>> I don’t really have much experience with large open source projects but I
>> have some experience with having lots of issues with no one handling them.
>> Automation proved a good solution in my experience, but one thing that I
>> found which was really important is giving people a chance to say “don’t
>> close this please”.
>>
>> Basically, before closing you can send an email to the reporter (and
>> probably people who are watching the issue) and tell them this is going to
>> be closed. Allow them an option to ping back saying “don’t close this
>> please” which would ping committers for input (as if there were 5+ votes as
>> described by Nick).
>>
>> The main reason for this is that many times people find solutions and the
>> issue does become stale but at other times, the issue is still important,
>> it is just that no one noticed it because of the noise of other issues.
>>
>> Thanks,
>>
>> Assaf.
>>
>> *From:* Nicholas Chammas [via Apache Spark Developers List]
>> *Sent:* Saturday, October 08, 2016 12:42 AM
>> *To:* Mendelson, Assaf
>> *Subject:* Re: Improving volunteer management / JIRAs (split from Spark
>> Improvement Proposals thread)
>>
>>
>>
>> I agree with Cody and others that we need some automation — or at least
>> an adjusted process — to help us manage organic contributions better.
>>
>> The objections about automated closing being potentially abrasive are
>> understood, but I wouldn’t accept that as a defeat for automation. Instead,
>> it seems like a constraint we should impose on any proposed solution: Make
>> sure it doesn’t turn contributors off. Rolling as we have been won’t cut
>> it, and I don’t think adding committers will ever be a sufficient solution
>> to this particular problem.
>>
>> To me, it seems like we need a way to filter out viable contributions
>> with community support from other contributions when it comes to deciding
>> that automated action is appropriate. Our current tooling isn’t perfect,
>> but perhaps we can leverage it to create such a filter.
>>
>> For example, consider the following strawman proposal for how to cut down
>> on the number of pending but unviable proposals, and simultaneously help
>> contributors organize to promote viable proposals and get the attention of
>> committers:
>>
>> 1.  Have a bot scan for *stale* JIRA issues and PRs—i.e. they
>> haven’t been updated in 20+ days (or D+ days, if you prefer).
>>
>> 2.  Depending on the level of community support, either close the
>> item or ping specific people for action. Specifically:
>> a. If the JIRA/PR has no input from a committer and the JIRA/PR has 5+
>> votes (or V+ votes), ping committers for input. (For PRs, you could
>> count comments from different people, or thumbs up on the initial PR post.)
>> b. If the JIRA/PR has no input from a committer and the JIRA/PR has less
>> than V votes, close it with a gentle message asking the contributor to
>> solicit support from either the community or a committer, and try again
>> later.
>> c. If the JIRA/PR has input from a committer or committers, ping them for
>> an update.
>>
>> This is just a rough idea. The point is that when contributors have stale
>> proposals that they don’t close, committers need to take action. A little
>> automation to selectivel

Re: Spark Improvement Proposals

2016-10-08 Thread vaquar khan
+1 for SIP labels; waiting for Reynold's detailed proposal.

Regards,
Vaquar khan

On 8 Oct 2016 16:22, "Matei Zaharia"  wrote:

> Sounds good. Just to comment on the compatibility part:
>
> > I meant changing public user interfaces.  I think the first design is
> > unlikely to be right, because it's done at a time when you have the
> > least information.  As a user, I find it considerably more frustrating
> > to be unable to use a tool to get my job done, than I do having to
> > make minor changes to my code in order to take advantage of features.
> > I've seen committers be seriously reluctant to allow changes to
> > @experimental code that are needed in order for it to really work
> > right.  You need to be able to iterate, and if people on both sides of
> > the fence aren't going to respect that some newer apis are subject to
> > change, then why even mark them as such?
> >
> > Ideally a finished SIP should give me a checklist of things that an
> > implementation must do, and things that it doesn't need to do.
> > Contributors/committers should be seriously discouraged from putting
> > out a version 0.1 that doesn't have at least a prototype
> > implementation of all those things, especially if they're then going
> > to argue against interface changes necessary to get the rest of
> > the things done in the 0.2 version.
>
> Experimental APIs and alpha components are indeed supposed to be changeable
> (https://cwiki.apache.org/confluence/display/SPARK/Spark+Versioning+Policy).
> Maybe people are being too conservative in some
> cases, but I do want to note that regardless of what precise policy we try
> to write down, this type of issue will ultimately be a judgment call. Is it
> worth making a small cosmetic change in an API that's marked experimental,
> but has been used widely for a year? Perhaps not. Is it worth making it in
> something one month old, or even in an older API as we move to 2.0? Maybe
> yes. I think we should just discuss each one (start an email thread if
> resolving it on JIRA is too complex) and perhaps be more religious about
> making things non-experimental when we think they're done.
>
> Matei
>
>
> >
> >
> > On Fri, Oct 7, 2016 at 2:18 PM, Reynold Xin  wrote:
> >> I like the lightweight proposal to add a SIP label.
> >>
> >> During Spark 2.0 development, Tom (Graves) and I suggested using wiki to
> >> track the list of major changes, but that never really materialized due to
> >> the overhead. Adding a SIP label on major JIRAs and then linking to them
> >> prominently on the Spark website makes a lot of sense.
> >>
> >>
> >> On Fri, Oct 7, 2016 at 10:50 AM, Matei Zaharia wrote:
> >>>
> >>> For the improvement proposals, I think one major point was to make them
> >>> really visible to users who are not contributors, so we should do more than
> >>> sending stuff to dev@. One very lightweight idea is to have a new type of
> >>> JIRA called a SIP and have a link to a filter that shows all such JIRAs from
> >>> http://spark.apache.org. I also like the idea of SIP and design doc
> >>> templates (in fact many projects have them).
> >>>
> >>> Matei
> >>>
> >>> On Oct 7, 2016, at 10:38 AM, Reynold Xin  wrote:
> >>>
> >>> I called Cody last night and talked about some of the topics in his email.
> >>> It became clear to me Cody genuinely cares about the project.
> >>>
> >>> Some of the frustrations come from the success of the project itself
> >>> becoming very "hot", and it is difficult to get clarity from people who
> >>> don't dedicate all their time to Spark. In fact, it is in some ways similar
> >>> to scaling an engineering team in a successful startup: old processes that
> >>> worked well might not work so well when it gets to a certain size, cultures
> >>> can get diluted, building culture vs building process, etc.
> >>>
> >>> I also really like to have a more visible process for larger changes,
> >>> especially major user facing API changes. Historically we upload design docs
> >>> for major changes, but it is not always consistent, and it is difficult to
> >>> ensure the quality of the docs, due to the volunteering nature of the
> >>> organization.
> >>>
> >>> Some of the more concrete ideas we discussed focus on building a culture
> >>> to improve clarity:
> >>>
> >>> - Process: Large changes should have design docs posted on JIRA. One thing
> >>> Cody and I didn't discuss but an idea that just came to me is we should
> >>> create a design doc template for the project and ask everybody to follow.
> >>> The design doc template should also explicitly list goals and non-goals, to
> >>> make design doc more consistent.
> >>>
> >>> - Process: Email dev@ to solicit feedback. We have done this with some
> >>> changes, but again very inconsistent. Just posting something on JIRA isn't
> >>> sufficient, because there are simply too many JIRAs and the signal gets lost
> >>> in the noise. While this is generally impossible to enforce because we can't
> >>> force all volun

Re: Spark Improvement Proposals

2016-10-08 Thread Matei Zaharia
Sounds good. Just to comment on the compatibility part:

> I meant changing public user interfaces.  I think the first design is
> unlikely to be right, because it's done at a time when you have the
> least information.  As a user, I find it considerably more frustrating
> to be unable to use a tool to get my job done, than I do having to
> make minor changes to my code in order to take advantage of features.
> I've seen committers be seriously reluctant to allow changes to
> @experimental code that are needed in order for it to really work
> right.  You need to be able to iterate, and if people on both sides of
> the fence aren't going to respect that some newer apis are subject to
> change, then why even mark them as such?
> 
> Ideally a finished SIP should give me a checklist of things that an
> implementation must do, and things that it doesn't need to do.
> Contributors/committers should be seriously discouraged from putting
> out a version 0.1 that doesn't have at least a prototype
> implementation of all those things, especially if they're then going
> to argue against interface changes necessary to get the rest of
> the things done in the 0.2 version.

Experimental APIs and alpha components are indeed supposed to be changeable 
(https://cwiki.apache.org/confluence/display/SPARK/Spark+Versioning+Policy). 
Maybe people are being too conservative in some cases, but I do want to note 
that regardless of what precise policy we try to write down, this type of issue 
will ultimately be a judgment call. Is it worth making a small cosmetic change 
in an API that's marked experimental, but has been used widely for a year? 
Perhaps not. Is it worth making it in something one month old, or even in an 
older API as we move to 2.0? Maybe yes. I think we should just discuss each one 
(start an email thread if resolving it on JIRA is too complex) and perhaps be 
more religious about making things non-experimental when we think they're done.

Matei


> 
> 
> On Fri, Oct 7, 2016 at 2:18 PM, Reynold Xin  wrote:
>> I like the lightweight proposal to add a SIP label.
>> 
>> During Spark 2.0 development, Tom (Graves) and I suggested using wiki to
>> track the list of major changes, but that never really materialized due to
>> the overhead. Adding a SIP label on major JIRAs and then linking to them
>> prominently on the Spark website makes a lot of sense.
>> 
>> 
>> On Fri, Oct 7, 2016 at 10:50 AM, Matei Zaharia 
>> wrote:
>>> 
>>> For the improvement proposals, I think one major point was to make them
>>> really visible to users who are not contributors, so we should do more than
>>> sending stuff to dev@. One very lightweight idea is to have a new type of
>>> JIRA called a SIP and have a link to a filter that shows all such JIRAs from
>>> http://spark.apache.org. I also like the idea of SIP and design doc
>>> templates (in fact many projects have them).
>>> 
>>> Matei
>>> 
>>> On Oct 7, 2016, at 10:38 AM, Reynold Xin  wrote:
>>> 
>>> I called Cody last night and talked about some of the topics in his email.
>>> It became clear to me Cody genuinely cares about the project.
>>> 
>>> Some of the frustrations come from the success of the project itself
>>> becoming very "hot", and it is difficult to get clarity from people who
>>> don't dedicate all their time to Spark. In fact, it is in some ways similar
>>> to scaling an engineering team in a successful startup: old processes that
>>> worked well might not work so well when it gets to a certain size, cultures
>>> can get diluted, building culture vs building process, etc.
>>> 
>>> I also really like to have a more visible process for larger changes,
>>> especially major user facing API changes. Historically we upload design docs
>>> for major changes, but it is not always consistent, and it is difficult to
>>> ensure the quality of the docs, due to the volunteering nature of the
>>> organization.
>>> 
>>> Some of the more concrete ideas we discussed focus on building a culture
>>> to improve clarity:
>>> 
>>> - Process: Large changes should have design docs posted on JIRA. One thing
>>> Cody and I didn't discuss but an idea that just came to me is we should
>>> create a design doc template for the project and ask everybody to follow.
>>> The design doc template should also explicitly list goals and non-goals, to
>>> make design doc more consistent.
>>> 
>>> - Process: Email dev@ to solicit feedback. We have done this with some
>>> changes, but again very inconsistent. Just posting something on JIRA isn't
>>> sufficient, because there are simply too many JIRAs and the signal gets lost
>>> in the noise. While this is generally impossible to enforce because we can't
>>> force all volunteers to conform to a process (or they might not even be
>>> aware of this),  those who are more familiar with the project can help by
>>> emailing the dev@ when they see something that hasn't been.
>>> 
>>> - Culture: The design doc author(s) should be open to feedback. A design
>>> doc should serve as the base fo

Re: Improving governance / committers (split from Spark Improvement Proposals thread)

2016-10-08 Thread Matei Zaharia
This makes a lot of sense; just to comment on a few things:

> - More committers
> Just looking at the ratio of committers to open tickets, or committers
> to contributors, I don't think you have enough human power.
> I realize this is a touchy issue.  I don't have a dog in this fight,
> because I'm not on either coast nor in a big company that views
> committership as a political thing.  I just think you need more people
> to do the work, and more diversity of viewpoint.
> It's unfortunate that the Apache governance process involves giving
> someone all the keys or none of the keys, but until someone really
> starts screwing up, I think it's better to err on the side of
> accepting hard-working people.

This is something the PMC is actively discussing. Historically, we've added 
committers when people contributed a new module or feature, basically to the 
point where other developers are asking them to review changes in that area 
(https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-BecomingaCommitter).
 For example, we added the original authors of GraphX when we merged in GraphX, 
the authors of new ML algorithms, etc. However, there's a good argument that 
some areas are simply not covered well now and we should add people there. 
Also, as the project has grown, there are also more people who focus on smaller 
fixes and are nonetheless contributing a lot.

> - Each major area of the code needs at least one person who cares
> about it that is empowered with a vote, otherwise decisions get made
> that don't make technical sense.
> I don't know if anyone with a vote is shepherding GraphX (or maybe
> it's just dead), the Mesos relationship has always been weird, no one
> with a vote really groks Kafka.
> marmbrus and zsxwing are getting there quickly on the Kafka side, and
> I appreciate it, but it's been bad for a while.
> Because I don't have any political power, my response to seeing things
> that I know are technically dangerous has been to yell really loud
> until someone listens, which sucks for everyone involved.
> I already apologized to Michael privately; Ryan, I'm sorry, it's not about 
> you.
> This seems pretty straightforward to fix, if politically awkward:
> those people exist, just give them a vote.
> Failing that, listen the first or second time they say something not
> the third or fourth, and if it doesn't make sense, ask.

Just as a note here -- it's true that some areas are not super well covered, 
but I also hope to avoid a situation where people have to yell to be listened 
to. I can't say anything about *all* technical discussions we've ever had, but 
historically, people have been able to comment on the design of many things 
without yelling. This is actually important because a culture of having to yell 
can drive away contributors. So it's awesome that you yelled about the Kafka 
source stuff, but at the same time, hopefully we make these types of things 
work without yelling. This would be a problem even if there were committers 
with more expertise in each area -- what if someone disagrees with the 
committers?

Matei





Re: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-08 Thread Matei Zaharia
I like this idea of asking them. BTW, one other thing we can do *provided the 
JIRAs are eventually under control* is to create a filter for old JIRAs that 
have not received a response in X amount of time and have the system 
automatically email the dev list with this report every month. Then everyone 
can see the list of items and maybe be reminded to take care to clean it up. 
This only works if the list is manageable and you actually want to read all of 
it.

Matei

> On Oct 8, 2016, at 9:01 AM, Cody Koeninger  wrote:
> 
> Yeah, I've interacted with other projects that used that system and it was 
> pleasant.
> 
> 1. "this is getting closed cause its stale, let us know if thats a problem"
> 2. "actually that matters to us"
> 3. "ok well leave it open"
> 
> I'd be fine with totally automating step 1 as long as a human was involved at 
> step 2 and 3
> 
> 
> On Saturday, October 8, 2016, assaf.mendelson wrote:
> I don’t really have much experience with large open source projects but I 
> have some experience with having lots of issues with no one handling them. 
> Automation proved a good solution in my experience, but one thing that I 
> found which was really important is giving people a chance to say “don’t 
> close this please”.
> 
> Basically, before closing you can send an email to the reporter (and
> probably people who are watching the issue) and tell them this is going to be 
> closed. Allow them an option to ping back saying “don’t close this please” 
> which would ping committers for input (as if there were 5+ votes as described 
> by Nick).
> 
> The main reason for this is that many times people find solutions and the
> issue does become stale but at other times, the issue is still important, it 
> is just that no one noticed it because of the noise of other issues.
> 
> Thanks,
> 
> Assaf.
>
> From: Nicholas Chammas [via Apache Spark Developers List]
> Sent: Saturday, October 08, 2016 12:42 AM
> To: Mendelson, Assaf
> Subject: Re: Improving volunteer management / JIRAs (split from Spark 
> Improvement Proposals thread)
> 
>  
> 
> I agree with Cody and others that we need some automation — or at least an 
> adjusted process — to help us manage organic contributions better.
> 
> The objections about automated closing being potentially abrasive are 
> understood, but I wouldn’t accept that as a defeat for automation. Instead, 
> it seems like a constraint we should impose on any proposed solution: Make 
> sure it doesn’t turn contributors off. Rolling as we have been won’t cut it, 
> and I don’t think adding committers will ever be a sufficient solution to 
> this particular problem.
> 
> To me, it seems like we need a way to filter out viable contributions with 
> community support from other contributions when it comes to deciding that 
> automated action is appropriate. Our current tooling isn’t perfect, but 
> perhaps we can leverage it to create such a filter.
> 
> For example, consider the following strawman proposal for how to cut down on 
> the number of pending but unviable proposals, and simultaneously help 
> contributors organize to promote viable proposals and get the attention of 
> committers:
> 
> 1.  Have a bot scan for stale JIRA issues and PRs—i.e. they haven’t been 
> updated in 20+ days (or D+ days, if you prefer).
> 
> 2.  Depending on the level of community support, either close the item or 
> ping specific people for action. Specifically:
> a. If the JIRA/PR has no input from a committer and the JIRA/PR has 5+ votes 
> (or V+ votes), ping committers for input. (For PRs, you could count comments 
> from different people, or thumbs up on the initial PR post.)
> b. If the JIRA/PR has no input from a committer and the JIRA/PR has less than 
> V votes, close it with a gentle message asking the contributor to solicit 
> support from either the community or a committer, and try again later.
> c. If the JIRA/PR has input from a committer or committers, ping them for an 
> update.
> 
> This is just a rough idea. The point is that when contributors have stale 
> proposals that they don’t close, committers need to take action. A little 
> automation to selectively bring contributions to the attention of committers 
> can perhaps help them manage the backlog of stale contributions. The 
> “selective” part is implemented in this strawman proposal by using JIRA votes 
> as a crude proxy for when the community is interested in something, but it 
> could be anything.
> 
> Also, this doesn’t have to be used just to clear out stale proposals. Once 
> the initial backlog is trimmed down, you could set D to 5 days and use this 
> as a regular way to bring contributions to the attention of committers.
> 
> I dunno if people think this is perhaps too complex, but at our scale I feel 
> we need some kind of loose but a
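
The strawman triage rules quoted above (steps 1 and 2a to 2c), together with
Assaf's "don't close this please" escape hatch, can be sketched in Python
roughly as follows. This is only an illustration under stated assumptions: the
`Item` fields, the `Action` names, and the `triage` function are hypothetical
and do not correspond to any existing Spark, JIRA, or GitHub tooling; the
defaults D=20 and V=5 are the example values from the proposal, and checking
for committer input before counting votes is an assumed ordering.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Action(Enum):
    NONE = "not stale, leave alone"
    PING_COMMITTERS = "ping committers for input"
    CLOSE_GENTLY = "close with a gentle message"
    PING_FOR_UPDATE = "ping the involved committers for an update"


@dataclass
class Item:
    """A JIRA issue or PR reduced to the fields the rules need (hypothetical model)."""
    key: str
    last_updated: date
    votes: int
    has_committer_input: bool
    keep_open_requested: bool = False  # reporter replied "don't close this please"


def triage(item: Item, today: date, stale_days: int = 20, min_votes: int = 5) -> Action:
    # Step 1: only items untouched for D+ days count as stale.
    if (today - item.last_updated).days < stale_days:
        return Action.NONE
    # Step 2c: a committer is already involved, so ask them for an update.
    if item.has_committer_input:
        return Action.PING_FOR_UPDATE
    # Step 2a: V+ votes, or an explicit keep-open request (treated like
    # V+ votes, per Assaf's suggestion), escalates to committers.
    if item.votes >= min_votes or item.keep_open_requested:
        return Action.PING_COMMITTERS
    # Step 2b: no committer input and fewer than V votes.
    return Action.CLOSE_GENTLY


if __name__ == "__main__":
    # "EXAMPLE-1" is a made-up key, not a real ticket.
    stale = Item("EXAMPLE-1", date(2016, 9, 1), votes=2, has_committer_input=False)
    print(triage(stale, today=date(2016, 10, 8)).value)
```

Keeping the decision as a pure function of the item and the date leaves the
human steps (replying to a keep-open request, acting on a ping) outside the
automation, in line with the preference for automating only the first step.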

Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Cody Koeninger
Cool, I'll start going through stuff as I have time.  Already closed
one, if anyone sees a problem let me know.

Still think it would be nice to have some way to make it obvious to
the people who have the will and knowledge to do it that it's ok for
them to do it :)

On Sat, Oct 8, 2016 at 2:19 PM, Reynold Xin  wrote:
> I think so (at least I think it is socially acceptable). Of course, use good
> judgement here :)
>
>
>
> On Sat, Oct 8, 2016 at 12:06 PM, Cody Koeninger  wrote:
>>
>> So to be clear, can I go clean up the Kafka cruft?
>>
>> On Sat, Oct 8, 2016 at 1:41 PM, Reynold Xin  wrote:
>> >
>> > On Sat, Oct 8, 2016 at 2:09 AM, Sean Owen  wrote:
>> >>
>> >>
>> >> - Resolve as Fixed if there's a change you can point to that resolved the
>> >> issue
>> >> - If the issue is a proper subset of another issue, mark it a Duplicate of
>> >> that issue (rather than the other way around)
>> >> - If it's probably resolved, but not obvious what fixed it or when, then
>> >> Cannot Reproduce or Not a Problem
>> >> - Obsolete issue? Not a Problem
>> >> - If it's a coherent issue but does not seem like there is support or
>> >> interest in acting on it, then Won't Fix
>> >> - If the issue doesn't make sense (non-Spark issue, etc) then Invalid
>> >> - I tend to mark Umbrellas as "Done" when done if they're just containers
>> >> - Try to set Fix version
>> >> - Try to set Assignee to the person who most contributed to the
>> >> resolution. Usually the person who opened the PR. Strong preference for ties
>> >> going to the more 'junior' contributor
>> >
>> >
>> > +1
>> >
>> > This is consistent with my understanding. It would be good to document these
>> > on JIRA. And I second "The only ones I think are sort of important are
>> > getting the Duplicate pointers right, and possibly making sure that Fixed
>> > issues have a clear path to finding what change fixed it and when. The rest
>> > doesn't matter much."
>> >
>> > I also think it is a good idea to give people rights to close tickets to
>> > help with JIRA maintenance. We can always revoke that if we see a malicious
>> > actor (or somebody with extremely bad judgement), but we are pretty far away
>> > from that right now.
>> >
>> >
>
>
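
The resolution guidelines quoted above read like a small decision table, so
here is a hedged Python sketch of one way to encode them. The resolution names
come from the quoted list; everything else (the `suggest_resolutions` function,
its keyword flags, and the order in which the rules are checked) is an
assumption made for illustration and is not part of JIRA or of any Spark
tooling.

```python
from enum import Enum
from typing import List


class Resolution(Enum):
    FIXED = "Fixed"
    DUPLICATE = "Duplicate"
    CANNOT_REPRODUCE = "Cannot Reproduce"
    NOT_A_PROBLEM = "Not a Problem"
    WONT_FIX = "Won't Fix"
    INVALID = "Invalid"
    DONE = "Done"


def suggest_resolutions(
    *,
    makes_sense: bool = True,             # False for non-Spark issues, etc.
    umbrella_done: bool = False,          # container-only umbrella whose work is done
    fixing_change_known: bool = False,    # there is a change you can point to
    subset_of_other_issue: bool = False,  # proper subset of another issue
    probably_resolved: bool = False,      # resolved, but unclear by what or when
    obsolete: bool = False,
    has_support: bool = True,             # someone is interested in acting on it
) -> List[Resolution]:
    """Return the acceptable resolutions, or [] to leave the issue open."""
    # The check order below is an assumed priority, not stated in the thread.
    if not makes_sense:
        return [Resolution.INVALID]
    if umbrella_done:
        return [Resolution.DONE]
    if fixing_change_known:
        return [Resolution.FIXED]
    if subset_of_other_issue:
        # Mark the subset as a Duplicate of the larger issue, not the reverse.
        return [Resolution.DUPLICATE]
    if probably_resolved:
        return [Resolution.CANNOT_REPRODUCE, Resolution.NOT_A_PROBLEM]
    if obsolete:
        return [Resolution.NOT_A_PROBLEM]
    if not has_support:
        return [Resolution.WONT_FIX]
    return []


if __name__ == "__main__":
    print([r.value for r in suggest_resolutions(probably_resolved=True)])
```

Setting the Fix version and the Assignee is left out on purpose: the quoted
advice treats those as best-effort follow-ups rather than part of choosing a
resolution.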




Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Reynold Xin
I think so (at least I think it is socially acceptable). Of course, use
good judgement here :)



On Sat, Oct 8, 2016 at 12:06 PM, Cody Koeninger  wrote:

> So to be clear, can I go clean up the Kafka cruft?
>
> On Sat, Oct 8, 2016 at 1:41 PM, Reynold Xin  wrote:
> >
> > On Sat, Oct 8, 2016 at 2:09 AM, Sean Owen  wrote:
> >>
> >>
> >> - Resolve as Fixed if there's a change you can point to that resolved the
> >> issue
> >> - If the issue is a proper subset of another issue, mark it a Duplicate of
> >> that issue (rather than the other way around)
> >> - If it's probably resolved, but not obvious what fixed it or when, then
> >> Cannot Reproduce or Not a Problem
> >> - Obsolete issue? Not a Problem
> >> - If it's a coherent issue but does not seem like there is support or
> >> interest in acting on it, then Won't Fix
> >> - If the issue doesn't make sense (non-Spark issue, etc) then Invalid
> >> - I tend to mark Umbrellas as "Done" when done if they're just containers
> >> - Try to set Fix version
> >> - Try to set Assignee to the person who most contributed to the
> >> resolution. Usually the person who opened the PR. Strong preference for ties
> >> going to the more 'junior' contributor
> >
> >
> > +1
> >
> > This is consistent with my understanding. It would be good to document these
> > on JIRA. And I second "The only ones I think are sort of important are
> > getting the Duplicate pointers right, and possibly making sure that Fixed
> > issues have a clear path to finding what change fixed it and when. The rest
> > doesn't matter much."
> >
> > I also think it is a good idea to give people rights to close tickets to
> > help with JIRA maintenance. We can always revoke that if we see a malicious
> > actor (or somebody with extremely bad judgement), but we are pretty far away
> > from that right now.
> >
> >
> >
>


Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Cody Koeninger
So to be clear, can I go clean up the Kafka cruft?

On Sat, Oct 8, 2016 at 1:41 PM, Reynold Xin  wrote:
>
> On Sat, Oct 8, 2016 at 2:09 AM, Sean Owen  wrote:
>>
>>
>> - Resolve as Fixed if there's a change you can point to that resolved the
>> issue
>> - If the issue is a proper subset of another issue, mark it a Duplicate of
>> that issue (rather than the other way around)
>> - If it's probably resolved, but not obvious what fixed it or when, then
>> Cannot Reproduce or Not a Problem
>> - Obsolete issue? Not a Problem
>> - If it's a coherent issue but does not seem like there is support or
>> interest in acting on it, then Won't Fix
>> - If the issue doesn't make sense (non-Spark issue, etc) then Invalid
>> - I tend to mark Umbrellas as "Done" when done if they're just containers
>> - Try to set Fix version
>> - Try to set Assignee to the person who most contributed to the
>> resolution. Usually the person who opened the PR. Strong preference for ties
>> going to the more 'junior' contributor
>
>
> +1
>
> This is consistent with my understanding. It would be good to document these
> on JIRA. And I second "The only ones I think are sort of important are
> getting the Duplicate pointers right, and possibly making sure that Fixed
> issues have a clear path to finding what change fixed it and when. The rest
> doesn't matter much."
>
> I also think it is a good idea to give people rights to close tickets to
> help with JIRA maintenance. We can always revoke that if we see a malicious
> actor (or somebody with extremely bad judgement), but we are pretty far away
> from that right now.
>
>
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Reynold Xin
On Sat, Oct 8, 2016 at 2:09 AM, Sean Owen  wrote:

>
> - Resolve as Fixed if there's a change you can point to that resolved the
> issue
> - If the issue is a proper subset of another issue, mark it a Duplicate of
> that issue (rather than the other way around)
> - If it's probably resolved, but not obvious what fixed it or when, then
> Cannot Reproduce or Not a Problem
> - Obsolete issue? Not a Problem
> - If it's a coherent issue but does not seem like there is support or
> interest in acting on it, then Won't Fix
> - If the issue doesn't make sense (non-Spark issue, etc) then Invalid
> - I tend to mark Umbrellas as "Done" when done if they're just containers
> - Try to set Fix version
> - Try to set Assignee to the person who most contributed to the
> resolution. Usually the person who opened the PR. Strong preference for
> ties going to the more 'junior' contributor
>

+1

This is consistent with my understanding. It would be good to document
these on JIRA. And I second "The only ones I think are sort of important
are getting the Duplicate pointers right, and possibly making sure that
Fixed issues have a clear path to finding what change fixed it and when.
The rest doesn't matter much."

I also think it is a good idea to give people rights to close tickets to
help with JIRA maintenance. We can always revoke that if we see a malicious
actor (or somebody with extremely bad judgement), but we are pretty far
away from that right now.


Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Xiao Li
Thank you, Sean Owen! Your guideline looks pretty good. I will try to
follow it when closing JIRAs. Please correct me if you find anything
that is not appropriate.

Actually, there are almost 1000 unresolved SQL JIRAs. Compared with the
other components, I think SQL might be the biggest one. I will try to
go over all of them as best I can. Then, maybe the other committers and
contributors can go over the still-open JIRAs again and see whether
some of them should be closed or resolved in the next releases.

In the future, personally, I will try to do it whenever I am free.
Hopefully, it can help the community.

BTW, since I joined the Spark community, I have noticed that
Sean Owen, Reynold Xin, and Yin Huai do this on a regular basis. I
believe most of us appreciate their contribution to the community. Here, I
just want to say thank you to them: THANK YOU!

Cheers,

Xiao

2016-10-08 10:56 GMT-07:00 Felix Cheung :
> +1 on this proposal and everyone can contribute to updates and discussions
> on JIRAs
>
> Will be great if this could be put on the Spark wiki.
>
>
>
>
>
> On Sat, Oct 8, 2016 at 9:05 AM -0700, "Ted Yu"  wrote:
>
> Makes sense.
>
> I trust Hyukjin, Holden and Cody's judgement in respective areas.
>
> I just wish to see more participation from the committers.
>
> Thanks
>
>> On Oct 8, 2016, at 8:27 AM, Sean Owen  wrote:
>>
>> Hyukjin
>



Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Felix Cheung
+1 on this proposal and everyone can contribute to updates and discussions on 
JIRAs

Will be great if this could be put on the Spark wiki.





On Sat, Oct 8, 2016 at 9:05 AM -0700, "Ted Yu" <yuzhih...@gmail.com> wrote:

Makes sense.

I trust Hyukjin, Holden and Cody's judgement in respective areas.

I just wish to see more participation from the committers.

Thanks

> On Oct 8, 2016, at 8:27 AM, Sean Owen  wrote:
>
> Hyukjin




Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Ted Yu
Makes sense. 

I trust Hyukjin, Holden and Cody's judgement in respective areas. 

I just wish to see more participation from the committers. 

Thanks 

> On Oct 8, 2016, at 8:27 AM, Sean Owen  wrote:
> 
> Hyukjin




Re: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-08 Thread Cody Koeninger
Yeah, I've interacted with other projects that used that system and it was
pleasant.

1. "this is getting closed because it's stale, let us know if that's a problem"
2. "actually that matters to us"
3. "ok, we'll leave it open"

I'd be fine with totally automating step 1 as long as a human was involved
at steps 2 and 3.
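
To make the split between the automated and the human steps concrete, here is a
rough Python sketch of that flow. Everything in it is an assumption for
illustration: the names (Issue, bot_pass, human_review), the 90-day threshold,
and the idea that the bot first sends a notice and only closes on a later run
if nobody objected. It does not talk to JIRA at all.

```python
from dataclasses import dataclass
from enum import Enum

STALE_AFTER_DAYS = 90  # assumed threshold; the thread does not name one


class State(Enum):
    OPEN = "open"
    WARNED = "warned"        # step 1 done: closing notice sent
    KEPT_OPEN = "kept open"  # step 3 done: a human left it open
    CLOSED = "closed"


@dataclass
class Issue:
    key: str                 # hypothetical issue key
    days_since_update: int
    state: State = State.OPEN
    objection: bool = False  # step 2: someone replied "that matters to us"


def bot_pass(issue: Issue) -> State:
    """Automated part: warn on stale issues, close warned ones nobody objected to."""
    if issue.state is State.OPEN and issue.days_since_update >= STALE_AFTER_DAYS:
        issue.state = State.WARNED
    elif issue.state is State.WARNED and not issue.objection:
        issue.state = State.CLOSED
    # A warned issue with an objection is left alone; a human decides.
    return issue.state


def human_review(issue: Issue, keep_open: bool) -> State:
    """Steps 2 and 3: a person reads the objection and makes the call."""
    if issue.state is not State.WARNED or not issue.objection:
        raise ValueError("human review only applies to warned issues with an objection")
    issue.state = State.KEPT_OPEN if keep_open else State.CLOSED
    return issue.state
```

The point of the sketch is only that the bot never overrides an objection; once
someone speaks up, the issue stays in the warned state until a person looks at it.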


On Saturday, October 8, 2016, assaf.mendelson 
wrote:

> I don’t really have much experience with large open source projects but I
> have some experience with having lots of issues with no one handling them.
> Automation proved a good solution in my experience, but one thing that I
> found which was really important is giving people a chance to say “don’t
> close this please”.
>
> Basically, before closing you can send an email to the reporter (and
> probably people who are watching the issue) and tell them this is going to
> be closed. Allow them an option to ping back saying “don’t close this
> please” which would ping committers for input (as if there were 5+ votes as
> described by Nick).
>
> The main reason for this is that many times people find solutions and the
> issue does become stale but at other times, the issue is still important,
> it is just that no one noticed it because of the noise of other issues.
>
> Thanks,
>
> Assaf.
>
>
>
>
>
>
>
> *From:* Nicholas Chammas [via Apache Spark Developers List] [mailto:ml-node+[hidden email]]
> *Sent:* Saturday, October 08, 2016 12:42 AM
> *To:* Mendelson, Assaf
> *Subject:* Re: Improving volunteer management / JIRAs (split from Spark
> Improvement Proposals thread)
>
>
>
> I agree with Cody and others that we need some automation — or at least an
> adjusted process — to help us manage organic contributions better.
>
> The objections about automated closing being potentially abrasive are
> understood, but I wouldn’t accept that as a defeat for automation. Instead,
> it seems like a constraint we should impose on any proposed solution: Make
> sure it doesn’t turn contributors off. Rolling as we have been won’t cut
> it, and I don’t think adding committers will ever be a sufficient solution
> to this particular problem.
>
> To me, it seems like we need a way to filter out viable contributions with
> community support from other contributions when it comes to deciding that
> automated action is appropriate. Our current tooling isn’t perfect, but
> perhaps we can leverage it to create such a filter.
>
> For example, consider the following strawman proposal for how to cut down
> on the number of pending but unviable proposals, and simultaneously help
> contributors organize to promote viable proposals and get the attention of
> committers:
>
> 1.  Have a bot scan for *stale* JIRA issues and PRs—i.e. they haven’t
> been updated in 20+ days (or D+ days, if you prefer).
>
> 2.  Depending on the level of community support, either close the
> item or ping specific people for action. Specifically:
> a. If the JIRA/PR has no input from a committer and the JIRA/PR has 5+
> votes (or V+ votes), ping committers for input. (For PRs, you could count
> comments from different people, or thumbs up on the initial PR post.)
> b. If the JIRA/PR has no input from a committer and the JIRA/PR has less
> than V votes, close it with a gentle message asking the contributor to
> solicit support from either the community or a committer, and try again
> later.
> c. If the JIRA/PR has input from a committer or committers, ping them for
> an update.
>
> This is just a rough idea. The point is that when contributors have stale
> proposals that they don’t close, committers need to take action. A little
> automation to selectively bring contributions to the attention of
> committers can perhaps help them manage the backlog of stale contributions.
> The “selective” part is implemented in this strawman proposal by using JIRA
> votes as a crude proxy for when the community is interested in something,
> but it could be anything.
>
> Also, this doesn’t have to be used just to clear out stale proposals. Once
> the initial backlog is trimmed down, you could set D to 5 days and use
> this as a regular way to bring contributions to the attention of committers.
>
> I dunno if people think this is perhaps too complex, but at our scale I
> feel we need some kind of loose but automated system for funneling
> contributions through some kind of lifecycle. The status quo is just not
> that good (e.g. 474 open PRs 
> against Spark as of this moment).
>
> Nick
>
> ​
>
>
>
> On Fri, Oct 7, 2016 at 4:48 PM Cody Koeninger <[hidden email]
> > wrote:
>
> Matei asked:
>
>
> > I agree about empowering people interested here to contribute, but I'm
> wondering, do you think there are technical things that people don't want
> to work on, or is it a matter of what there's been time to do?
>
>
> It's a matter of mismanagement and misco

Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Sean Owen
I know what you mean Ted, and I think actually what's happening here is
fine, but I'll explain my theory.

There are a range of actions in the project where someone makes a decision,
from answering an email to creating a release. We already accept that only
some of these require a formal status; anyone can answer emails, but only
the PMC can bless a release, for example.

The reason commits and releases require a certain status is not _entirely_
to block most people from participating in these activities. It's in part
because things like the ASF's liability protections for releases depend on
the existence of well-defined governance models, which wouldn't quite be
compatible with anyone adding software at will.

Issue management isn't in this category, and, of course, we let anyone make
JIRAs and PRs. This causes problems occasionally but is on the whole
powerfully good. So, it seems reasonable to let people close JIRAs if, in
good faith, they have clear reason to do so. These things are reversible,
too. I also think there's a cost to not getting this triage work done, just
as there would be a cost to blocking people from creating issues.

I've reviewed the cleanup in the past 24 hours and agree with virtually
every action, so I have confidence that in practice this is a big positive.

That said, I have suggested in the past that perhaps only committers should
set "Blocker" and "Target Version", because this communicates something
specific about what will be committed and in what release, and acting on
those requires commit access. Although by the theory above we should let
anyone set these -- and actually, we do -- I have found them usually set
incorrectly, and so I argue that these fields should be treated differently
as a matter of convention.

Sean

On Sat, Oct 8, 2016 at 3:54 PM Holden Karau  wrote:

> We could certainly do that system - but given the current somewhat small
> set of active committers it's clearly not scaling very well. There are many
> developers in Spark like Hyukjin, Cody, and myself who care about specific
> areas and can verify if an issue is still present in mainline.
>
> That being said if the general view is that only committers should resolve
> JIRAs I'm happy to back off and leave that to the current committers (or we
> could try pinging them to close issues which I think are resolved instead of
> closing them myself but given how many pings I sometimes have to make to
> get an issue looked at I'm hesitant to suggest this system).
>
> I'll hold off on my JIRA review for a bit while we get this sorted :)
>
> On Sat, Oct 8, 2016 at 7:47 AM, Ted Yu  wrote:
>
> I think only committers should resolve JIRAs which were not created by
> themselves.
>
>
>


RE: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-08 Thread assaf.mendelson
I don’t really have much experience with large open source projects but I have
some experience with having lots of issues with no one handling them.
Automation proved a good solution in my experience, but one thing that I found
which was really important is giving people a chance to say “don’t close this
please”.

Basically, before closing you can send an email to the reporter (and probably
people who are watching the issue) and tell them this is going to be closed.
Allow them an option to ping back saying “don’t close this please” which would
ping committers for input (as if there were 5+ votes as described by Nick).

The main reason for this is that many times people find solutions and the issue
does become stale but at other times, the issue is still important, it is just
that no one noticed it because of the noise of other issues.

Thanks,
Assaf.



From: Nicholas Chammas [via Apache Spark Developers List] 
[mailto:ml-node+s1001551n19310...@n3.nabble.com]
Sent: Saturday, October 08, 2016 12:42 AM
To: Mendelson, Assaf
Subject: Re: Improving volunteer management / JIRAs (split from Spark 
Improvement Proposals thread)


I agree with Cody and others that we need some automation — or at least an 
adjusted process — to help us manage organic contributions better.

The objections about automated closing being potentially abrasive are 
understood, but I wouldn’t accept that as a defeat for automation. Instead, it 
seems like a constraint we should impose on any proposed solution: Make sure it 
doesn’t turn contributors off. Rolling as we have been won’t cut it, and I 
don’t think adding committers will ever be a sufficient solution to this 
particular problem.

To me, it seems like we need a way to filter out viable contributions with 
community support from other contributions when it comes to deciding that 
automated action is appropriate. Our current tooling isn’t perfect, but perhaps 
we can leverage it to create such a filter.

For example, consider the following strawman proposal for how to cut down on 
the number of pending but unviable proposals, and simultaneously help 
contributors organize to promote viable proposals and get the attention of 
committers:
1.  Have a bot scan for stale JIRA issues and PRs—i.e. they haven’t been 
updated in 20+ days (or D+ days, if you prefer).
2.  Depending on the level of community support, either close the item or 
ping specific people for action. Specifically:
a. If the JIRA/PR has no input from a committer and the JIRA/PR has 5+ votes 
(or V+ votes), ping committers for input. (For PRs, you could count comments 
from different people, or thumbs up on the initial PR post.)
b. If the JIRA/PR has no input from a committer and the JIRA/PR has less than V 
votes, close it with a gentle message asking the contributor to solicit support 
from either the community or a committer, and try again later.
c. If the JIRA/PR has input from a committer or committers, ping them for an 
update.
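
The strawman above boils down to a small decision function. A minimal Python
sketch follows, assuming a made-up Item record and made-up action names; D and V
are the thresholds from the proposal (20 days, 5 votes), and nothing here calls
JIRA or GitHub. How votes are counted for PRs is left to the caller.

```python
from dataclasses import dataclass

D = 20  # days without an update before an item counts as stale
V = 5   # votes that count as community support


@dataclass
class Item:
    key: str               # hypothetical JIRA or PR identifier
    days_since_update: int
    votes: int             # JIRA votes, or distinct commenters / thumbs-up on a PR
    committer_input: bool  # has any committer weighed in?


def triage(item: Item, d: int = D, v: int = V) -> str:
    """Return the action the bot would take for one item."""
    if item.days_since_update < d:
        return "skip"                        # step 1: not stale yet
    if item.committer_input:
        return "ping_involved_committers"    # step 2c
    if item.votes >= v:
        return "ping_committers_for_input"   # step 2a
    return "close_with_gentle_message"       # step 2b
```

Once the backlog is trimmed, the same function could be run with d=5 to use it
as a regular reminder, as suggested further down.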

This is just a rough idea. The point is that when contributors have stale 
proposals that they don’t close, committers need to take action. A little 
automation to selectively bring contributions to the attention of committers 
can perhaps help them manage the backlog of stale contributions. The 
“selective” part is implemented in this strawman proposal by using JIRA votes 
as a crude proxy for when the community is interested in something, but it 
could be anything.

Also, this doesn’t have to be used just to clear out stale proposals. Once the 
initial backlog is trimmed down, you could set D to 5 days and use this as a 
regular way to bring contributions to the attention of committers.

I dunno if people think this is perhaps too complex, but at our scale I feel we 
need some kind of loose but automated system for funneling contributions 
through some kind of lifecycle. The status quo is just not that good (e.g. 474 
open PRs against Spark as of this 
moment).

Nick
​

On Fri, Oct 7, 2016 at 4:48 PM Cody Koeninger <[hidden 
email]> wrote:
Matei asked:


> I agree about empowering people interested here to contribute, but I'm 
> wondering, do you think there are technical things that people don't want to 
> work on, or is it a matter of what there's been time to do?


It's a matter of mismanagement and miscommunication.

The structured streaming kafka jira sat with multiple unanswered
requests for someone who was a committer to communicate whether they
were working on it and what the plan was.  I could have done that
implementation and had it in users' hands months ago.  I didn't
pre-emptively do it because I didn't want to then have to argue with
committers about why my code did or did not meet their uncommunicated
expectations.


I don't want to re-hash that particular circumstance, I just want to
make sure it never happens again.


Hopefully the SIP thread results in clearer expectations, but there
are still some id

Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Holden Karau
We could certainly do that system - but given the current somewhat small
set of active committers it's clearly not scaling very well. There are many
developers in Spark like Hyukjin, Cody, and myself who care about specific
areas and can verify if an issue is still present in mainline.

That being said if the general view is that only committers should resolve
JIRAs I'm happy to back off and leave that to the current committers (or we
could try pinging them to close issues which I think are resolved instead of
closing them myself but given how many pings I sometimes have to make to
get an issue looked at I'm hesitant to suggest this system).

I'll hold off on my JIRA review for a bit while we get this sorted :)

On Sat, Oct 8, 2016 at 7:47 AM, Ted Yu  wrote:

> I think only committers should resolve JIRAs which were not created by
> themselves.
>
> On Oct 8, 2016, at 6:53 AM, Hyukjin Kwon  wrote:
>
> I am uncertain too. It'd be great if these are documented too.
>
> FWIW, in my case, I privately asked and told Sean first that I am going to
> look through the JIRAs
> and resolve some via the suggested conventions from Sean.
> (Definitely all blame should be on me if I have done something terribly
> wrong).
>
>
>
> 2016-10-08 22:37 GMT+09:00 Cody Koeninger :
>
>> That makes sense, thanks.
>>
>> One thing I've never been clear on is who should be allowed to resolve
>> Jiras.  Can I go clean up the backlog of Kafka Jiras that weren't created
>> by me?
>>
>> If there's an informal policy here, can we update the wiki to reflect
>> it?  Maybe it's there already, but I didn't see it last time I looked.
>>
>> On Oct 8, 2016 4:10 AM, "Sean Owen"  wrote:
>>
>> That flood of emails means several people (Xiao, Holden mostly AFAICT)
>> have been updating the status of old JIRAs. Thank you, I think that really
>> does help.
>>
>> I have a suggested set of conventions I've been using, just to bring some
>> order to the resolutions. It helps because JIRA functions as a huge archive
>> of decisions and the more accurately we can record that the better. What do
>> people think of this?
>>
>> - Resolve as Fixed if there's a change you can point to that resolved the
>> issue
>> - If the issue is a proper subset of another issue, mark it a Duplicate
>> of that issue (rather than the other way around)
>> - If it's probably resolved, but not obvious what fixed it or when, then
>> Cannot Reproduce or Not a Problem
>> - Obsolete issue? Not a Problem
>> - If it's a coherent issue but does not seem like there is support or
>> interest in acting on it, then Won't Fix
>> - If the issue doesn't make sense (non-Spark issue, etc) then Invalid
>> - I tend to mark Umbrellas as "Done" when done if they're just containers
>> - Try to set Fix version
>> - Try to set Assignee to the person who most contributed to the
>> resolution. Usually the person who opened the PR. Strong preference for
>> ties going to the more 'junior' contributor
>>
>> The only ones I think are sort of important are getting the Duplicate
>> pointers right, and possibly making sure that Fixed issues have a clear
>> path to finding what change fixed it and when. The rest doesn't matter much.
>>
>>
>>
>


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau


Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Ted Yu
I think only committers should resolve JIRAs which were not created by
themselves.

> On Oct 8, 2016, at 6:53 AM, Hyukjin Kwon  wrote:
> 
> I am uncertain too. It'd be great if these are documented too.
> 
> FWIW, in my case, I privately asked and told Sean first that I am going to
> look through the JIRAs
> and resolve some via the suggested conventions from Sean.
> (Definitely all blame should be on me if I have done something terribly
> wrong).
> 
> 
> 
> 2016-10-08 22:37 GMT+09:00 Cody Koeninger :
>> That makes sense, thanks.
>> 
>> One thing I've never been clear on is who should be allowed to resolve 
>> Jiras.  Can I go clean up the backlog of Kafka Jiras that weren't created by 
>> me?
>> 
>> If there's an informal policy here, can we update the wiki to reflect it?  
>> Maybe it's there already, but I didn't see it last time I looked.
>> 
>> 
>> On Oct 8, 2016 4:10 AM, "Sean Owen"  wrote:
>> That flood of emails means several people (Xiao, Holden mostly AFAICT) have 
>> been updating the status of old JIRAs. Thank you, I think that really does 
>> help. 
>> 
>> I have a suggested set of conventions I've been using, just to bring some 
>> order to the resolutions. It helps because JIRA functions as a huge archive 
>> of decisions and the more accurately we can record that the better. What do 
>> people think of this?
>> 
>> - Resolve as Fixed if there's a change you can point to that resolved the 
>> issue
>> - If the issue is a proper subset of another issue, mark it a Duplicate of 
>> that issue (rather than the other way around)
>> - If it's probably resolved, but not obvious what fixed it or when, then 
>> Cannot Reproduce or Not a Problem
>> - Obsolete issue? Not a Problem
>> - If it's a coherent issue but does not seem like there is support or 
>> interest in acting on it, then Won't Fix
>> - If the issue doesn't make sense (non-Spark issue, etc) then Invalid
>> - I tend to mark Umbrellas as "Done" when done if they're just containers
>> - Try to set Fix version
>> - Try to set Assignee to the person who most contributed to the resolution. 
>> Usually the person who opened the PR. Strong preference for ties going to 
>> the more 'junior' contributor
>> 
>> The only ones I think are sort of important are getting the Duplicate 
>> pointers right, and possibly making sure that Fixed issues have a clear path 
>> to finding what change fixed it and when. The rest doesn't matter much.
> 


Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Hyukjin Kwon
I am uncertain too. It'd be great if these are documented too.

FWIW, in my case, I privately asked and told Sean first that I am going to
look through the JIRAs
and resolve some via the suggested conventions from Sean.
(Definitely all blame should be on me if I have done something terribly
wrong).



2016-10-08 22:37 GMT+09:00 Cody Koeninger :

> That makes sense, thanks.
>
> One thing I've never been clear on is who should be allowed to resolve
> Jiras.  Can I go clean up the backlog of Kafka Jiras that weren't created
> by me?
>
> If there's an informal policy here, can we update the wiki to reflect it?
> Maybe it's there already, but I didn't see it last time I looked.
>
> On Oct 8, 2016 4:10 AM, "Sean Owen"  wrote:
>
> That flood of emails means several people (Xiao, Holden mostly AFAICT)
> have been updating the status of old JIRAs. Thank you, I think that really
> does help.
>
> I have a suggested set of conventions I've been using, just to bring some
> order to the resolutions. It helps because JIRA functions as a huge archive
> of decisions and the more accurately we can record that the better. What do
> people think of this?
>
> - Resolve as Fixed if there's a change you can point to that resolved the
> issue
> - If the issue is a proper subset of another issue, mark it a Duplicate of
> that issue (rather than the other way around)
> - If it's probably resolved, but not obvious what fixed it or when, then
> Cannot Reproduce or Not a Problem
> - Obsolete issue? Not a Problem
> - If it's a coherent issue but does not seem like there is support or
> interest in acting on it, then Won't Fix
> - If the issue doesn't make sense (non-Spark issue, etc) then Invalid
> - I tend to mark Umbrellas as "Done" when done if they're just containers
> - Try to set Fix version
> - Try to set Assignee to the person who most contributed to the
> resolution. Usually the person who opened the PR. Strong preference for
> ties going to the more 'junior' contributor
>
> The only ones I think are sort of important are getting the Duplicate
> pointers right, and possibly making sure that Fixed issues have a clear
> path to finding what change fixed it and when. The rest doesn't matter much.
>
>
>


Re: PSA: JIRA resolutions and meanings

2016-10-08 Thread Cody Koeninger
That makes sense, thanks.

One thing I've never been clear on is who should be allowed to resolve
Jiras.  Can I go clean up the backlog of Kafka Jiras that weren't created
by me?

If there's an informal policy here, can we update the wiki to reflect it?
Maybe it's there already, but I didn't see it last time I looked.

On Oct 8, 2016 4:10 AM, "Sean Owen"  wrote:

That flood of emails means several people (Xiao, Holden mostly AFAICT) have
been updating the status of old JIRAs. Thank you, I think that really does
help.

I have a suggested set of conventions I've been using, just to bring some
order to the resolutions. It helps because JIRA functions as a huge archive
of decisions and the more accurately we can record that the better. What do
people think of this?

- Resolve as Fixed if there's a change you can point to that resolved the
issue
- If the issue is a proper subset of another issue, mark it a Duplicate of
that issue (rather than the other way around)
- If it's probably resolved, but not obvious what fixed it or when, then
Cannot Reproduce or Not a Problem
- Obsolete issue? Not a Problem
- If it's a coherent issue but does not seem like there is support or
interest in acting on it, then Won't Fix
- If the issue doesn't make sense (non-Spark issue, etc) then Invalid
- I tend to mark Umbrellas as "Done" when done if they're just containers
- Try to set Fix version
- Try to set Assignee to the person who most contributed to the resolution.
Usually the person who opened the PR. Strong preference for ties going to
the more 'junior' contributor

The only ones I think are sort of important are getting the Duplicate
pointers right, and possibly making sure that Fixed issues have a clear
path to finding what change fixed it and when. The rest doesn't matter much.


PSA: JIRA resolutions and meanings

2016-10-08 Thread Sean Owen
That flood of emails means several people (Xiao, Holden mostly AFAICT) have
been updating the status of old JIRAs. Thank you, I think that really does
help.

I have a suggested set of conventions I've been using, just to bring some
order to the resolutions. It helps because JIRA functions as a huge archive
of decisions and the more accurately we can record that the better. What do
people think of this?

- Resolve as Fixed if there's a change you can point to that resolved the
issue
- If the issue is a proper subset of another issue, mark it a Duplicate of
that issue (rather than the other way around)
- If it's probably resolved, but not obvious what fixed it or when, then
Cannot Reproduce or Not a Problem
- Obsolete issue? Not a Problem
- If it's a coherent issue but does not seem like there is support or
interest in acting on it, then Won't Fix
- If the issue doesn't make sense (non-Spark issue, etc) then Invalid
- I tend to mark Umbrellas as "Done" when done if they're just containers
- Try to set Fix version
- Try to set Assignee to the person who most contributed to the resolution.
Usually the person who opened the PR. Strong preference for ties going to
the more 'junior' contributor

The only ones I think are sort of important are getting the Duplicate
pointers right, and possibly making sure that Fixed issues have a clear
path to finding what change fixed it and when. The rest doesn't matter much.
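
For anyone who prefers to read the conventions above as a checklist, here is a
rough Python sketch. The function name, its parameters, and the order in which
the checks run are all assumptions made for illustration; the list itself does
not rank the rules, and the issue key in the usage line is a placeholder.

```python
from typing import Optional


def suggest_resolution(
    *,
    makes_sense: bool = True,            # False for non-Spark issues etc.
    subset_of: Optional[str] = None,     # key of the issue this is a proper subset of
    fixing_change_known: bool = False,   # there is a change you can point to
    probably_resolved: bool = False,     # seems fixed, but unclear by what or when
    obsolete: bool = False,
    has_support: bool = True,            # is anyone interested in acting on it?
) -> Optional[str]:
    """Suggest a resolution, or None to leave the issue open."""
    if not makes_sense:
        return "Invalid"
    if subset_of is not None:
        # Point the subset at the broader issue, not the other way around.
        return f"Duplicate of {subset_of}"
    if fixing_change_known:
        return "Fixed"  # also try to set Fix version and Assignee
    if probably_resolved:
        return "Cannot Reproduce"  # "Not a Problem" is also acceptable here
    if obsolete:
        return "Not a Problem"
    if not has_support:
        return "Won't Fix"
    return None
```

For example, suggest_resolution(subset_of="EXAMPLE-2") gives
"Duplicate of EXAMPLE-2", and calling it with no arguments gives None, meaning
the issue stays open.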