Re: Should we let everyone set Assignee?

2015-04-22 Thread Vinod Kumar Vavilapalli
Actually what this community got away with is pretty much an anti-pattern 
compared to every other Apache project I have seen. And may I say in a not so 
Apache way.

Waiting for a committer to assign a patch to someone leaves it as a privilege 
to a committer. Not alluding to anything fishy in practice, but this also 
leaves a lot of open ground for self-interest. Having committers define notions of 
good fit / level of experience does not work; it is highly subjective and leads to 
group control.

In terms of semantics, here is what most other projects (dare I say every 
Apache project?) that I have seen do
 - A new contributor comes in who is not yet added to the JIRA project. He/she 
requests one of the project's JIRA admins to add him/her.
 - After that, he or she is free to assign tickets to themselves.
 - What this means
-- Assigning a ticket to oneself is a signal to the rest of the community 
that he/she is actively working on the said patch.
-- If multiple contributors want to work on the same patch, it needs to be 
resolved amicably through open communication. On JIRA, or on mailing lists. Not 
by the whim of a committer.
 - Common issues
-- Land grabbing: Other contributors can nudge him/her in case of 
inactivity and take them over. Again, amicably instead of a committer making 
subjective decisions.
-- Progress stalling: One contributor assigns the ticket to himself/herself 
and is actively debating, but with no real code/docs contribution or any real 
intention of making progress. Here workable, reviewable code usually wins.

Assigning patches is not a privilege. Contributors at Apache are a bunch of 
volunteers, the PMC should let volunteers contribute as they see fit. We do not 
assign work at Apache.

+Vinod

On Apr 22, 2015, at 12:32 PM, Patrick Wendell  wrote:

> One overarching issue is that it's pretty unclear what "Assigned to
> X" in JIRA means from a process perspective. Personally I actually
> feel it's better for this to be more historical - i.e. who ended up
> submitting a patch for this feature that was merged - rather than
> creating an exclusive reservation for a particular user to work on
> something.
> 
> If an issue is "assigned" to person X, but some other person Y submits
> a great patch for it, I think we have some obligation to Spark users
> and to the community to merge the better patch. So the idea of
> reserving the right to add a feature, it just seems overall off to me.
> IMO, it's fine if multiple people want to submit competing patches for
> something, provided everyone comments on JIRA saying they are
> intending to submit a patch, and everyone understands there is
> duplicate effort. So commenting with an intention to submit a patch,
> IMO seems like the healthiest workflow since it is non exclusive.
> 
> To me the main benefit of "assigning" something ahead of time is if
> you have a committer that really wants to see someone specific work on
> a patch, it just acts as a strong signal that there is someone
> endorsed to work on that patch. That doesn't mean no one else can
> submit a patch, but it is IMO more of a warning that there may be
> existing work which is likely to be high quality, to avoid duplicated
> effort.
> 
> When it was really easy to assign features to themselves, I saw a lot
> of anti-patterns in the community that seemed unhealthy, specifically:
> 
> - It was really unclear what it means semantically if someone is
> assigned to a JIRA.
> - People assign JIRAs to themselves that aren't a good fit, given the
> author's level of experience.
> - People expect if they assign JIRAs to themselves that others won't
> submit patches, and become upset if they do.
> - People are discouraged from working on a patch because someone else
> was officially assigned.
> 
> - Patrick
> 
> On Wed, Apr 22, 2015 at 11:13 AM, Sean Owen  wrote:
>> Anecdotally, there are a number of people asking to set the Assignee
>> field. This is currently restricted to Committers in JIRA. I know the
>> logic was to prevent people from Assigning a JIRA and then leaving it;
>> it also matters a bit for questions of "credit".
>> 
>> Still I wonder if it's best to just let people go ahead and set it, as
>> the lesser "evil". People can already do a lot like resolve JIRAs and
>> set shepherd and critical priority and all that.
>> 
>> I think the intent was to let "Developers" set this, but maybe due to
>> an error, that's not how the current JIRA permission is implemented.
>> 
>> I ask because I'm about to ping INFRA to update our scheme.
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>> 


---

Re: Should we let everyone set Assignee?

2015-04-22 Thread Vinod Kumar Vavilapalli

If it is true what you say, what is the reason for this 
committer-only-assigns-JIRA tickets policy? If anyone can send a pull request, 
anyone should be able to assign tickets to himself/herself too.

+Vinod

On Apr 22, 2015, at 1:18 PM, Reynold Xin <r...@databricks.com> wrote:

Woh hold on a minute.

Spark has been among the projects that are the most welcoming to new 
contributors. And thanks to this, the sheer number of activities in Spark is 
much larger than other projects, and our workflow has to accommodate this fact.

In practice, people just create pull requests on GitHub, which is a newer & 
friendlier & better model given the constraints. We even have tools that 
automatically tag a ticket with a link to its pull request.


On Wed, Apr 22, 2015 at 1:11 PM, Vinod Kumar Vavilapalli 
<vino...@hortonworks.com> wrote:
Actually what this community got away with is pretty much an anti-pattern 
compared to every other Apache project I have seen. And may I say in a not so 
Apache way.

Waiting for a committer to assign a patch to someone leaves it as a privilege 
to a committer. Not alluding to anything fishy in practice, but this also 
leaves a lot of open ground for self-interest. Having committers define notions of 
good fit / level of experience does not work; it is highly subjective and leads to 
group control.

In terms of semantics, here is what most other projects (dare I say every 
Apache project?) that I have seen do
 - A new contributor comes in who is not yet added to the JIRA project. He/she 
requests one of the project's JIRA admins to add him/her.
 - After that, he or she is free to assign tickets to themselves.
 - What this means
-- Assigning a ticket to oneself is a signal to the rest of the community 
that he/she is actively working on the said patch.
-- If multiple contributors want to work on the same patch, it needs to be 
resolved amicably through open communication. On JIRA, or on mailing lists. Not 
by the whim of a committer.
 - Common issues
-- Land grabbing: Other contributors can nudge him/her in case of 
inactivity and take them over. Again, amicably instead of a committer making 
subjective decisions.
-- Progress stalling: One contributor assigns the ticket to himself/herself 
and is actively debating, but with no real code/docs contribution or any real 
intention of making progress. Here workable, reviewable code usually wins.

Assigning patches is not a privilege. Contributors at Apache are a bunch of 
volunteers, the PMC should let volunteers contribute as they see fit. We do not 
assign work at Apache.

+Vinod

On Apr 22, 2015, at 12:32 PM, Patrick Wendell <pwend...@gmail.com> wrote:

> One overarching issue is that it's pretty unclear what "Assigned to
> X" in JIRA means from a process perspective. Personally I actually
> feel it's better for this to be more historical - i.e. who ended up
> submitting a patch for this feature that was merged - rather than
> creating an exclusive reservation for a particular user to work on
> something.
>
> If an issue is "assigned" to person X, but some other person Y submits
> a great patch for it, I think we have some obligation to Spark users
> and to the community to merge the better patch. So the idea of
> reserving the right to add a feature, it just seems overall off to me.
> IMO, it's fine if multiple people want to submit competing patches for
> something, provided everyone comments on JIRA saying they are
> intending to submit a patch, and everyone understands there is
> duplicate effort. So commenting with an intention to submit a patch,
> IMO seems like the healthiest workflow since it is non exclusive.
>
> To me the main benefit of "assigning" something ahead of time is if
> you have a committer that really wants to see someone specific work on
> a patch, it just acts as a strong signal that there is someone
> endorsed to work on that patch. That doesn't mean no one else can
> submit a patch, but it is IMO more of a warning that there may be
> existing work which is likely to be high quality, to avoid duplicated
> effort.
>
> When it was really easy to assign features to themselves, I saw a lot
> of anti-patterns in the community that seemed unhealthy, specifically:
>
> - It was really unclear what it means semantically if someone is
> assigned to a JIRA.
> - People assign JIRAs to themselves that aren't a good fit, given the
> author's level of experience.
> - People expect if they assign JIRAs to themselves that others won't
> submit patches, and become upset if they do.
> - People are discouraged from working on a patch because someone else
> was officially assigned.
>
> - Patrick
>
> On Wed, Apr 22, 2015 at 11:13 AM, Sean Owen <so...@cloudera.com>

Re: Should we let everyone set Assignee?

2015-04-22 Thread Vinod Kumar Vavilapalli

I watch these lists, so I have a fair understanding of how things work around 
here. I don't give direct input in the day-to-day activities though, like Greg 
Stein on the other thread, so I can understand if it looks like it came from up 
above. Apache Members come around and give opinions from time to time; you don't 
need to take it as somebody up above forcing things down.

Thanks
+Vinod

On Apr 22, 2015, at 2:33 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

I want to take this opportunity to call out the approach to communication you 
took here.

As a random contributor to Spark and active participant on this list, my 
reaction when I read your email was this:

  *   You do not know how the Spark community actually works.
  *   You read a thread that contains some trigger phrases.
  *   You wrote a lengthy response as a knee-jerk reaction.

I’m not trying to mock, but I want to be direct and honest about how you came 
off in this thread to me and probably many others.

Why not ask questions first—many questions? Why not make doubly sure that you 
understand the situation correctly before responding?

In many ways this is much like filing a bug report. “I’m seeing this. It seems 
wrong to me. Is this expected?” I think we all know from experience that this 
kind of bug report is polite and will likely lead to a productive discussion. 
On the other hand: “You’re returning a -1 here? This is obviously wrong! And, 
boy, lemme tell you how wrong you are!!!” No-one likes to deal with bug reports 
like this. More importantly, they get in the way of fixing the actual problem, 
if there is one.

This is not about the Apache Way or not. It’s about basic etiquette and 
effective communication.

I understand that there are legitimate potential concerns here, and it’s 
important that, as an Apache project, Spark work according to Apache 
principles. But when some person who has never participated on this list pops 
up out of nowhere with a lengthy lecture on the Apache Way and whatnot, I have 
to say that that is not an effective way to communicate. Pretty much the same 
thing happened with Greg Stein on an earlier thread some months ago about 
designating maintainers for components.

The concerns are legitimate, I’m sure, and we want to keep Spark in line with 
the Apache Way. And certainly, there have been many times when a project veered 
off course and needed to be corrected.

But when we want to make things right, I hope we can do it in a way that 
respectfully and tactfully engages the community. These “lectures delivered 
from above” — which is how they come off — are not helpful.

Nick


Re: Should we let everyone set Assignee?

2015-04-22 Thread Vinod Kumar Vavilapalli
d the response I received
>>> was generally "go ahead and give it a shot, just understand that this is
>>> sensitive code so we may end up modifying the PR substantially." Honestly,
>>> that seems fine, and in general, I think it's completely fair to go with
>>> the PR model - e.g. If a JIRA has an open PR then it's an active effort,
>>> otherwise it's fair game unless otherwise stated. At the end of the day,
>>> it's about moving the project forward and the only way to do that is to
>>> have actual code in the pipes - speculation and intent don't really help,
>>> and there's nothing preventing an interested party from submitting a PR
>>> against an issue.
>>> 
>>> Thank you,
>>> Ilya Ganelin
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 4/22/15, 1:25 PM, "Mark Hamstra"  wrote:
>>> 
>>>> Agreed.  The Spark project and community that Vinod describes do not
>>>> resemble the ones with which I am familiar.
>>>> 
>>>> On Wed, Apr 22, 2015 at 1:20 PM, Patrick Wendell 
>>>> wrote:
>>>> 
>>>>> Hi Vinod,
>>>>> 
>>>>> Thanks for your thoughts - However, I do not agree with your sentiment
>>>>> and implications. Spark is broadly quite an inclusive project and we
>>>>> spend a lot of effort culturally to help make newcomers feel welcome.
>>>>> 
>>>>> - Patrick
>>>>> 
>>>>> On Wed, Apr 22, 2015 at 1:11 PM, Vinod Kumar Vavilapalli
>>>>>  wrote:
>>>>>> Actually what this community got away with is pretty much an
>>>>> anti-pattern compared to every other Apache project I have seen. And
>>>>> may I
>>>>> say in a not so Apache way.
>>>>>> 
>>>>>> Waiting for a committer to assign a patch to someone leaves it as a
>>>>> privilege to a committer. Not alluding to anything fishy in practice,
>>>>> but
>>>>> this also leaves a lot of open ground for self-interest. Having committers
>>>>> define notions of good fit / level of experience does not work; it is highly
>>>>> subjective and leads to group control.
>>>>>> 
>>>>>> In terms of semantics, here is what most other projects (dare I say
>>>>> every Apache project?) that I have seen do
>>>>>> - A new contributor comes in who is not yet added to the JIRA
>>>>> project.
>>>>> He/she requests one of the project's JIRA admins to add him/her.
>>>>>> - After that, he or she is free to assign tickets to themselves.
>>>>>> - What this means
>>>>>>-- Assigning a ticket to oneself is a signal to the rest of the
>>>>> community that he/she is actively working on the said patch.
>>>>>>-- If multiple contributors want to work on the same patch, it
>>>>> needs
>>>>> to be resolved amicably through open communication. On JIRA, or on mailing
>>>>> lists. Not by the whim of a committer.
>>>>>> - Common issues
>>>>>>-- Land grabbing: Other contributors can nudge him/her in case of
>>>>> inactivity and take them over. Again, amicably instead of a committer
>>>>> making subjective decisions.
>>>>>>-- Progress stalling: One contributor assigns the ticket to
>>>>> himself/herself and is actively debating, but with no real code/docs
>>>>> contribution or any real intention of making progress. Here
>>>>> workable, reviewable code usually wins.
>>>>>> 
>>>>>> Assigning patches is not a privilege. Contributors at Apache are a
>>>>> bunch
>>>>> of volunteers, the PMC should let volunteers contribute as they see
>>>>> fit. We
>>>>> do not assign work at Apache.
>>>>>> 
>>>>>> +Vinod
>>>>>> 
>>>>>> On Apr 22, 2015, at 12:32 PM, Patrick Wendell 
>>>>> wrote:
>>>>>> 
>>>>>>> One overarching issue is that it's pretty unclear what "Assigned to
>>>>>>> X" in JIRA means from a process perspective. Personally I actually
>>>>>>> feel it's better for this to be more historical - i.e. who ended up
>>>>>>> submitting a patch for this feature that was merged - rather 

Re: [discuss] ending support for Java 6?

2015-04-30 Thread Vinod Kumar Vavilapalli
FYI, after enough consideration, we the Hadoop community dropped support for 
JDK 6 starting with the Apache Hadoop 2.7.x release.

Thanks
+Vinod

On Apr 30, 2015, at 12:02 PM, Reynold Xin  wrote:

> This has been discussed a few times in the past, but now that Oracle ended
> support for Java 6 over a year ago, I wonder if we should just drop Java 6
> support.
> 
> There is one outstanding issue Tom has brought to my attention: PySpark on
> YARN doesn't work well with Java 7/8, but we have an outstanding pull
> request to fix that.
> 
> https://issues.apache.org/jira/browse/SPARK-6869
> https://issues.apache.org/jira/browse/SPARK-1920





Re: [VOTE] Designating maintainers for some Spark components

2014-11-06 Thread Vinod Kumar Vavilapalli
> With the maintainer model, the process is as follows:
> 
> - Any committer could review the patch and merge it, but they would need to 
> forward it to me (or another core API maintainer) to make sure we also approve
> - At any point during this process, I could come in and -1 it, or give 
> feedback
> - In addition, any other committer beyond me is still allowed to -1 this patch
> 
> The only change in this model is that committers are responsible to forward 
> patches in these areas to certain other committers. If every committer had 
> perfect oversight of the project, they could have also seen every patch to 
> their component on their own, but this list ensures that they see it even if 
> they somehow overlooked it.


Having done the job of playing an informal 'maintainer' of a project myself, 
this is what I think you really need:

The so called 'maintainers' do one of the below
 - Actively poll the lists and watch over contributions. And follow what is 
repeated often around here: Trust but verify.
 - Set up automated mechanisms to send all bug-tracker updates of a specific 
component to a list that people can subscribe to

And/or
 - Individual contributors send review requests to unofficial 'maintainers' 
over dev-lists or through tools. Like many projects do with review boards and 
other tools.

Note that none of the above is a required step. It must not be, that's the 
point. But once set as a convention, they will all help you address your 
concerns with project scalability.

Anything else that you add is bestowing privileges on a select few and forming 
dictatorships. And contrary to what the proposal claims, this is neither 
scalable nor conforming to Apache governance rules.

+Vinod

