Re: Improving governance / committers (split from Spark Improvement Proposals thread)

Cody Koeninger Sat, 08 Oct 2016 17:04:23 -0700

It's not about technical design disagreement as to matters of taste,
it's about familiarity with the domain.  To make an analogy, it's as
if a committer in MLlib was firmly intent on, I dunno, treating a
collection of categorical variables as if it were an ordered range of
continuous variables.  It's just wrong.  That kind of thing, to a
greater or lesser degree, has been going on in the Kafka modules
for years.


On Sat, Oct 8, 2016 at 4:11 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> This makes a lot of sense; just to comment on a few things:
>
>> - More committers
>> Just looking at the ratio of committers to open tickets, or committers
>> to contributors, I don't think you have enough human power.
>> I realize this is a touchy issue.  I don't have a dog in this fight,
>> because I'm not on either coast nor in a big company that views
>> committership as a political thing.  I just think you need more people
>> to do the work, and more diversity of viewpoint.
>> It's unfortunate that the Apache governance process involves giving
>> someone all the keys or none of the keys, but until someone really
>> starts screwing up, I think it's better to err on the side of
>> accepting hard-working people.
>
> This is something the PMC is actively discussing. Historically, we've added 
> committers when people contributed a new module or feature, basically to the 
> point where other developers are asking them to review changes in that area 
> (https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-BecomingaCommitter).
> For example, we added the original authors of GraphX when we merged in 
> GraphX, the authors of new ML algorithms, etc. However, there's a good 
> argument that some areas are simply not covered well now and we should add 
> people there. As the project has grown, there are also more people who 
> focus on smaller fixes and are nonetheless contributing a lot.
>
>> - Each major area of the code needs at least one person who cares
>> about it that is empowered with a vote, otherwise decisions get made
>> that don't make technical sense.
>> I don't know if anyone with a vote is shepherding GraphX (or maybe
>> it's just dead), the Mesos relationship has always been weird, no one
>> with a vote really groks Kafka.
>> marmbrus and zsxwing are getting there quickly on the Kafka side, and
>> I appreciate it, but it's been bad for a while.
>> Because I don't have any political power, my response to seeing things
>> that I know are technically dangerous has been to yell really loud
>> until someone listens, which sucks for everyone involved.
>> I already apologized to Michael privately; Ryan, I'm sorry, it's not about 
>> you.
>> This seems pretty straightforward to fix, if politically awkward:
>> those people exist, just give them a vote.
>> Failing that, listen the first or second time they say something, not
>> the third or fourth, and if it doesn't make sense, ask.
>
> Just as a note here -- it's true that some areas are not super well covered, 
> but I also hope to avoid a situation where people have to yell to be listened 
> to. I can't say anything about *all* technical discussions we've ever had, 
> but historically, people have been able to comment on the design of many 
> things without yelling. This is actually important because a culture of 
> having to yell can drive away contributors. So it's awesome that you yelled 
> about the Kafka source stuff, but at the same time, hopefully we make these 
> types of things work without yelling. This would be a problem even if there 
> were committers with more expertise in each area -- what if someone disagrees 
> with the committers?
>
> Matei
>

