I am glad that I was not the only one thinking this.
I also agree with Holden, Sean and Cody. Everything I wanted to say has
already been said.



2016-10-08 1:16 GMT+09:00 Holden Karau <hol...@pigscanfly.ca>:

> First off, thanks Cody for taking the time to put together these proposals
> - I think it has kicked off some wonderful discussion.
>
> I think dismissing people's complaints about Spark as largely trolling does
> us a disservice. It’s important for us to recognize our own shortcomings -
> otherwise we are blind to the weak spots where we need to improve and
> instead focus on new features. Parts of the Python community seem to be
> actively looking for alternatives, and I’d obviously like Spark to continue
> to be the place where we come together and collaborate from different
> languages.
>
> I’d be more than happy to do a review of the outstanding Python PRs (I’ve
> been keeping on top of the new ones but largely haven’t looked at the older
> ones) and if there is a committer (maybe Davies or Sean?) who would be able
> to help out with merging them once they are ready that would be awesome.
> I’m at PyData DC this weekend but I’ll also start going through some of the
> older Python JIRAs and seeing if they are still relevant, already fixed, or
> something we are unlikely to be interested in bringing into Spark.
>
> I’m giving a talk later on this month on how to get started contributing
> to Apache Spark at OSCON London, and when I’ve given this talk before I’ve
> had to include a fair number of warnings about the challenges that can face
> a new contributor. I’d love to be able to drop those in future versions :)
>
> P.S.
>
> As one of the non-committers who has been working on Spark for several
> years (see http://bit.ly/hkspmg ) I have strong feelings around the
> current process being used for committers - but since I’m not on the PMC
> (catch-22 style) it's difficult to have any visibility into the process, so
> someone who does will have to weigh in on that :)
>
>
> On Fri, Oct 7, 2016 at 8:00 AM, Cody Koeninger <c...@koeninger.org> wrote:
>
>> Sean, that was very eloquently put, and I 100% agree.  If I ever meet
>> you in person, I'll buy you multiple rounds of beverages of your
>> choice ;)
>> This is probably reiterating some of what you said in a less clear
>> manner, but I'll throw more of my 2 cents in.
>>
>> - Design.
>> Yes, design by committee doesn't work.  The best designs are when a
>> person who understands the problem builds something that works for
>> them, shares with others, and most importantly iterates when it
>> doesn't work for others.  This iteration only works if you're willing
>> to change interfaces, but committer and user goals are not aligned
>> here.  Users want something that is clearly documented and helps them
>> get their job done.  Committers (not all) want to minimize interface
>> change, even at the expense of users being able to do their jobs.  In
>> this situation, it is critical that you understand early what users
>> need to be able to do.  This is what the improvement proposal process
>> should focus on: Goals, non-goals, possible solutions, rejected
>> solutions.  Not class-level design.  Most importantly, it needs a
>> clear, unambiguous outcome that is visible to the public.
>>
>> - Trolling
>> It's not just trolling.  Event time and kafka are technically
>> important and should not be ignored.  I've been banging this drum for
>> years.  These concerns haven't been fully heard and understood by
>> committers.  This is one example of why diversity of enfranchised users
>> is important and governance concerns shouldn't be ignored.
>>
>> - Jira
>> Concretely, automate closing stale jiras after X amount of time.  It's
>> really surprising to me how much reluctance a community of programmers
>> has shown towards automating its own processes around stuff like
>> this (not to mention automatic code formatting of modified files).  I
>> understand the arguments against, but the current alternative doesn't
>> work.
>> Concretely, clearly reject and close jiras.  I have a backlog of 50+
>> kafka jiras, many of which are irrelevant at this point, but I do not
>> feel that I have the political power to close them.
>> Concretely, make it clear who is working on something.  This can be as
>> simple as just "I'm working on this", assign it to me, if I don't
>> follow up in X amount of time, close it or reassign.  That doesn't
>> mean there can't be competing work, but it does mean those people
>> should talk to each other.  Conversely, if committers currently don't
>> have time to work on something that is important, make that clear in
>> the ticket.
>>
>>
>> On Fri, Oct 7, 2016 at 5:34 AM, Sean Owen <so...@cloudera.com> wrote:
>> > Suggested actions way at the bottom.
>> >
>> > On Fri, Oct 7, 2016 at 5:14 AM Matei Zaharia <matei.zaha...@gmail.com>
>> > wrote:
>> >>
>> >> since March. But it's true that other things such as the Kafka source for
>> >> it didn't have as much design on JIRA. Nonetheless, this component is still
>> >> early on and there's still a lot of time to change it, which is happening.
>> >
>> >
>> > It's hard to drive design discussions in OSS. Even when diligently
>> > publishing design docs, the doc happens after brainstorming, and that
>> > happens inside someone's head or in chats.
>> >
>> > The lazy consensus model that works for small changes doesn't work well
>> > here. If a committer wants a change, that change will basically be made
>> > modulo small edits; vetoes are for dire disagreement. (Otherwise we'd get
>> > nothing done.) However this model means it's hard to significantly change a
>> > design after draft 1.
>> >
>> > I've heard this complaint a few times, and it has never been down to bad
>> > faith. We should err further towards over-including early and often. I've
>> > seen some great discussions start more with a problem statement and an RFC,
>> > not a design doc. Keeping regular contributors enfranchised is essential, so
>> > that they're willing and able to participate when design time comes. (See
>> > below.)
>> >
>> >
>> >>
>> >> 2) About what people say at Reactive Summit -- there will always be
>> >> trolls, but just ignore them and build a great project. Those of us involved
>> >> in the project for a while have long seen similar stuff, e.g. a
>> >
>> >
>> > The hype cycle may be turning against Spark, as is normal for this stage of
>> > maturity. People idealize technologies they don't really use as greener
>> > grass; it's the things they use and need to work that they love to hate.
>> >
>> > I would not dismiss this as just trolling. Customer anecdotes I see suggest
>> > that Spark underperforms their (inflated) expectations, and generally does
>> > not Just Work. It takes expertise, tuning, patience, workarounds. And then
>> > it gets great things done. I do see a gap between how the group here talks
>> > about the technology, and how the users I see talk about it. The gap
>> > manifests in attention given to making yet more things, and attention given
>> > to fixing and project mechanics.
>> >
>> > I would also not dismiss criticism of governance. We can recognize some big
>> > problems that were resolved over even the past 3 months. Usually I hear,
>> > well, we do better than most projects, right? and that is true. But, Spark
>> > is bigger and busier than most any other project. Exceptional projects need
>> > exceptional governance and we have merely "good". See next.
>> >
>> >
>> >> 3) About number and diversity of committers -- the PMC is always working
>> >> to expand these, and you should email people on the PMC (or even the whole
>> >> list) if you have people you'd like to propose. In
>> >
>> >
>> > If you're suggesting that it's mostly a matter of asking, then this doesn't
>> > match my experience. I have seen a few people consistently soft-reject most
>> > proposals. The reasons given usually sound like "concerns about quality",
>> > which is probably the right answer to a somewhat wrong question.
>> >
>> > We should probably be asking primarily who will net-net add efficiency to
>> > some part of the project's mechanics. Per above, it wouldn't hurt to ask who
>> > would expand coverage and add diversity of perspective too.
>> >
>> > I disagree that committers are being added at a sufficient rate. The overall
>> > committer-attention hours are dropping as the project grows -- am I the only
>> > one that perceives many regular committers aren't working nearly as much as
>> > before on the project?
>> >
>> > I call it a problem because we have IMHO people who 'qualify', and not
>> > giving them some stake is going to cost the project down the road. Always Be
>> > Recruiting. This is what I would worry about, since the governance and
>> > enfranchisement issues above kind of stem from this.
>> >
>> >
>> >>
>> >> 4) Finally, about better organizing JIRA, marking dead issues, etc, this
>> >> would be great and I think we just need a concrete proposal for how to do
>> >> it. It would be best to point to an existing process that someone else has
>> >> used here BTW so that we can see it in action.
>> >
>> >
>> > I don't think we're wanting for proposals. I went on and on about it last
>> > year, and don't think anyone disagreed about actions. I wouldn't suggest
>> > that clearing out dead issues is more complex than just putting in time to
>> > do it. It's just grunt work and understandably not appealing. (Thank you
>> > Xiao for your recent run at SQL JIRAs.)
>> >
>> > It requires saying 'no', which is hard, because it requires some conviction.
>> > I have encountered reluctance to do this in Spark and think that culture
>> > should change. Is it weird to say that a broader group of gatekeepers can
>> > actually tackle the triage issue with more confidence and efficiency? That
>> > pushing back on 'bad' contributions actually increases the rate of 'good'
>> > ones?
>> >
>> > FWIW I also find the project unpleasant to deal with day to day, mostly
>> > because of the scale of the triage, and think we could use all the qualified
>> > help we can get. I am looking to do less with the project over time, which
>> > is no big deal in itself, but is a big deal if these several factors are
>> > adding up to discourage fresh blood from joining the fray. Cody makes me
>> > think there are, at least, 2 of us.
>> >
>> > Concrete steps?
>> >
>> > Go to spark-prs.com. Look at "Users". Look at your open PRs. Are any stale?
>> > can you close them or advance them?
>> >
>> > Look at the Stale PRs tab and sort by last updated. Do any look dead? can
>> > you ask the author to update or close? does the parent JIRA look like it's
>> > not otherwise relevant?
>> >
>> > Go download JIRA Client at http://almworks.com/jiraclient/download.html
>> > Go look at all open JIRAs sorted by last update. Are any pretty obviously
>> > obsolete?
>> >
>> > If you don't feel comfortable acting, feel free to at least propose a list
>> > to dev@ for a look.
>>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>
