Re: Spark Improvement Proposals

Cody Koeninger Mon, 10 Oct 2016 16:16:22 -0700

If someone wants to tell me that it's OK and "The Apache Way" for
Kafka and Flink to have a proposal process that ends in a lazy
majority, but it's not OK for Spark to have a proposal process that
ends in a non-lazy consensus...


https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals#KafkaImprovementProposals-Process

In practice any PMC member can stop a proposal they don't like, so I'm
not sure how much it matters.



On Mon, Oct 10, 2016 at 5:59 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
> There is a larger issue to keep in mind, and that is that what you are
> proposing is a procedure that, as far as I am aware, hasn't previously been
> adopted in an Apache project, and thus is not an easy or exact fit with
> established practices that have been blessed as "The Apache Way".  As such,
> we need to be careful, because we have run into some trouble in the past
> with some inside the ASF but essentially outside the Spark community who
> didn't like the way we were doing things.
>
> On Mon, Oct 10, 2016 at 3:53 PM, Cody Koeninger <c...@koeninger.org> wrote:
>>
>> Apache documents say lots of confusing stuff, including that committers are
>> in practice given a vote.
>>
>> https://www.apache.org/foundation/voting.html
>>
>> I don't care either way. If someone wants me to sub committer for PMC in
>> the voting section, fine; we just need a clear outcome.
>>
>>
>> On Oct 10, 2016 17:36, "Mark Hamstra" <m...@clearstorydata.com> wrote:
>>>
>>> If I'm correctly understanding the kind of voting that you are talking
>>> about, then to be accurate, it is only the PMC members that have a vote, not
>>> all committers:
>>> https://www.apache.org/foundation/how-it-works.html#pmc-members
>>>
>>> On Mon, Oct 10, 2016 at 12:02 PM, Cody Koeninger <c...@koeninger.org>
>>> wrote:
>>>>
>>>> I think the main value is in being honest about what's going on.  No
>>>> one other than committers can cast a meaningful vote, that's the
>>>> reality.  Beyond that, if people think it's more open to allow formal
>>>> proposals from anyone, I'm not necessarily against it, but my main
>>>> question would be this:
>>>>
>>>> If anyone can submit a proposal, are committers actually going to
>>>> clearly reject and close proposals that don't meet the requirements?
>>>>
>>>> Right now we have a serious problem with lack of clarity regarding
>>>> contributions, and that cannot spill over into goal-setting.
>>>>
>>>> On Mon, Oct 10, 2016 at 1:54 PM, Ryan Blue <rb...@netflix.com> wrote:
>>>> > +1 to votes to approve proposals. I agree that proposals should have an
>>>> > official mechanism to be accepted, and a vote is an established means of
>>>> > doing that well. I like that it includes a period to review the proposal,
>>>> > and I think proposals should have been discussed enough ahead of a vote
>>>> > to survive the possibility of a veto.
>>>> >
>>>> > I also like the names that are short and (mostly) unique, like SEP.
>>>> >
>>>> > Where I disagree is with the requirement that a committer must formally
>>>> > propose an enhancement. I don't see the value of restricting this: if
>>>> > someone has the will to write up a proposal then they should be
>>>> > encouraged to do so and start a discussion about it. Even if there is a
>>>> > political reality as Cody says, what is the value of codifying that in
>>>> > our process? I think restricting who can submit proposals would only
>>>> > undermine them by pushing contributors out. Maybe I'm missing something
>>>> > here?
>>>> >
>>>> > rb
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Oct 10, 2016 at 7:41 AM, Cody Koeninger <c...@koeninger.org> wrote:
>>>> >>
>>>> >> Yes, users suggesting SIPs is a good thing and is explicitly called
>>>> >> out in the linked document under the Who? section.  Formally
>>>> >> proposing them, not so much, because of the political realities.
>>>> >>
>>>> >> Yes, implementation strategy definitely affects goals.  There are all
>>>> >> kinds of examples of this, I'll pick one that's my fault so as to
>>>> >> avoid sounding like I'm blaming:
>>>> >>
>>>> >> When I implemented the Kafka DStream, one of my (not explicitly
>>>> >> agreed upon by the community) goals was to make sure people could use
>>>> >> the DStream however they were already using Kafka at work.  The lack
>>>> >> of explicit agreement on that goal led to all kinds of fighting with
>>>> >> committers that could have been avoided.  The lack of explicit
>>>> >> up-front strategy discussion led to the DStream not really working
>>>> >> with compacted topics.  I knew about compacted topics, but don't have
>>>> >> a use for them, so had a blind spot there.  If there had been explicit
>>>> >> up-front discussion that my strategy was "assume that batches can be
>>>> >> defined on the driver solely by beginning and ending offsets",
>>>> >> there's a greater chance that a user would have seen that and said,
>>>> >> "hey, what about non-contiguous offsets in a compacted topic".
>>>> >>
>>>> >> This kind of thing is only going to happen smoothly if we have a
>>>> >> lightweight user-visible process with clear outcomes.
>>>> >>
>>>> >> On Mon, Oct 10, 2016 at 1:34 AM, assaf.mendelson
>>>> >> <assaf.mendel...@rsa.com> wrote:
>>>> >> > I agree with most of what Cody said.
>>>> >> >
>>>> >> > Two things:
>>>> >> >
>>>> >> > First we can always have other people suggest SIPs but mark them as
>>>> >> > “unreviewed” and have committers basically move them forward. The
>>>> >> > problem is that writing a good document takes time. This way we can
>>>> >> > leverage non-committers to do some of this work (it is just another
>>>> >> > way to contribute).
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > As for strategy, in many cases implementation strategy can affect
>>>> >> > the goals. I will give a small example: In the current structured
>>>> >> > streaming strategy, we group by the time to achieve a sliding
>>>> >> > window. This is definitely an implementation decision and not a
>>>> >> > goal. However, I can think of several aggregation functions which
>>>> >> > have the time inside their calculation buffer. For example, let’s
>>>> >> > say we want to return a set of all distinct values. One way to
>>>> >> > implement this would be to make the set into a map and have the
>>>> >> > value contain the last time seen. Multiplying it across the groupby
>>>> >> > would cost a lot in performance. So adding such a strategy would
>>>> >> > have a great effect on the type of aggregations and their
>>>> >> > performance, which does affect the goal. Without adding the
>>>> >> > strategy, it is easy for whoever goes to the design document to not
>>>> >> > think about these cases. Furthermore, it might be decided that these
>>>> >> > cases are rare enough so that the strategy is still good enough, but
>>>> >> > how would we know it without user feedback?
>>>> >> >
>>>> >> > I believe this example is exactly what Cody was talking about.
>>>> >> > Since implementation strategies often have a large effect on the
>>>> >> > goal, we should discuss them when discussing the goals. In addition,
>>>> >> > while it is often easy to throw out completely infeasible goals, it
>>>> >> > is often much harder to figure out that the goals are infeasible
>>>> >> > without fine tuning.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > Assaf.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > From: Cody Koeninger-2 [via Apache Spark Developers List]
>>>> >> > [mailto:ml-node+[hidden email]]
>>>> >> > Sent: Monday, October 10, 2016 2:25 AM
>>>> >> > To: Mendelson, Assaf
>>>> >> > Subject: Re: Spark Improvement Proposals
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > Only committers should formally submit SIPs because in an Apache
>>>> >> > project only committers have explicit political power.  If a user
>>>> >> > can't find a committer willing to sponsor an SIP idea, they have no
>>>> >> > way to get the idea passed in any case.  If I can't find a committer
>>>> >> > to sponsor this meta-SIP idea, I'm out of luck.
>>>> >> >
>>>> >> > I do not believe unrealistic goals can be found solely by
>>>> >> > inspection.  We've managed to ignore unrealistic goals even after
>>>> >> > implementation!  Focusing on APIs can allow people to think they've
>>>> >> > solved something, when there's really no way of implementing that
>>>> >> > API while meeting the goals.  Rapid iteration is clearly the best
>>>> >> > way to address this, but we've already talked about why that hasn't
>>>> >> > really worked.  If adding a non-binding API section to the template
>>>> >> > is important to you, I'm not against it, but I don't think it's
>>>> >> > sufficient.
>>>> >> >
>>>> >> > On your PRD vs design doc spectrum, I'm saying this is closer to a
>>>> >> > PRD.  Clear agreement on goals is the most important thing and
>>>> >> > that's why it's the thing I want binding agreement on.  But I cannot
>>>> >> > agree to goals unless I have enough minimal technical info to judge
>>>> >> > whether the goals are likely to actually be accomplished.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > On Sun, Oct 9, 2016 at 5:35 PM, Matei Zaharia <[hidden email]>
>>>> >> > wrote:
>>>> >> >
>>>> >> >
>>>> >> >> Well, I think there are a few things here that don't make sense.
>>>> >> >> First, why should only committers submit SIPs? Development in the
>>>> >> >> project should be open to all contributors, whether they're
>>>> >> >> committers or not. Second, I think unrealistic goals can be found
>>>> >> >> just by inspecting the goals, and I'm not super worried that we'll
>>>> >> >> accept a lot of SIPs that are then infeasible -- we can then submit
>>>> >> >> new ones. But this depends on whether you want this process to be
>>>> >> >> a "design doc lite", where people also agree on implementation
>>>> >> >> strategy, or just a way to agree on goals. This is what I asked
>>>> >> >> earlier about PRDs vs design docs (and I'm open to either one but
>>>> >> >> I'd just like clarity). Finally, both as a user and designer of
>>>> >> >> software, I always want to give feedback on APIs, so I'd really
>>>> >> >> like a culture of having those early. People don't argue about
>>>> >> >> prettiness when they discuss APIs, they argue about the core
>>>> >> >> concepts to expose in order to meet various goals, and then they're
>>>> >> >> stuck maintaining those for a long time.
>>>> >> >>
>>>> >> >> Matei
>>>> >> >>
>>>> >> >> On Oct 9, 2016, at 3:10 PM, Cody Koeninger <[hidden email]> wrote:
>>>> >> >>
>>>> >> >> Users instead of people, sure.  Committers and contributors are (or
>>>> >> >> at least should be) a subset of users.
>>>> >> >>
>>>> >> >> Non goals, sure. I don't care what the name is, but we need to
>>>> >> >> clearly say e.g. 'no we are not maintaining compatibility with XYZ
>>>> >> >> right now'.
>>>> >> >>
>>>> >> >> API, what I care most about is whether it allows me to accomplish
>>>> >> >> the goals.  Arguing about how ugly or pretty it is can be saved for
>>>> >> >> design/implementation imho.
>>>> >> >>
>>>> >> >> Strategy, this is necessary because otherwise goals can be out of
>>>> >> >> line with reality.  Don't propose goals you don't have at least
>>>> >> >> some idea of how to implement.
>>>> >> >>
>>>> >> >> Rejected strategies, given that committers are the only ones I'm
>>>> >> >> saying should formally submit SPARKLIs or SIPs, if they put junk in
>>>> >> >> a required section then slap them down for it and tell them to fix
>>>> >> >> it.
>>>> >> >>
>>>> >> >>
>>>> >> >> On Oct 9, 2016 4:36 PM, "Matei Zaharia" <[hidden email]> wrote:
>>>> >> >>>
>>>> >> >>> Yup, this is the stuff that I found unclear. Thanks for clarifying
>>>> >> >>> here, but we should also clarify it in the writeup. In particular:
>>>> >> >>>
>>>> >> >>> - Goals needs to be about user-facing behavior ("people" is broad)
>>>> >> >>>
>>>> >> >>> - I'd rename Rejected Goals to Non-Goals. Otherwise someone will
>>>> >> >>> dig up one of these and say "Spark's developers have officially
>>>> >> >>> rejected X, which our awesome system has".
>>>> >> >>>
>>>> >> >>> - For user-facing stuff, I think you need a section on API.
>>>> >> >>> Virtually all other *IPs I've seen have that.
>>>> >> >>>
>>>> >> >>> - I'm still not sure why the strategy section is needed if the
>>>> >> >>> purpose is to define user-facing behavior -- unless this is the
>>>> >> >>> strategy for setting the goals or for defining the API. That sounds
>>>> >> >>> squarely like a design doc issue. In some sense, who cares whether
>>>> >> >>> the proposal is technically feasible right now? If it's infeasible,
>>>> >> >>> that will be discovered later during design and implementation.
>>>> >> >>> Same thing with rejected strategies -- listing some of those is
>>>> >> >>> definitely useful sometimes, but if you make this a *required*
>>>> >> >>> section, people are just going to fill it in with bogus stuff (I've
>>>> >> >>> seen this happen before).
>>>> >> >>>
>>>> >> >>> Matei
>>>> >> >>>
>>>> >> >
>>>> >> >>> > On Oct 9, 2016, at 2:14 PM, Cody Koeninger <[hidden email]>
>>>> >> >>> > wrote:
>>>> >> >>> >
>>>> >> >>> > So to focus the discussion on the specific strategy I'm
>>>> >> >>> > suggesting, documented at
>>>> >> >>> >
>>>> >> >>> > https://github.com/koeninger/spark-1/blob/SIP-0/docs/spark-improvement-proposals.md
>>>> >> >>> >
>>>> >> >>> > "Goals: What must this allow people to do, that they can't
>>>> >> >>> > currently?"
>>>> >> >>> >
>>>> >> >>> > Is it unclear that this is focusing specifically on
>>>> >> >>> > people-visible behavior?
>>>> >> >>> >
>>>> >> >>> > Rejected goals - are important because otherwise people keep
>>>> >> >>> > trying to argue about scope.  Of course you can change things
>>>> >> >>> > later with a different SIP and different vote, the point is to
>>>> >> >>> > focus.
>>>> >> >>> >
>>>> >> >>> > Use cases - are something that people are going to bring up in
>>>> >> >>> > discussion.  If they aren't clearly documented as a goal ("This
>>>> >> >>> > must allow me to connect using SSL"), they should be added.
>>>> >> >>> >
>>>> >> >>> > Internal architecture - if the people who need specific behavior
>>>> >> >>> > are implementers of other parts of the system, that's fine.
>>>> >> >>> >
>>>> >> >>> > Rejected strategies - If you have none of these, you have no
>>>> >> >>> > evidence that the proponent didn't just go with the first thing
>>>> >> >>> > they had in mind (or have already implemented), which is a big
>>>> >> >>> > problem currently.  Approval isn't binding as to specifics of
>>>> >> >>> > implementation, so these aren't handcuffs.  The goals are the
>>>> >> >>> > contract, the strategy is evidence that contract can actually be
>>>> >> >>> > met.
>>>> >> >>> >
>>>> >> >>> > Design docs - I'm not touching design docs.  The markdown file I
>>>> >> >>> > linked specifically says of the strategy section "This is not a
>>>> >> >>> > full design document."  Is this unclear?  Design docs can be
>>>> >> >>> > worked on obviously, but that's not what I'm concerned with here.
>>>> >> >>> >
>>>> >> >>> >
>>>> >> >>> >
>>>> >> >>> >
>>>> >> >>> > On Sun, Oct 9, 2016 at 2:34 PM, Matei Zaharia <[hidden email]>
>>>> >> >>> > wrote:
>>>> >> >>> >> Hi Cody,
>>>> >> >>> >>
>>>> >> >>> >> I think this would be a lot more concrete if we had a more
>>>> >> >>> >> detailed template for SIPs. Right now, it's not super clear
>>>> >> >>> >> what's in scope -- e.g. are they a way to solicit feedback on
>>>> >> >>> >> the user-facing behavior or on the internals? "Goals" can cover
>>>> >> >>> >> both things. I've been thinking of SIPs more as Product
>>>> >> >>> >> Requirements Docs (PRDs), which focus on *what* a code change
>>>> >> >>> >> should do as opposed to how.
>>>> >> >>> >>
>>>> >> >>> >> In particular, here are some things that you may or may not
>>>> >> >>> >> consider in scope for SIPs:
>>>> >> >>> >>
>>>> >> >>> >> - Goals and non-goals: This is definitely in scope, and IMO
>>>> >> >>> >> should focus on user-visible behavior (e.g. "system supports SQL
>>>> >> >>> >> window functions" or "system continues working if one node
>>>> >> >>> >> fails"). BTW I wouldn't say "rejected goals" because some of
>>>> >> >>> >> them might become goals later, so we're not definitively
>>>> >> >>> >> rejecting them.
>>>> >> >>> >>
>>>> >> >>> >> - Public API: Probably should be included in most SIPs unless
>>>> >> >>> >> it's too large to fully specify then (e.g. "let's add an ML
>>>> >> >>> >> library").
>>>> >> >>> >>
>>>> >> >>> >> - Use cases: I usually find this very useful in PRDs to better
>>>> >> >>> >> communicate the goals.
>>>> >> >>> >>
>>>> >> >>> >> - Internal architecture: This is usually *not* a thing users can
>>>> >> >>> >> easily comment on and it sounds more like a design doc item. Of
>>>> >> >>> >> course it's important to show that the SIP is feasible to
>>>> >> >>> >> implement. One exception, however, is that I think we'll have
>>>> >> >>> >> some SIPs primarily on internals (e.g. if somebody wants to
>>>> >> >>> >> refactor Spark's query optimizer or something).
>>>> >> >>> >>
>>>> >> >>> >> - Rejected strategies: I personally wouldn't put this, because
>>>> >> >>> >> what's the point of voting to reject a strategy before you've
>>>> >> >>> >> really begun designing and implementing something? What if you
>>>> >> >>> >> discover that the strategy is actually better when you start
>>>> >> >>> >> doing stuff?
>>>> >> >>> >>
>>>> >> >>> >> At a super high level, it depends on whether you want the SIPs
>>>> >> >>> >> to be PRDs for getting some quick feedback on the goals of a
>>>> >> >>> >> feature before it is designed, or something more like
>>>> >> >>> >> full-fledged design docs (just a more visible design doc for
>>>> >> >>> >> bigger changes). I looked at Kafka's KIPs, and they actually
>>>> >> >>> >> seem to be more like design docs. This can work too but it does
>>>> >> >>> >> require more work from the proposer and it can lead to the same
>>>> >> >>> >> problems you mentioned with people already having a design and
>>>> >> >>> >> implementation in mind.
>>>> >> >>> >>
>>>> >> >>> >> Basically, the question is, are you trying to iterate faster on
>>>> >> >>> >> design by adding a step for user feedback earlier? Or are you
>>>> >> >>> >> just trying to make design docs for key features more visible
>>>> >> >>> >> (and their approval more formal)?
>>>> >> >>> >>
>>>> >> >>> >> BTW note that in either case, I'd like to have a template for
>>>> >> >>> >> design docs too, which should also include goals. I think that
>>>> >> >>> >> would've avoided some of the issues you brought up.
>>>> >> >>> >>
>>>> >> >>> >> Matei
>>>> >> >>> >>
>>>> >> >>> >> On Oct 9, 2016, at 10:40 AM, Cody Koeninger <[hidden email]>
>>>> >> >>> >> wrote:
>>>> >> >>> >>
>>>> >> >>> >> Here's my specific proposal (meta-proposal?)
>>>> >> >>> >>
>>>> >> >>> >> Spark Improvement Proposals (SIP)
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >> Background:
>>>> >> >>> >>
>>>> >> >>> >> The current problem is that design and implementation of large
>>>> >> >>> >> features are often done in private, before soliciting user
>>>> >> >>> >> feedback.
>>>> >> >>> >>
>>>> >> >>> >> When feedback is solicited, it is often as to detailed design
>>>> >> >>> >> specifics, not focused on goals.
>>>> >> >>> >>
>>>> >> >>> >> When implementation does take place after design, there is
>>>> >> >>> >> often disagreement as to what goals are or are not in scope.
>>>> >> >>> >>
>>>> >> >>> >> This results in commits that don't fully meet user needs.
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >> Goals:
>>>> >> >>> >>
>>>> >> >>> >> - Ensure user, contributor, and committer goals are clearly
>>>> >> >>> >> identified and agreed upon, before implementation takes place.
>>>> >> >>> >>
>>>> >> >>> >> - Ensure that a technically feasible strategy is chosen that is
>>>> >> >>> >> likely to meet the goals.
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >> Rejected Goals:
>>>> >> >>> >>
>>>> >> >>> >> - SIPs are not for detailed design.  Design by committee
>>>> >> >>> >> doesn't work.
>>>> >> >>> >>
>>>> >> >>> >> - SIPs are not for every change.  We don't need that much
>>>> >> >>> >> process.
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >> Strategy:
>>>> >> >>> >>
>>>> >> >>> >> My suggestion is outlined as a Spark Improvement Proposal
>>>> >> >>> >> process documented at
>>>> >> >>> >>
>>>> >> >>> >> https://github.com/koeninger/spark-1/blob/SIP-0/docs/spark-improvement-proposals.md
>>>> >> >>> >>
>>>> >> >>> >> Specifics of Jira manipulation are an implementation detail we
>>>> >> >>> >> can figure out.
>>>> >> >>> >>
>>>> >> >>> >> I'm suggesting voting; the need here is for a _clear_ outcome.
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >> Rejected Strategies:
>>>> >> >>> >>
>>>> >> >>> >> Having someone who understands the problem implement it first
>>>> >> >>> >> works, but only if significant iteration after user feedback is
>>>> >> >>> >> allowed.
>>>> >> >>> >>
>>>> >> >>> >> Historically this has been problematic due to pressure to limit
>>>> >> >>> >> public API changes.
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >> On Fri, Oct 7, 2016 at 5:16 PM, Reynold Xin <[hidden email]>
>>>> >> >>> >> wrote:
>>>> >> >>> >>>
>>>> >> >>> >>> Alright, looks like there is quite a bit of support. We should
>>>> >> >>> >>> wait to hear from more people too.
>>>> >> >>> >>>
>>>> >> >>> >>> To push this forward, Cody and I will be working together in
>>>> >> >>> >>> the next couple of weeks to come up with a concrete, detailed
>>>> >> >>> >>> proposal on what this entails, and then we can discuss the
>>>> >> >>> >>> specific proposal as well.
>>>> >> >>> >>>
>>>> >> >>> >>>
>>>> >> >>> >>> On Fri, Oct 7, 2016 at 2:29 PM, Cody Koeninger <[hidden email]>
>>>> >> >>> >>> wrote:
>>>> >> >>> >>>>
>>>> >> >>> >>>> Yeah, in case it wasn't clear, I was talking about SIPs for
>>>> >> >>> >>>> major user-facing or cross-cutting changes, not minor feature
>>>> >> >>> >>>> adds.
>>>> >> >>> >>>>
>>>> >> >>> >>>> On Fri, Oct 7, 2016 at 3:58 PM, Stavros Kontopoulos
>>>> >> >>> >>>> <[hidden email]> wrote:
>>>> >> >>> >>>>>
>>>> >> >>> >>>>> +1 to the SIP label as long as it does not slow things down
>>>> >> >>> >>>>> and it targets optimizing efforts, coordination, etc. For
>>>> >> >>> >>>>> example, really small features (assuming they don't touch
>>>> >> >>> >>>>> public interfaces) or refactorings should not need to go
>>>> >> >>> >>>>> through this process, and I hope it will be kept this way.
>>>> >> >>> >>>>> So a guideline doc should be provided, like in the KIP case.
>>>> >> >>> >>>>>
>>>> >> >>> >>>>> IMHO, aside from tagging things and linking them elsewhere,
>>>> >> >>> >>>>> simply having design docs and prototype implementations in
>>>> >> >>> >>>>> PRs is not something that has failed so far. What is really
>>>> >> >>> >>>>> a pain in many projects out there is discontinuity in the
>>>> >> >>> >>>>> progress of PRs, missing features, and slow reviews, which
>>>> >> >>> >>>>> is understandable to some extent... It is not only about
>>>> >> >>> >>>>> Spark, but things can surely be improved for this project
>>>> >> >>> >>>>> in particular, as already stated.
>>>> >> >>> >>>>>
>>>> >> >>> >>>>> On Fri, Oct 7, 2016 at 11:14 PM, Cody Koeninger
>>>> >> >>> >>>>> <[hidden email]> wrote:
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>> +1 to adding an SIP label and linking it from the website.
>>>> >> >>> >>>>>> I think it needs
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>> - template that focuses it towards soliciting user goals /
>>>> >> >>> >>>>>> non goals
>>>> >> >>> >>>>>> - clear resolution as to which strategy was chosen to
>>>> >> >>> >>>>>> pursue.  I'd recommend a vote.
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>> Matei asked me to clarify what I meant by changing
>>>> >> >>> >>>>>> interfaces, I think it's directly relevant to the SIP idea
>>>> >> >>> >>>>>> so I'll clarify here, and split a thread for the other
>>>> >> >>> >>>>>> discussion per Nicholas' request.
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>> I meant changing public user interfaces.  I think the
>>>> >> >>> >>>>>> first design is unlikely to be right, because it's done at
>>>> >> >>> >>>>>> a time when you have the least information.  As a user, I
>>>> >> >>> >>>>>> find it considerably more frustrating to be unable to use
>>>> >> >>> >>>>>> a tool to get my job done, than I do having to make minor
>>>> >> >>> >>>>>> changes to my code in order to take advantage of features.
>>>> >> >>> >>>>>> I've seen committers be seriously reluctant to allow
>>>> >> >>> >>>>>> changes to @experimental code that are needed in order for
>>>> >> >>> >>>>>> it to really work right.  You need to be able to iterate,
>>>> >> >>> >>>>>> and if people on both sides of the fence aren't going to
>>>> >> >>> >>>>>> respect that some newer apis are subject to change, then
>>>> >> >>> >>>>>> why even mark them as such?
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>> Ideally a finished SIP should give me a checklist of
>>>> >> >>> >>>>>> things that an implementation must do, and things that it
>>>> >> >>> >>>>>> doesn't need to do.  Contributors/committers should be
>>>> >> >>> >>>>>> seriously discouraged from putting out a version 0.1 that
>>>> >> >>> >>>>>> doesn't have at least a prototype implementation of all
>>>> >> >>> >>>>>> those things, especially if they're then going to argue
>>>> >> >>> >>>>>> against interface changes necessary to get the rest of the
>>>> >> >>> >>>>>> things done in the 0.2 version.
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>> On Fri, Oct 7, 2016 at 2:18 PM, Reynold Xin
>>>> >> >>> >>>>>> <[hidden email]> wrote:
>>>> >> >>> >>>>>>> I like the lightweight proposal to add a SIP label.
>>>> >> >>> >>>>>>>
>>>> >> >>> >>>>>>> During Spark 2.0 development, Tom (Graves) and I
>>>> >> >>> >>>>>>> suggested using the wiki to track the list of major
>>>> >> >>> >>>>>>> changes, but that never really materialized due to the
>>>> >> >>> >>>>>>> overhead. Adding a SIP label on major JIRAs and then
>>>> >> >>> >>>>>>> linking to them prominently on the Spark website makes a
>>>> >> >>> >>>>>>> lot of sense.
>>>> >> >>> >>>>>>>
>>>> >> >>> >>>>>>>
>>>> >> >>> >>>>>>> On Fri, Oct 7, 2016 at 10:50 AM, Matei Zaharia
>>>> >> >>> >>>>>>> <[hidden email]> wrote:
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> For the improvement proposals, I think one major point
>>>> >> >>> >>>>>>>> was to make them really visible to users who are not
>>>> >> >>> >>>>>>>> contributors, so we should do more than sending stuff to
>>>> >> >>> >>>>>>>> dev@. One very lightweight idea is to have a new type of
>>>> >> >>> >>>>>>>> JIRA called a SIP and have a link to a filter that shows
>>>> >> >>> >>>>>>>> all such JIRAs from http://spark.apache.org. I also like
>>>> >> >>> >>>>>>>> the idea of SIP and design doc templates (in fact many
>>>> >> >>> >>>>>>>> projects have them).
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> Matei
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> On Oct 7, 2016, at 10:38 AM, Reynold Xin
>>>> >> >>> >>>>>>>> <[hidden email]> wrote:
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> I called Cody last night and talked about some of the
>>>> >> >>> >>>>>>>> topics in his email. It became clear to me Cody
>>>> >> >>> >>>>>>>> genuinely cares about the project.
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> Some of the frustrations come from the success of the
>>>> >> >>> >>>>>>>> project itself becoming very "hot", and it is difficult
>>>> >> >>> >>>>>>>> to get clarity from people who don't dedicate all their
>>>> >> >>> >>>>>>>> time to Spark. In fact, it is in some ways similar to
>>>> >> >>> >>>>>>>> scaling an engineering team in a successful startup: old
>>>> >> >>> >>>>>>>> processes that worked well might not work so well when
>>>> >> >>> >>>>>>>> it gets to a certain size, cultures can get diluted,
>>>> >> >>> >>>>>>>> building culture vs building process, etc.
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> I would also really like to have a more visible process for
>>>> >> >>> >>>>>>>> larger changes, especially major user-facing API changes.
>>>> >> >>> >>>>>>>> Historically we have uploaded design docs for major changes,
>>>> >> >>> >>>>>>>> but not consistently, and it is difficult to ensure the
>>>> >> >>> >>>>>>>> quality of the docs, due to the volunteer nature of the
>>>> >> >>> >>>>>>>> organization.
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> Some of the more concrete ideas we discussed focus on
>>>> >> >>> >>>>>>>> building a
>>>> >> >>> >>>>>>>> culture
>>>> >> >>> >>>>>>>> to improve clarity:
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> - Process: Large changes should have design docs posted
>>>> >> >>> >>>>>>>> on
>>>> >> >>> >>>>>>>> JIRA.
>>>> >> >>> >>>>>>>> One
>>>> >> >>> >>>>>>>> thing
>>>> >> >>> >>>>>>>> Cody and I didn't discuss but an idea that just came to
>>>> >> >>> >>>>>>>> me is
>>>> >> >>> >>>>>>>> we
>>>> >> >>> >>>>>>>> should
>>>> >> >>> >>>>>>>> create a design doc template for the project and ask
>>>> >> >>> >>>>>>>> everybody
>>>> >> >>> >>>>>>>> to
>>>> >> >>> >>>>>>>> follow.
>>>> >> >>> >>>>>>>> The design doc template should also explicitly list
>>>> >> >>> >>>>>>>> goals and
>>>> >> >>> >>>>>>>> non-goals, to
>>>> >> >>> >>>>>>>> make design docs more consistent.
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> - Process: Email dev@ to solicit feedback. We have done this
>>>> >> >>> >>>>>>>> with some changes, but again very inconsistently. Just posting
>>>> >> >>> >>>>>>>> something
>>>> >> >>> >>>>>>>> on
>>>> >> >>> >>>>>>>> JIRA
>>>> >> >>> >>>>>>>> isn't
>>>> >> >>> >>>>>>>> sufficient, because there are simply too many JIRAs and
>>>> >> >>> >>>>>>>> the signal gets lost
>>>> >> >>> >>>>>>>> in the noise. While this is generally impossible to
>>>> >> >>> >>>>>>>> enforce
>>>> >> >>> >>>>>>>> because
>>>> >> >>> >>>>>>>> we can't
>>>> >> >>> >>>>>>>> force all volunteers to conform to a process (or they
>>>> >> >>> >>>>>>>> might
>>>> >> >>> >>>>>>>> not
>>>> >> >>> >>>>>>>> even
>>>> >> >>> >>>>>>>> be
>>>> >> >>> >>>>>>>> aware of this),  those who are more familiar with the
>>>> >> >>> >>>>>>>> project
>>>> >> >>> >>>>>>>> can
>>>> >> >>> >>>>>>>> help by
>>>> >> >>> >>>>>>>> emailing dev@ when they see something that hasn't
>>>> >> >>> >>>>>>>> been posted there.
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> - Culture: The design doc author(s) should be open to
>>>> >> >>> >>>>>>>> feedback.
>>>> >> >>> >>>>>>>> A
>>>> >> >>> >>>>>>>> design
>>>> >> >>> >>>>>>>> doc should serve as the base for discussion and is by no
>>>> >> >>> >>>>>>>> means
>>>> >> >>> >>>>>>>> the
>>>> >> >>> >>>>>>>> final
>>>> >> >>> >>>>>>>> design. Of course, this does not mean the author has to
>>>> >> >>> >>>>>>>> accept every piece of feedback. They should also be
>>>> >> >>> >>>>>>>> comfortable accepting /
>>>> >> >>> >>>>>>>> rejecting
>>>> >> >>> >>>>>>>> ideas on
>>>> >> >>> >>>>>>>> technical grounds.
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> - Process / Culture: For major ongoing projects, it can
>>>> >> >>> >>>>>>>> be
>>>> >> >>> >>>>>>>> useful
>>>> >> >>> >>>>>>>> to
>>>> >> >>> >>>>>>>> have
>>>> >> >>> >>>>>>>> some monthly Google hangouts that are open to the world.
>>>> >> >>> >>>>>>>> I am
>>>> >> >>> >>>>>>>> actually not
>>>> >> >>> >>>>>>>> sure how well this will work, because of the
>>>> >> >>> >>>>>>>> volunteering
>>>> >> >>> >>>>>>>> nature
>>>> >> >>> >>>>>>>> and
>>>> >> >>> >>>>>>>> we need
>>>> >> >>> >>>>>>>> to adjust for timezones for people across the globe, but
>>>> >> >>> >>>>>>>> it
>>>> >> >>> >>>>>>>> seems
>>>> >> >>> >>>>>>>> worth
>>>> >> >>> >>>>>>>> trying.
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> - Culture: Contributors (including committers) should be
>>>> >> >>> >>>>>>>> more
>>>> >> >>> >>>>>>>> direct
>>>> >> >>> >>>>>>>> in
>>>> >> >>> >>>>>>>> setting expectations, including whether they are working
>>>> >> >>> >>>>>>>> on a
>>>> >> >>> >>>>>>>> specific
>>>> >> >>> >>>>>>>> issue, whether they will be working on a specific issue,
>>>> >> >>> >>>>>>>> and
>>>> >> >>> >>>>>>>> whether
>>>> >> >>> >>>>>>>> an
>>>> >> >>> >>>>>>>> issue, PR, or JIRA should be rejected. Most people I
>>>> >> >>> >>>>>>>> know in
>>>> >> >>> >>>>>>>> this
>>>> >> >>> >>>>>>>> community
>>>> >> >>> >>>>>>>> are nice and don't enjoy telling other people no, but it
>>>> >> >>> >>>>>>>> is
>>>> >> >>> >>>>>>>> often
>>>> >> >>> >>>>>>>> more
>>>> >> >>> >>>>>>>> annoying to a contributor to not know anything than
>>>> >> >>> >>>>>>>> to get a no.
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>> On Fri, Oct 7, 2016 at 10:03 AM, Matei Zaharia
>>>> >> >>> >>>>>>>> <[hidden email]>
>>>> >> >>> >>>>>>>> wrote:
>>>> >> >>> >>>>>>>>>
>>>> >> >>> >>>>>>>>>
>>>> >> >>> >>>>>>>>> Love the idea of a more visible "Spark Improvement
>>>> >> >>> >>>>>>>>> Proposal"
>>>> >> >>> >>>>>>>>> process that
>>>> >> >>> >>>>>>>>> solicits user input on new APIs. For what it's worth, I
>>>> >> >>> >>>>>>>>> don't
>>>> >> >>> >>>>>>>>> think
>>>> >> >>> >>>>>>>>> committers are trying to minimize their own work --
>>>> >> >>> >>>>>>>>> every
>>>> >> >>> >>>>>>>>> committer
>>>> >> >>> >>>>>>>>> cares
>>>> >> >>> >>>>>>>>> about making the software useful for users. However, it
>>>> >> >>> >>>>>>>>> is
>>>> >> >>> >>>>>>>>> always
>>>> >> >>> >>>>>>>>> hard to
>>>> >> >>> >>>>>>>>> get user input and so it helps to have this kind of
>>>> >> >>> >>>>>>>>> process.
>>>> >> >>> >>>>>>>>> I've
>>>> >> >>> >>>>>>>>> certainly
>>>> >> >>> >>>>>>>>> looked at the *IPs a lot in other software I use just
>>>> >> >>> >>>>>>>>> to see
>>>> >> >>> >>>>>>>>> the
>>>> >> >>> >>>>>>>>> biggest
>>>> >> >>> >>>>>>>>> things on the roadmap.
>>>> >> >>> >>>>>>>>>
>>>> >> >>> >>>>>>>>> When you're talking about "changing interfaces", are
>>>> >> >>> >>>>>>>>> you
>>>> >> >>> >>>>>>>>> talking
>>>> >> >>> >>>>>>>>> about
>>>> >> >>> >>>>>>>>> public or internal APIs? I do think many people hate
>>>> >> >>> >>>>>>>>> changing
>>>> >> >>> >>>>>>>>> public APIs
>>>> >> >>> >>>>>>>>> and I actually think that's for the best of the
>>>> >> >>> >>>>>>>>> project.
>>>> >> >>> >>>>>>>>> That's
>>>> >> >>> >>>>>>>>> a
>>>> >> >>> >>>>>>>>> technical
>>>> >> >>> >>>>>>>>> debate, but basically, the worst thing when you're
>>>> >> >>> >>>>>>>>> using a
>>>> >> >>> >>>>>>>>> piece
>>>> >> >>> >>>>>>>>> of
>>>> >> >>> >>>>>>>>> software
>>>> >> >>> >>>>>>>>> is that the developers constantly ask you to rewrite
>>>> >> >>> >>>>>>>>> your
>>>> >> >>> >>>>>>>>> app
>>>> >> >>> >>>>>>>>> to
>>>> >> >>> >>>>>>>>> update to a
>>>> >> >>> >>>>>>>>> new version (and thus benefit from bug fixes, etc). Cue
>>>> >> >>> >>>>>>>>> anyone
>>>> >> >>> >>>>>>>>> who's used
>>>> >> >>> >>>>>>>>> Protobuf, or Guava. The "let's get everyone to change
>>>> >> >>> >>>>>>>>> their
>>>> >> >>> >>>>>>>>> code
>>>> >> >>> >>>>>>>>> this
>>>> >> >>> >>>>>>>>> release" model works well within a single large
>>>> >> >>> >>>>>>>>> company, but
>>>> >> >>> >>>>>>>>> doesn't work
>>>> >> >>> >>>>>>>>> well for a community, which is why nearly all *very*
>>>> >> >>> >>>>>>>>> widely
>>>> >> >>> >>>>>>>>> used
>>>> >> >>> >>>>>>>>> programming
>>>> >> >>> >>>>>>>>> interfaces (I'm talking things like Java standard
>>>> >> >>> >>>>>>>>> library,
>>>> >> >>> >>>>>>>>> Windows
>>>> >> >>> >>>>>>>>> API, etc)
>>>> >> >>> >>>>>>>>> almost *never* break backwards compatibility. All this
>>>> >> >>> >>>>>>>>> is
>>>> >> >>> >>>>>>>>> done
>>>> >> >>> >>>>>>>>> within reason
>>>> >> >>> >>>>>>>>> though, e.g. we do change things in major releases
>>>> >> >>> >>>>>>>>> (2.x,
>>>> >> >>> >>>>>>>>> 3.x,
>>>> >> >>> >>>>>>>>> etc).
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>>
>>>> >> >>> >>>>>>>
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>> ---------------------------------------------------------------------
>>>> >> >>> >>>>>> To unsubscribe e-mail: [hidden email]
>>>> >> >>> >>>>>>
>>>> >> >>> >>>>>
>>>> >> >>> >>>>>
>>>> >> >>> >>>>>
>>>> >> >>> >>>>> --
>>>> >> >>> >>>>> Stavros Kontopoulos
>>>> >> >>> >>>>> Senior Software Engineer
>>>> >> >>> >>>>> Lightbend, Inc.
>>>> >> >>> >>>>> p:  +30 6977967274
>>>> >> >>> >>>>> e: [hidden email]
>>>> >> >>> >>>>>
>>>> >> >>> >>>>>
>>>> >> >>> >>>>
>>>> >> >>> >>>
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>>
>>>> >> >>
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >>
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Ryan Blue
>>>> > Software Engineer
>>>> > Netflix
>>>>
>>>>
>>>
>


---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
