Thanks for all this Patrick.

I like Heiko's proposal that requires every pull request to reference a
JIRA.  This is how things are done in Hadoop, and it makes it much easier
to, for example, find out whether an issue you came across when googling
an error has been fixed in a given release.

I agree with Mridul about binary compatibility.  It can be a dealbreaker
for organizations that are considering an upgrade.  The two causes of
binary incompatibility I'm aware of are Scala version upgrades and
changes to inheritance structure.  Are these not avoidable, at least for
minor releases?
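
For example (a hypothetical sketch, not actual Spark code): with the
trait encoding in the Scala 2.10 era, even adding a concrete method to
an already-published trait is source compatible but binary incompatible:

    // v1.0
    trait Listener {
      def onStart(): Unit
    }

    // v1.1: a source-compatible addition
    trait Listener {
      def onStart(): Unit
      def onStop(): Unit = {}  // new concrete method with a default body
    }

    // compiled against v1.0
    class MyListener extends Listener {
      def onStart(): Unit = println("started")
    }

    // Calling (new MyListener).onStop() with the old MyListener.class on
    // the classpath against a v1.1 jar throws AbstractMethodError, since
    // the old bytecode never generated a forwarder for onStop().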

-Sandy




On Thu, Feb 6, 2014 at 12:49 AM, Mridul Muralidharan <[email protected]> wrote:

> The reason I explicitly mentioned binary compatibility was
> because it was sort of hand-waved in the proposal as good to have.
> My understanding is that Scala does make it painful to ensure binary
> compatibility - but stability of interfaces is vital to ensure
> dependable platforms.
> Recompilation might be a viable option for developers - not for users.
>
> Regards,
> Mridul
>
>
> On Thu, Feb 6, 2014 at 12:08 PM, Patrick Wendell <[email protected]>
> wrote:
> > If people feel that merging the intermediate SNAPSHOT number is
> > significant, let's just defer merging that until this discussion
> > concludes.
> >
> > That said - the decision to settle on 1.0 for the next release is not
> > just because it happens to come after 0.9. It's a conscious
> > decision based on the development of the project to this point. A
> > major focus of the 0.9 release was tying off loose ends in terms of
> > backwards compatibility (e.g. spark configuration). There was some
> > discussion back then of maybe cutting a 1.0 release but the decision
> > was deferred until after 0.9.
> >
> > @mridul - please see the original post for discussion about binary
> compatibility.
> >
> > On Wed, Feb 5, 2014 at 10:20 PM, Andy Konwinski <[email protected]>
> wrote:
> >> +1 for 0.10.0 now with the option to switch to 1.0.0 after further
> >> discussion.
> >> On Feb 5, 2014 9:53 PM, "Andrew Ash" <[email protected]> wrote:
> >>
> >>> Agree on timeboxed releases as well.
> >>>
> >>> Is there a vision for where we want to be as a project before
> declaring the
> >>> first 1.0 release?  While we're in the 0.x days per semver we can break
> >>> backcompat at will (though we try to avoid it where possible), and that
> >>> luxury goes away with 1.x. I just don't want to release a 1.0 simply
> >>> because it seems to follow after 0.9 rather than making an intentional
> >>> decision that we're at the point where we can stand by the current
> APIs and
> >>> binary compatibility for the next year or so of the major release.
> >>>
> >>> Until that decision is made as a group I'd rather we do an immediate
> >>> version bump to 0.10.0-SNAPSHOT and then if discussion warrants it
> later,
> >>> replace that with 1.0.0-SNAPSHOT.  It's very easy to go from 0.10 to
> 1.0
> >>> but not the other way around.
> >>>
> >>> https://github.com/apache/incubator-spark/pull/542
> >>>
> >>> Cheers!
> >>> Andrew
> >>>
> >>>
> >>> On Wed, Feb 5, 2014 at 9:49 PM, Heiko Braun <[email protected]
> >>> > wrote:
> >>>
> >>> > +1 on time boxed releases and compatibility guidelines
> >>> >
> >>> >
> >>> > > On 06.02.2014 at 01:20, Patrick Wendell <[email protected]
> > wrote:
> >>> > >
> >>> > > Hi Everyone,
> >>> > >
> >>> > > In an effort to coordinate development amongst the growing list of
> >>> > > Spark contributors, I've taken some time to write up a proposal to
> >>> > > formalize various pieces of the development process. The next
> release
> >>> > > of Spark will likely be Spark 1.0.0, so this message is intended in
> >>> > > part to coordinate the release plan for 1.0.0 and future releases.
> >>> > > I'll post this on the wiki after discussing it on this thread as
> >>> > > tentative project guidelines.
> >>> > >
> >>> > > == Spark Release Structure ==
> >>> > > Starting with Spark 1.0.0, the Spark project will follow the
> semantic
> >>> > > versioning guidelines (http://semver.org/) with a few deviations.
> >>> > > These small differences account for Spark's nature as a
> multi-module
> >>> > > project.
> >>> > >
> >>> > > Each Spark release will be versioned:
> >>> > > [MAJOR].[MINOR].[MAINTENANCE]
> >>> > >
> >>> > > All releases with the same major version number will have API
> >>> > > compatibility, as defined in [1]. Major version numbers will remain
> >>> > > stable over long periods of time. For instance, 1.X.Y may last 1
> year
> >>> > > or more.
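> >>> > >
> >>> > > (A quick sketch of what this implies; illustrative only, not
> >>> > > code from Spark: two releases are API compatible exactly when
> >>> > > their MAJOR components match.)
> >>> > >
> >>> > >   case class SparkVersion(major: Int, minor: Int, maintenance: Int)
> >>> > >
> >>> > >   def parse(v: String): SparkVersion = {
> >>> > >     val Array(maj, min, maint) = v.split('.').map(_.toInt)
> >>> > >     SparkVersion(maj, min, maint)
> >>> > >   }
> >>> > >
> >>> > >   // parse("1.0.0").major == parse("1.3.2").major   // true: compatible
> >>> > >   // parse("1.9.0").major == parse("2.0.0").major   // false: no guarantee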
> >>> > >
> >>> > > Minor releases will typically contain new features and
> improvements.
> >>> > > The target frequency for minor releases is every 3-4 months. One
> >>> > > change we'd like to make is to announce fixed release dates and
> merge
> >>> > > windows for each release, to facilitate coordination. Each minor
> >>> > > release will have a merge window where new patches can be merged,
> a QA
> >>> > > window when only fixes can be merged, then a final period where
> voting
> >>> > > occurs on release candidates. These windows will be announced
> >>> > > immediately after the previous minor release to give people plenty
> of
> >>> > > time, and over time, we might make the whole release process more
> >>> > > regular (similar to Ubuntu). At the bottom of this document is an
> >>> > > example window for the 1.0.0 release.
> >>> > >
> >>> > > Maintenance releases will occur more frequently and depend on
> specific
> >>> > > patches introduced (e.g. bug fixes) and their urgency. In general
> >>> > > these releases are designed to patch bugs. However, higher level
> >>> > > libraries may introduce small features, such as a new algorithm,
> >>> > > provided they are entirely additive and isolated from existing code
> >>> > > paths. Spark core may not introduce any features.
> >>> > >
> >>> > > When new components are added to Spark, they may initially be
> marked
> >>> > > as "alpha". Alpha components do not have to abide by the above
> >>> > > guidelines, however, to the maximum extent possible, they should
> try
> >>> > > to. Once they are marked "stable" they have to follow these
> >>> > > guidelines. At present, GraphX is the only alpha component of
> Spark.
> >>> > >
> >>> > > [1] API compatibility:
> >>> > >
> >>> > > An API is any public class or interface exposed in Spark that is
> not
> >>> > > marked as semi-private or experimental. Release A is API compatible
> >>> > > with release B if code compiled against release A *compiles
> cleanly*
> >>> > > against B. This does not guarantee that a compiled application
> that is
> >>> > > linked against version A will link cleanly against version B
> without
> >>> > > re-compiling. Link-level compatibility is something we'll try to
> >>> > > guarantee as well, and we might make it a requirement in the
> >>> > > future, but challenges with things like Scala versions have made
> this
> >>> > > difficult to guarantee in the past.
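> >>> > >
> >>> > > To make the distinction concrete (an illustrative sketch, not
> >>> > > real Spark code): narrowing a method's return type is source
> >>> > > compatible but not link compatible on the JVM.
> >>> > >
> >>> > >   // Release A
> >>> > >   class SparkStats {
> >>> > >     def cachedRDDs: Seq[String] = Seq("a", "b")
> >>> > >   }
> >>> > >
> >>> > >   // Release B: return type narrowed to a subtype
> >>> > >   class SparkStats {
> >>> > >     def cachedRDDs: List[String] = List("a", "b")
> >>> > >   }
> >>> > >
> >>> > >   // A caller like `val rdds: Seq[String] = new SparkStats().cachedRDDs`
> >>> > >   // compiles cleanly against both A and B, but a jar compiled against
> >>> > >   // A throws NoSuchMethodError when run against B, because the JVM
> >>> > >   // method descriptor changed along with the return type.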
> >>> > >
> >>> > > == Merging Pull Requests ==
> >>> > > To merge pull requests, committers are encouraged to use this tool
> [2]
> >>> > > to collapse the request into one commit rather than manually
> >>> > > performing git merges. It will also format the commit message
> nicely
> >>> > > in a way that can be easily parsed later when writing credits.
> >>> > > Currently it is maintained in a public utility repository, but
> we'll
> >>> > > merge it into mainline Spark soon.
> >>> > >
> >>> > > [2]
> >>> >
> https://github.com/pwendell/spark-utils/blob/master/apache_pr_merge.py
> >>> > >
> >>> > > == Tentative Release Window for 1.0.0 ==
> >>> > > Feb 1st - April 1st: General development
> >>> > > April 1st: Code freeze for new features
> >>> > > April 15th: RC1
> >>> > >
> >>> > > == Deviations ==
> >>> > > For now, the proposal is to consider these tentative guidelines. We
> >>> > > can vote to formalize these as project rules at a later time after
> >>> > > some experience working with them. Once formalized, any deviation
> from
> >>> > > these guidelines will be subject to a lazy majority vote.
> >>> > >
> >>> > > - Patrick
> >>> >
> >>>
>
