+1 for 1.0
The point of 1.0 is for us to self-enforce API compatibility in the
context of longer-term support. If we continue down the 0.xx road, we
will always have an excuse for breaking APIs. That said, a major focus
of 0.9, and some of the work happening for 1.0 (e.g. configuration,
Java 8 closure support, security), is better API compatibility support
in the 1.x releases.

While not perfect, Spark as-is is already more mature than many (ASF)
projects that are versioned 1.x, 2.x, or even 10.x. Software releases
are always a moving target. 1.0 doesn't mean it is "perfect" and
"final". The project will still evolve.

On Thu, Feb 6, 2014 at 11:54 AM, Evan Chan <[email protected]> wrote:
> +1 for 0.10.0.
>
> It would give more time to study things (such as the new SparkConf)
> and let the community decide if any breaking API changes are needed.
>
> Also, a +1 for minor revisions not breaking code compatibility,
> including Scala versions. (I guess this would mean that 1.x would
> stay on Scala 2.10.x)
>
> On Thu, Feb 6, 2014 at 11:05 AM, Sandy Ryza <[email protected]> wrote:
> > Bleh, hit send too early again. My second paragraph was to argue
> > for 1.0.0 instead of 0.10.0, not to hammer on the binary
> > compatibility point.
> >
> > On Thu, Feb 6, 2014 at 11:04 AM, Sandy Ryza <[email protected]> wrote:
> >
> >> *Would it make sense to put in something that strongly discourages
> >> binary incompatible changes when possible?
> >>
> >> On Thu, Feb 6, 2014 at 11:03 AM, Sandy Ryza <[email protected]> wrote:
> >>
> >>> Not codifying binary compatibility as a hard rule sounds fine to
> >>> me. Would it make sense to put something in that strongly
> >>> discourages binary incompatible changes when possible? I.e. avoid
> >>> making needless changes to class hierarchies.
> >>>
> >>> Whether Spark considers itself stable or not, users are beginning
> >>> to treat it so. A responsible project will acknowledge this and
> >>> provide the stability needed by its user base. I think some
> >>> projects have made the mistake of waiting too long to release a
> >>> 1.0.0. It allows them to put off making the hard decisions, but
> >>> users and downstream projects suffer.
> >>>
> >>> If Spark needs to go through dramatic changes, there's always the
> >>> option of a 2.0.0 that allows for this.
> >>>
> >>> -Sandy
> >>>
> >>> On Thu, Feb 6, 2014 at 10:56 AM, Matei Zaharia <[email protected]> wrote:
> >>>
> >>>> I think it's important to do 1.0 next. The project has been
> >>>> around for 4 years, and I'd be comfortable maintaining the
> >>>> current codebase for a long time in an API- and binary-compatible
> >>>> way through the 1.x releases. Over the past 4 years we haven't
> >>>> actually had major changes to the user-facing API -- the only
> >>>> ones were changing the package to org.apache.spark, and upgrading
> >>>> the Scala version. I'd be okay leaving 1.x to always use Scala
> >>>> 2.10, for example, or later cross-building it for Scala 2.11.
> >>>> Updating to 1.0 says two things: it tells users that they can be
> >>>> confident that version will be maintained for a long time, which
> >>>> we absolutely want to do, and it lets outsiders see that the
> >>>> project is now fairly mature (for many people, pre-1.0 might
> >>>> still cause them not to try it). I think both are good for the
> >>>> community.
> >>>>
> >>>> Regarding binary compatibility, I agree that it's what we should
> >>>> strive for, but it just seems premature to codify now. Let's see
> >>>> how it works between, say, 1.0 and 1.1, and then we can codify
> >>>> it.
> >>>>
> >>>> Matei
> >>>>
> >>>> On Feb 6, 2014, at 10:43 AM, Henry Saputra <[email protected]> wrote:
> >>>>
> >>>> > Thanks Patrick for initiating the discussion about the next
> >>>> > road map for Apache Spark.
> >>>> >
> >>>> > I am +1 for 0.10.0 for the next version.
> >>>> >
> >>>> > It will give us as a community some time to digest the process
> >>>> > and the vision and make adjustments accordingly.
> >>>> >
> >>>> > Releasing a 1.0.0 is a huge milestone, and if we do need to
> >>>> > break APIs somehow or modify internal behavior dramatically,
> >>>> > we could take advantage of the 1.0.0 release as a good step to
> >>>> > do so.
> >>>> >
> >>>> > - Henry
> >>>> >
> >>>> > On Wed, Feb 5, 2014 at 9:52 PM, Andrew Ash <[email protected]> wrote:
> >>>> >> Agree on timeboxed releases as well.
> >>>> >>
> >>>> >> Is there a vision for where we want to be as a project before
> >>>> >> declaring the first 1.0 release? While we're in the 0.x days
> >>>> >> per semver we can break backcompat at will (though we try to
> >>>> >> avoid it where possible), and that luxury goes away with 1.x.
> >>>> >> I just don't want to release a 1.0 simply because it seems to
> >>>> >> follow after 0.9, rather than making an intentional decision
> >>>> >> that we're at the point where we can stand by the current
> >>>> >> APIs and binary compatibility for the next year or so of the
> >>>> >> major release.
> >>>> >>
> >>>> >> Until that decision is made as a group, I'd rather we do an
> >>>> >> immediate version bump to 0.10.0-SNAPSHOT and then, if
> >>>> >> discussion warrants it later, replace that with
> >>>> >> 1.0.0-SNAPSHOT. It's very easy to go from 0.10 to 1.0 but not
> >>>> >> the other way around.
> >>>> >>
> >>>> >> https://github.com/apache/incubator-spark/pull/542
> >>>> >>
> >>>> >> Cheers!
> >>>> >> Andrew
> >>>> >>
> >>>> >> On Wed, Feb 5, 2014 at 9:49 PM, Heiko Braun <[email protected]> wrote:
> >>>> >>
> >>>> >>> +1 on time-boxed releases and compatibility guidelines
> >>>> >>>
> >>>> >>>> On 06.02.2014 at 01:20, Patrick Wendell <[email protected]> wrote:
> >>>> >>>>
> >>>> >>>> Hi Everyone,
> >>>> >>>>
> >>>> >>>> In an effort to coordinate development amongst the growing
> >>>> >>>> list of Spark contributors, I've taken some time to write
> >>>> >>>> up a proposal to formalize various pieces of the
> >>>> >>>> development process. The next release of Spark will likely
> >>>> >>>> be Spark 1.0.0, so this message is intended in part to
> >>>> >>>> coordinate the release plan for 1.0.0 and future releases.
> >>>> >>>> I'll post this on the wiki after discussing it on this
> >>>> >>>> thread as tentative project guidelines.
> >>>> >>>>
> >>>> >>>> == Spark Release Structure ==
> >>>> >>>> Starting with Spark 1.0.0, the Spark project will follow
> >>>> >>>> the semantic versioning guidelines (http://semver.org/)
> >>>> >>>> with a few deviations. These small differences account for
> >>>> >>>> Spark's nature as a multi-module project.
> >>>> >>>>
> >>>> >>>> Each Spark release will be versioned:
> >>>> >>>> [MAJOR].[MINOR].[MAINTENANCE]
> >>>> >>>>
> >>>> >>>> All releases with the same major version number will have
> >>>> >>>> API compatibility, defined as [1]. Major version numbers
> >>>> >>>> will remain stable over long periods of time. For instance,
> >>>> >>>> 1.X.Y may last 1 year or more.
> >>>> >>>>
> >>>> >>>> Minor releases will typically contain new features and
> >>>> >>>> improvements. The target frequency for minor releases is
> >>>> >>>> every 3-4 months. One change we'd like to make is to
> >>>> >>>> announce fixed release dates and merge windows for each
> >>>> >>>> release, to facilitate coordination. Each minor release
> >>>> >>>> will have a merge window where new patches can be merged, a
> >>>> >>>> QA window when only fixes can be merged, then a final
> >>>> >>>> period where voting occurs on release candidates. These
> >>>> >>>> windows will be announced immediately after the previous
> >>>> >>>> minor release to give people plenty of time, and over time,
> >>>> >>>> we might make the whole release process more regular
> >>>> >>>> (similar to Ubuntu). At the bottom of this document is an
> >>>> >>>> example window for the 1.0.0 release.
> >>>> >>>>
> >>>> >>>> Maintenance releases will occur more frequently and depend
> >>>> >>>> on the specific patches introduced (e.g. bug fixes) and
> >>>> >>>> their urgency. In general these releases are designed to
> >>>> >>>> patch bugs. However, higher-level libraries may introduce
> >>>> >>>> small features, such as a new algorithm, provided they are
> >>>> >>>> entirely additive and isolated from existing code paths.
> >>>> >>>> Spark core may not introduce any features.
> >>>> >>>>
> >>>> >>>> When new components are added to Spark, they may initially
> >>>> >>>> be marked as "alpha". Alpha components do not have to abide
> >>>> >>>> by the above guidelines; however, they should try to, to
> >>>> >>>> the maximum extent possible. Once they are marked "stable"
> >>>> >>>> they have to follow these guidelines. At present, GraphX is
> >>>> >>>> the only alpha component of Spark.
> >>>> >>>>
> >>>> >>>> [1] API compatibility:
> >>>> >>>>
> >>>> >>>> An API is any public class or interface exposed in Spark
> >>>> >>>> that is not marked as semi-private or experimental. Release
> >>>> >>>> A is API compatible with release B if code compiled against
> >>>> >>>> release A *compiles cleanly* against B. This does not
> >>>> >>>> guarantee that a compiled application that is linked
> >>>> >>>> against version A will link cleanly against version B
> >>>> >>>> without re-compiling. Link-level compatibility is something
> >>>> >>>> we'll try to guarantee as well, and we might make it a
> >>>> >>>> requirement in the future, but challenges with things like
> >>>> >>>> Scala versions have made this difficult to guarantee in the
> >>>> >>>> past.
> >>>> >>>>
> >>>> >>>> == Merging Pull Requests ==
> >>>> >>>> To merge pull requests, committers are encouraged to use
> >>>> >>>> this tool [2] to collapse the request into one commit
> >>>> >>>> rather than manually performing git merges. It will also
> >>>> >>>> format the commit message nicely in a way that can be
> >>>> >>>> easily parsed later when writing credits. Currently it is
> >>>> >>>> maintained in a public utility repository, but we'll merge
> >>>> >>>> it into mainline Spark soon.
> >>>> >>>>
> >>>> >>>> [2] https://github.com/pwendell/spark-utils/blob/master/apache_pr_merge.py
> >>>> >>>>
> >>>> >>>> == Tentative Release Window for 1.0.0 ==
> >>>> >>>> Feb 1st - April 1st: General development
> >>>> >>>> April 1st: Code freeze for new features
> >>>> >>>> April 15th: RC1
> >>>> >>>>
> >>>> >>>> == Deviations ==
> >>>> >>>> For now, the proposal is to consider these tentative
> >>>> >>>> guidelines. We can vote to formalize these as project rules
> >>>> >>>> at a later time, after some experience working with them.
> >>>> >>>> Once formalized, any deviation from these guidelines will
> >>>> >>>> be subject to a lazy majority vote.
> >>>> >>>>
> >>>> >>>> - Patrick
>
> --
> Evan Chan
> Staff Engineer
> [email protected] |
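[Editor's note] The versioning rules in Patrick's proposal can be sketched in a few lines of code. This is an illustrative sketch only; the function names are mine, not part of any Spark tooling. It encodes the [MAJOR].[MINOR].[MAINTENANCE] format and the proposed rule that any two releases sharing a major version number are API-compatible (which, per [1], promises clean re-compilation, not link-level compatibility):

```python
def parse_version(version):
    """Split a '[MAJOR].[MINOR].[MAINTENANCE]' string into a tuple of ints,
    e.g. '1.0.0' -> (1, 0, 0)."""
    major, minor, maintenance = (int(part) for part in version.split("."))
    return major, minor, maintenance


def api_compatible(release_a, release_b):
    """Per the proposal: code compiled against release_a should compile
    cleanly against release_b whenever the major versions match."""
    return parse_version(release_a)[0] == parse_version(release_b)[0]


# A minor bump (new features) keeps API compatibility; a major bump may break it.
print(api_compatible("1.0.0", "1.3.1"))  # True: same major version
print(api_compatible("1.3.1", "2.0.0"))  # False: major version changed
```

Under this scheme, comparing only the major component is the whole compatibility check; minor and maintenance numbers matter for feature content and release cadence, not for the API promise.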
