+1 for 0.10.0. It would give more time to study things (such as the new SparkConf) and let the community decide if any breaking API changes are needed.
Also, a +1 for minor revisions not breaking code compatibility, including Scala versions. (I guess this would mean that 1.x would stay on Scala 2.10.x.)

On Thu, Feb 6, 2014 at 11:05 AM, Sandy Ryza <[email protected]> wrote:

> Bleh, hit send too early again. My second paragraph was to argue for 1.0.0 instead of 0.10.0, not to hammer on the binary compatibility point.
>
> On Thu, Feb 6, 2014 at 11:04 AM, Sandy Ryza <[email protected]> wrote:
>
>> *Would it make sense to put in something that strongly discourages binary incompatible changes when possible?
>>
>> On Thu, Feb 6, 2014 at 11:03 AM, Sandy Ryza <[email protected]> wrote:
>>
>>> Not codifying binary compatibility as a hard rule sounds fine to me. Would it make sense to put something in that strongly discourages binary incompatible changes when possible? I.e. avoid making needless changes to class hierarchies.
>>>
>>> Whether Spark considers itself stable or not, users are beginning to treat it so. A responsible project will acknowledge this and provide the stability needed by its user base. I think some projects have made the mistake of waiting too long to release a 1.0.0. It allows them to put off making the hard decisions, but users and downstream projects suffer.
>>>
>>> If Spark needs to go through dramatic changes, there's always the option of a 2.0.0 that allows for this.
>>>
>>> -Sandy
>>>
>>> On Thu, Feb 6, 2014 at 10:56 AM, Matei Zaharia <[email protected]> wrote:
>>>
>>>> I think it's important to do 1.0 next. The project has been around for 4 years, and I'd be comfortable maintaining the current codebase for a long time in an API and binary compatible way through 1.x releases. Over the past 4 years we haven't actually had major changes to the user-facing API -- the only ones were changing the package to org.apache.spark, and upgrading the Scala version. I'd be okay leaving 1.x to always use Scala 2.10, for example, or later cross-building it for Scala 2.11. Updating to 1.0 says two things: it tells users that they can be confident that version will be maintained for a long time, which we absolutely want to do, and it lets outsiders see that the project is now fairly mature (for many people, pre-1.0 might still cause them not to try it). I think both are good for the community.
>>>>
>>>> Regarding binary compatibility, I agree that it's what we should strive for, but it just seems premature to codify now. Let's see how it works between, say, 1.0 and 1.1, and then we can codify it.
>>>>
>>>> Matei
>>>>
>>>> On Feb 6, 2014, at 10:43 AM, Henry Saputra <[email protected]> wrote:
>>>>
>>>> > Thanks Patrick for initiating the discussion about the next road map for Apache Spark.
>>>> >
>>>> > I am +1 for 0.10.0 for the next version.
>>>> >
>>>> > It will give us as a community some time to digest the process and the vision and make adjustments accordingly.
>>>> >
>>>> > Releasing 1.0.0 is a huge milestone, and if we do need to break the API somehow or modify internal behavior dramatically, we could take advantage of a 1.0.0 release as a good step toward that.
>>>> >
>>>> > - Henry
>>>> >
>>>> > On Wed, Feb 5, 2014 at 9:52 PM, Andrew Ash <[email protected]> wrote:
>>>> >> Agree on timeboxed releases as well.
>>>> >>
>>>> >> Is there a vision for where we want to be as a project before declaring the first 1.0 release?
>>>> >> While we're in the 0.x days per semver we can break backcompat at will (though we try to avoid it where possible), and that luxury goes away with 1.x. I just don't want to release a 1.0 simply because it seems to follow after 0.9, rather than making an intentional decision that we're at the point where we can stand by the current APIs and binary compatibility for the next year or so of the major release.
>>>> >>
>>>> >> Until that decision is made as a group, I'd rather we do an immediate version bump to 0.10.0-SNAPSHOT and then, if discussion warrants it later, replace that with 1.0.0-SNAPSHOT. It's very easy to go from 0.10 to 1.0, but not the other way around.
>>>> >>
>>>> >> https://github.com/apache/incubator-spark/pull/542
>>>> >>
>>>> >> Cheers!
>>>> >> Andrew
>>>> >>
>>>> >> On Wed, Feb 5, 2014 at 9:49 PM, Heiko Braun <[email protected]> wrote:
>>>> >>
>>>> >>> +1 on time boxed releases and compatibility guidelines
>>>> >>>
>>>> >>>> On 06.02.2014, at 01:20, Patrick Wendell <[email protected]> wrote:
>>>> >>>>
>>>> >>>> Hi Everyone,
>>>> >>>>
>>>> >>>> In an effort to coordinate development amongst the growing list of Spark contributors, I've taken some time to write up a proposal to formalize various pieces of the development process. The next release of Spark will likely be Spark 1.0.0, so this message is intended in part to coordinate the release plan for 1.0.0 and future releases. I'll post this on the wiki after discussing it on this thread as tentative project guidelines.
>>>> >>>>
>>>> >>>> == Spark Release Structure ==
>>>> >>>> Starting with Spark 1.0.0, the Spark project will follow the semantic versioning guidelines (http://semver.org/) with a few deviations. These small differences account for Spark's nature as a multi-module project.
>>>> >>>>
>>>> >>>> Each Spark release will be versioned:
>>>> >>>> [MAJOR].[MINOR].[MAINTENANCE]
>>>> >>>>
>>>> >>>> All releases with the same major version number will have API compatibility, as defined in [1]. Major version numbers will remain stable over long periods of time. For instance, 1.X.Y may last 1 year or more.
>>>> >>>>
>>>> >>>> Minor releases will typically contain new features and improvements. The target frequency for minor releases is every 3-4 months. One change we'd like to make is to announce fixed release dates and merge windows for each release, to facilitate coordination. Each minor release will have a merge window where new patches can be merged, a QA window when only fixes can be merged, and then a final period where voting occurs on release candidates. These windows will be announced immediately after the previous minor release to give people plenty of time, and over time we might make the whole release process more regular (similar to Ubuntu). At the bottom of this document is an example window for the 1.0.0 release.
>>>> >>>>
>>>> >>>> Maintenance releases will occur more frequently and depend on the specific patches introduced (e.g. bug fixes) and their urgency. In general these releases are designed to patch bugs.
>>>> >>>> However, higher-level libraries may introduce small features, such as a new algorithm, provided they are entirely additive and isolated from existing code paths. Spark core may not introduce any features.
>>>> >>>>
>>>> >>>> When new components are added to Spark, they may initially be marked as "alpha". Alpha components do not have to abide by the above guidelines; however, to the maximum extent possible, they should try to. Once they are marked "stable" they have to follow these guidelines. At present, GraphX is the only alpha component of Spark.
>>>> >>>>
>>>> >>>> [1] API compatibility:
>>>> >>>>
>>>> >>>> An API is any public class or interface exposed in Spark that is not marked as semi-private or experimental. Release A is API compatible with release B if code compiled against release A *compiles cleanly* against B. This does not guarantee that a compiled application that is linked against version A will link cleanly against version B without re-compiling. Link-level compatibility is something we'll try to guarantee as well, and we might make it a requirement in the future, but challenges with things like Scala versions have made this difficult to guarantee in the past.
>>>> >>>>
>>>> >>>> == Merging Pull Requests ==
>>>> >>>> To merge pull requests, committers are encouraged to use this tool [2] to collapse the request into one commit rather than manually performing git merges. It will also format the commit message nicely in a way that can be easily parsed later when writing credits. Currently it is maintained in a public utility repository, but we'll merge it into mainline Spark soon.
>>>> >>>>
>>>> >>>> [2] https://github.com/pwendell/spark-utils/blob/master/apache_pr_merge.py
>>>> >>>>
>>>> >>>> == Tentative Release Window for 1.0.0 ==
>>>> >>>> Feb 1st - April 1st: General development
>>>> >>>> April 1st: Code freeze for new features
>>>> >>>> April 15th: RC1
>>>> >>>>
>>>> >>>> == Deviations ==
>>>> >>>> For now, the proposal is to consider these tentative guidelines. We can vote to formalize these as project rules at a later time, after some experience working with them. Once formalized, any deviation from these guidelines will be subject to a lazy majority vote.
>>>> >>>>
>>>> >>>> - Patrick

--
Evan Chan
Staff Engineer
[email protected] |
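
[Editor's note: to make the compile-versus-link distinction in definition [1] above concrete, here is a minimal Scala sketch. The Dataset class and persist method are hypothetical names for illustration, not actual Spark API.]

    // Sketch only: a hypothetical library class, not real Spark code.
    // --- Release A of the library looked like this: ---
    //   class Dataset { def persist(): Dataset = this }
    // --- Release B replaces the zero-arg method with a defaulted one: ---
    class Dataset {
      def persist(level: String = "MEMORY_ONLY"): Dataset = this
    }

    object App {
      def main(args: Array[String]): Unit = {
        val ds = new Dataset
        // This call compiles cleanly against both A and B, so the two
        // releases are API compatible in the sense defined in [1].
        ds.persist()
        // But an App.class compiled against release A invokes the bytecode
        // symbol persist() with no arguments, which no longer exists in
        // release B, so running the old jar against B fails with
        // NoSuchMethodError -- the link-level incompatibility the proposal
        // does not (yet) promise to prevent.
      }
    }

Default arguments are just one way to produce this compiles-but-won't-link pattern; adding methods to a trait or reshuffling class hierarchies can do the same, which is presumably why the proposal stops short of codifying link-level compatibility.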
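[Editor's note: as a small illustration of what the [MAJOR].[MINOR].[MAINTENANCE] scheme would mean for users, an application could pin its build to the 1.x line roughly as below. This is a hypothetical sbt snippet; the coordinates assume the org.apache.spark group and a Scala 2.10 cross-build remain in place for 1.x, as Matei suggests.]

    // build.sbt -- sketch under the assumptions above
    scalaVersion := "2.10.3"

    // Per the rule that releases sharing a major version stay API
    // compatible, code written against 1.0.0 should keep compiling
    // against any later 1.x.y release.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"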
