On 17-May-2014 11:40 pm, "Mark Hamstra" <[email protected]> wrote:
>
> That is a past issue that we don't need to be re-opening now. The present
Huh? If we need to revisit based on changed circumstances, we must - the
scope of changes introduced in this release was definitely not anticipated
when the 1.0 vs 0.10 discussion happened.
If folks are worried about stability of core, it is a valid concern IMO.

Having said that, I am still ok with going to 1.0; but if a conversation
starts about the need for 1.0 vs going to 0.10, I want to hear more and
possibly allay the concerns and not try to muzzle the discussion.

Regards
Mridul

> issue, and what I am asking, is which pending bug fixes does anyone
> anticipate will require breaking the public API guaranteed in rc9
>
>
> On Sat, May 17, 2014 at 9:44 AM, Mridul Muralidharan <[email protected]> wrote:
>
> > We made incompatible API changes whose impact we don't know yet
> > completely: both from implementation and usage point of view.
> >
> > We had the option of getting real-world feedback from the user community if
> > we had gone to 0.10 but the Spark developers seemed to be in a hurry to get
> > to 1.0 - so I made my opinion known but left it to the wisdom of the larger
> > group of committers to decide ... I did not think it was critical enough to
> > do a binding -1 on.
> >
> > Regards
> > Mridul
> > On 17-May-2014 9:43 pm, "Mark Hamstra" <[email protected]> wrote:
> >
> > > Which of the unresolved bugs in spark-core do you think will require an
> > > API-breaking change to fix? If there are none of those, then we are still
> > > essentially on track for a 1.0.0 release.
> > >
> > > The number of contributions and pace of change now is quite high, but I
> > > don't think that waiting for the pace to slow before releasing 1.0 is
> > > viable. If Spark's short history is any guide to its near future, the pace
> > > will not slow by any significant amount for any noteworthy length of time,
> > > but rather will continue to increase.
> > > What we need to be aiming for, I think, is to have the great majority of
> > > those new contributions being made to MLlib, GraphX, SparkSQL and other
> > > areas of the code that we have clearly marked as not frozen in 1.x. I
> > > think we are already seeing that, but if I am just not recognizing
> > > breakage of our semantic versioning guarantee that will be forced on us
> > > by some pending changes, now would be a good time to set me straight.
> > >
> > >
> > > On Sat, May 17, 2014 at 4:26 AM, Mridul Muralidharan <[email protected]> wrote:
> > >
> > > > I had echoed similar sentiments a while back when there was a discussion
> > > > around 0.10 vs 1.0 ... I would have preferred 0.10 to stabilize the API
> > > > changes, add missing functionality, go through a hardening release before
> > > > 1.0
> > > >
> > > > But the community preferred a 1.0 :-)
> > > >
> > > > Regards,
> > > > Mridul
> > > >
> > > > On 17-May-2014 3:19 pm, "Sean Owen" <[email protected]> wrote:
> > > > >
> > > > > On this note, non-binding commentary:
> > > > >
> > > > > Releases happen in local minima of change, usually created by
> > > > > internally enforced code freeze. Spark is incredibly busy now due to
> > > > > external factors -- recently a TLP, recently discovered by a large new
> > > > > audience, ease of contribution enabled by GitHub. It's getting like
> > > > > the first year of mainstream battle-testing in a month. It's been very
> > > > > hard to freeze anything! I see a number of non-trivial issues being
> > > > > reported, and I don't think it has been possible to triage all of
> > > > > them, even.
> > > > >
> > > > > Given the high rate of change, my instinct would have been to release
> > > > > 0.10.0 now. But won't it always be very busy? I do think the rate of
> > > > > significant issues will slow down.
> > > > >
> > > > > Version ain't nothing but a number, but if it has any meaning it's the
> > > > > semantic versioning meaning. 1.0 imposes extra handicaps around
> > > > > striving to maintain backwards-compatibility. That may end up being
> > > > > bent to fit in important changes that are going to be required in this
> > > > > continuing period of change. Hadoop does this all the time
> > > > > unfortunately and gets away with it, I suppose -- minor version
> > > > > releases are really major. (On the other extreme, HBase is at 0.98 and
> > > > > quite production-ready.)
> > > > >
> > > > > Just consider this a second vote for focus on fixes and 1.0.x rather
> > > > > than new features and 1.x. I think there are a few steps that could
> > > > > streamline triage of this flood of contributions, and make all of this
> > > > > easier, but that's for another thread.
> > > > >
> > > > >
> > > > > On Fri, May 16, 2014 at 8:50 PM, Mark Hamstra <[email protected]> wrote:
> > > > > > +1, but just barely. We've got quite a number of outstanding bugs
> > > > > > identified, and many of them have fixes in progress. I'd hate to see
> > > > > > those efforts get lost in a post-1.0.0 flood of new features targeted
> > > > > > at 1.1.0 -- in other words, I'd like to see 1.0.1 retain a high
> > > > > > priority relative to 1.1.0.
> > > > > >
> > > > > > Looking through the unresolved JIRAs, it doesn't look like any of the
> > > > > > identified bugs are show-stoppers or strictly regressions (although I
> > > > > > will note that one that I have in progress, SPARK-1749, is a bug that
> > > > > > we introduced with recent work -- it's not strictly a regression
> > > > > > because we had equally bad but different behavior when the
> > > > > > DAGScheduler exceptions weren't previously being handled at all vs.
> > > > > > being slightly mis-handled now), so I'm not currently seeing a reason
> > > > > > not to release.
