Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-12 Thread Jungtaek Lim
Xiao, thanks for the proposal and willingness to lead the effort! I feel that it's still a bit different from what I've proposed. What I'm proposing is closer to enforcing discussion if a change proposes a new public API or brings a breaking change. It's good that we add the section "Does this PR

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-08 Thread Dongjoon Hyun
Thank you all, especially for the audit efforts. Until now, the whole community has been working together in the same direction under the existing policy, which is always good. Since it seems that we are considering a new direction, I created an umbrella JIRA to track all activities.

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-07 Thread Takeshi Yamamuro
Yea, +1 on Jungtaek's suggestion; having the same strict policy for adding new APIs looks nice. > When we make API changes (e.g., adding new APIs or changing existing APIs), we should regularly publish them on the dev list. I am willing to lead this effort, work with my colleagues

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-07 Thread Xiao Li
I want to publicly thank *Ruifeng Zheng* for his work listing all the signature differences in Core, SQL and Hive that we made in this upcoming release. For details, please read the files attached in SPARK-30982. I went over these files and

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-07 Thread Jungtaek Lim
+1 for Sean as well. Moreover, as I voiced on the previous thread, if we want to be strict about retaining public APIs, what we really need to do along with this is to have a similarly strict or stricter policy for adding public APIs. If we don't apply the policy symmetrically, problems would go

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-07 Thread Dongjoon Hyun
+1 for Sean's concerns and questions. Bests, Dongjoon. On Fri, Mar 6, 2020 at 3:14 PM Sean Owen wrote: > This thread established some good general principles, illustrated by a few > good examples. It didn't draw specific conclusions about what to add back, > which is why it wasn't at all

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-06 Thread Sean Owen
This thread established some good general principles, illustrated by a few good examples. It didn't draw specific conclusions about what to add back, which is why it wasn't at all controversial. What it means in specific cases is where there may be disagreement, and that harder question hasn't

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-06 Thread Dongjoon Hyun
Hi, All. Recently, reverting PRs seems to have started spreading like the *well-known* virus. Can we finalize this first before making unofficial personal decisions? Technically, this thread was not a vote and our website doesn't have a clear policy yet. https://github.com/apache/spark/pull/27821

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-03-05 Thread Dongjoon Hyun
Hi, All. There is an ongoing PR from Xiao referencing this email. https://github.com/apache/spark/pull/27821 Bests, Dongjoon. On Fri, Feb 28, 2020 at 11:20 AM Sean Owen wrote: > On Fri, Feb 28, 2020 at 12:03 PM Holden Karau > wrote: > >> 1. Could you estimate how many revert commits are

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-28 Thread Sean Owen
On Fri, Feb 28, 2020 at 12:03 PM Holden Karau wrote: >> 1. Could you estimate how many revert commits are required in >> `branch-3.0` for new rubric? Fair question about what actual change this implies for 3.0. So far it seems like some targeted, quite reasonable reverts. I don't think

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-28 Thread Holden Karau
On Fri, Feb 28, 2020 at 9:48 AM Dongjoon Hyun wrote: > Hi, Matei and Michael. > > I'm also a big supporter for policy-based project management. > > Before going further, > > 1. Could you estimate how many revert commits are required in > `branch-3.0` for new rubric? > 2. Are you going to

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-28 Thread Dongjoon Hyun
Hi, Matei and Michael. I'm also a big supporter for policy-based project management. Before going further, 1. Could you estimate how many revert commits are required in `branch-3.0` for new rubric? 2. Are you going to revert all removed test cases for the deprecated ones? 3. Does it

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-27 Thread Matei Zaharia
+1 on this new rubric. It definitely captures the issues I’ve seen in Spark and in other projects. If we write down this rubric (or something like it), it will also be easier to refer to it during code reviews or in proposals of new APIs (we could ask “do you expect to have to change this API

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-27 Thread Michael Armbrust
Thanks for the discussion! A few responses: The decision needs to happen at api/config change time, otherwise the > deprecated warning has no purpose if we are never going to remove them. > Even if we never remove an API, I think deprecation warnings (when done right) can still serve a purpose.

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-27 Thread Tom Graves
In general +1 I think these are good guidelines and making it easier to upgrade is beneficial to everyone. The decision needs to happen at api/config change time, otherwise the deprecated warning has no purpose if we are never going to remove them. That said we still need to be able to remove

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-27 Thread Sean Owen
Those are all quite reasonable guidelines and I'd put them into the contributing or developer guide, sure. Although not argued here, I think we should go further than codifying and enforcing common-sense guidelines like these. I think bias should shift in favor of retaining APIs going forward, and

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-26 Thread Jules Damji
+1 Well said! Sent from my iPhone Pardon the dumb thumb typos :) > On Feb 24, 2020, at 3:03 PM, Michael Armbrust wrote: > >  > Hello Everyone, > > As more users have started upgrading to Spark 3.0 preview (including myself), > there have been many discussions around APIs that have been

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-26 Thread Michel Miotto Barbosa
+1 *_* *Michel Miotto Barbosa*, *Data Science/Software Engineer* Learn MBA Global Financial Broker at IBMEC SP, Learn Economic Science at PUC SP MBA in Project Management, Graduate in Software Engineering phone: +55 11 984 342 347,

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-26 Thread John Zhuge
Well written, Michael! Believe it or not, I read through the entire email, very rare for emails of such length. Happy to see healthy discussions on this tough subject. Definitely need perspectives from both the users and the contributors. On Tue, Feb 25, 2020 at 9:09 PM Xiao Li wrote: > +1 >

Re: [Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-25 Thread Xiao Li
+1 Xiao On Mon, Feb 24, 2020 at 3:03 PM Michael Armbrust wrote: > Hello Everyone, > > As more users have started upgrading to Spark 3.0 preview (including > myself), there have been many discussions around APIs that have been broken > compared with Spark 2.x. In many of these discussions, one of the >

[Proposal] Modification to Spark's Semantic Versioning Policy

2020-02-24 Thread Michael Armbrust
Hello Everyone, As more users have started upgrading to Spark 3.0 preview (including myself), there have been many discussions around APIs that have been broken compared with Spark 2.x. In many of these discussions, one of the rationales for breaking an API seems to be "Spark follows semantic