Ryan's proposal makes a lot of sense. It's better not to release half-baked changes in 2.4 that not only break a lot of the APIs released in 2.3, but are also expected to change further due to redesigns before 3.0, so I don't see much value in releasing them in 2.4.
On Sun, 9 Sep 2018 at 22:42, Wenchen Fan <cloud0...@gmail.com> wrote:

> Strictly speaking, data source v2 is always half-finished until we mark it
> as stable. We need some small milestones to move forward step by step.
>
> The redesign also happens in an incremental way. SPARK-24882 mostly focus
> on the "RDD" part of the API: the separation of reader factory and input
> partitions, the introduction of ScanConfig, etc. Then we focus on the
> high-level abstraction and want to change the "table" part of the API.
>
> In my understanding, each PR should be self-contained. If we are OK to
> have SPARK-24882 in master as an individual commit, I think it's also OK to
> have it in branch 2.4.
>
> I've created https://issues.apache.org/jira/browse/SPARK-25390 to track
> the new abstraction. It doesn't change the API a lot, but update the
> streaming execution engine quite a bit.
>
> Thanks,
> Wenchen
>
> On Mon, Sep 10, 2018 at 4:20 AM Ryan Blue <rb...@netflix.com> wrote:
>
>> Wenchen, can you hold off on the first RC?
>>
>> The half-finished changes from the redesign of the DataSourceV2 API are
>> in master, added in SPARK-24882
>> <https://github.com/apache/spark/pull/22009>, and are now in the 2.4
>> branch. We've had a lot of good discussion since that PR was merged to
>> update and fix the design, plus only one of the follow-ups on SPARK-25186
>> <https://issues.apache.org/jira/browse/SPARK-25186> is done. Clearly,
>> the redesign was too large to get into 2.4 in so little time -- it was
>> proposed about 10 days before the original branch date -- and I don't think
>> it is a good idea to release half-finished major changes.
>>
>> The easiest solution is to revert SPARK-24882 in the release branch. That
>> way we have minor changes in 2.4 and major changes in the next release,
>> instead of major changes in both. What does everyone think?
>>
>> rb
>>
>> On Fri, Sep 7, 2018 at 10:37 AM shane knapp <skn...@berkeley.edu> wrote:
>>
>>> ++joshrosen (thanks for the help w/deploying the jenkins configs)
>>>
>>> the basic 2.4 builds are deployed and building!
>>>
>>> i haven't created (a) build(s) yet for scala 2.12... i'll be
>>> coordinating this w/the databricks folks next week.
>>>
>>> On Fri, Sep 7, 2018 at 9:53 AM, Dongjoon Hyun <dongjoon.h...@gmail.com>
>>> wrote:
>>>
>>>> Thank you, Shane! :D
>>>>
>>>> Bests,
>>>> Dongjoon.
>>>>
>>>> On Fri, Sep 7, 2018 at 9:51 AM shane knapp <skn...@berkeley.edu> wrote:
>>>>
>>>>> i'll try and get to the 2.4 branch stuff today...
>>>
>>> --
>>> Shane Knapp
>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>> https://rise.cs.berkeley.edu
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix