Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Yi Wu
+1 to continue the incomplete push-based shuffle.
-- Yi
On Fri, Feb 26, 2021 at 1:26 AM Mridul Muralidharan wrote:
> Nit: Java 17 -> should be available by Sept 2021 :-)
> Adoption would also depend on some of our nontrivial dependencies
> supporting it - it might be a stretch to get it

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Cheng Su
Hi,
Just want to share what I am working on for 3.2, in case it matters.
* Shuffled hash join improvement (SPARK-32461)
  * This is one of the release-notes JIRAs in 3.1; the major things left are sort-based fallback and code-gen for FULL OUTER join.
* Join and aggregation code-gen

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Dongjoon Hyun
Sure, thank you, Hyukjin.
Bests,
Dongjoon.
On Fri, Feb 26, 2021 at 4:01 PM Hyukjin Kwon wrote:
> I have an idea which I'll send an email to discuss next week or the week
> after next. I did not have enough bandwidth to drive both together at
> the same time. I would appreciate it if we had

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Hyukjin Kwon
I have an idea which I'll send an email to discuss next week or the week after next. I did not have enough bandwidth to drive both together at the same time. I would appreciate it if we had some more time for 3.2. In addition, it would also be great if we follow the schedule and catch potential

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Dongjoon Hyun
Thank you for sharing your plan, Huaxin!
Bests,
Dongjoon.
On Fri, Feb 26, 2021 at 12:20 PM huaxin gao wrote:
> Thanks Dongjoon and Xiao for the discussion. I would like to add Data
> Source V2 Aggregate push down to the list. I am currently working on
> JDBC Data Source V2 Aggregate push

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Dongjoon Hyun
On Fri, Feb 26, 2021 at 11:13 AM Xiao Li wrote:
> Do we have enough features in the current master branch?
Hi, Xiao. Is this a question about Sean's previous comment, `There is already some good stuff in 3.2 and will be a good minor release in 5-6 months.`?
On Thu, Feb 25, 2021 at 9:33 AM Sean

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread huaxin gao
Thanks Dongjoon and Xiao for the discussion. I would like to add Data Source V2 Aggregate push down to the list. I am currently working on JDBC Data Source V2 Aggregate push down, but the common code can be used for the file-based V2 Data Source as well. For example, MAX and MIN can be pushed down

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Xiao Li
Thank you, Dongjoon, for initiating this discussion. Let us keep it open. It might take 1-2 weeks to collect from the community all the features we plan to build and ship in 3.2 since we just finished the 3.1 voting.
> 3. +100 for Apache Spark 3.2.0 in July 2021. Maybe, we need `branch-cut`
> in

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Dongjoon Hyun
Thank you, Mridul and Sean.
1. Yes, `2017` was a typo. Java 17 is scheduled for September 2021. And, of course, it's a nice-to-have status. :)
2. `Push based shuffle and disaggregated shuffle`. Definitely. Thanks for sharing.
3. +100 for Apache Spark 3.2.0 in July 2021. Maybe, we need `branch-cut`

[VOTE][RESULT] Release Spark 3.1.1 (RC3)

2021-02-26 Thread Hyukjin Kwon
The vote passes with 15 +1s (6 binding +1s).
(* = binding)
+1
- Hyukjin Kwon *
- Jungtaek Lim
- Herman van Hovell *
- Sean Owen *
- Yuming Wang
- Gengliang Wang
- John Zhuge
- Takeshi Yamamuro
- Cheng Su
- Maxim Gekk
- Gabor Somogyi
- Dongjoon Hyun *
- Terry Kim
- Mridul Muralidharan *
- Xiao Li