Thank you all. BTW, Xiao and Mridul, I'm wondering what specific date you have in mind.
Usually, the `Christmas and New Year season` doesn't give us much additional time. If you think so, could you make a PR for the Apache Spark website according to your expectation?

https://spark.apache.org/versioning-policy.html

Bests,
Dongjoon.

On Sun, Oct 4, 2020 at 7:18 AM Mridul Muralidharan <mri...@gmail.com> wrote:

>
> +1 on pushing the branch cut for increased dev time to match previous
> releases.
>
> Regards,
> Mridul
>
> On Sat, Oct 3, 2020 at 10:22 PM Xiao Li <gatorsm...@gmail.com> wrote:
>
>> Thank you for your updates.
>>
>> Spark 3.0 got released on Jun 18, 2020. If Nov 1st is the target date of
>> the 3.1 branch cut, the feature development time window is less than 5
>> months. This is shorter than what we did in Spark 2.3 and 2.4 releases.
>>
>> Below are three highly desirable features I am watching. Hopefully,
>> we can finish them before the branch cut.
>>
>>    - Support push-based shuffle to improve shuffle efficiency:
>>      https://issues.apache.org/jira/browse/SPARK-30602
>>    - Unify create table syntax:
>>      https://issues.apache.org/jira/browse/SPARK-31257
>>    - Bloom filter join: https://issues.apache.org/jira/browse/SPARK-32268
>>
>> Thanks,
>>
>> Xiao
>>
>>
>> Hyukjin Kwon <gurwls...@gmail.com> wrote on Sat, Oct 3, 2020 at 5:41 PM:
>>
>>> Nice summary. Thanks Dongjoon. One minor correction -> I believe we
>>> dropped R 3.5 and below at branch 2.4 as well.
>>>
>>> On Sun, 4 Oct 2020, 09:17 Dongjoon Hyun, <dongjoon.h...@gmail.com>
>>> wrote:
>>>
>>>> Hi, All.
>>>>
>>>> As of today, master branch (Apache Spark 3.1.0) resolved
>>>> 852+ JIRA issues and 606+ issues are 3.1.0-only patches.
>>>> According to the 3.1.0 release window, branch-3.1 will be
>>>> created on November 1st and enter the QA period.
>>>>
>>>> Here are some notable updates I've been monitoring.
>>>>
>>>> *Language*
>>>> 01. SPARK-25075 Support Scala 2.13
>>>>     - Since SPARK-32926, Scala 2.13 build test has
>>>>       become a part of GitHub Action jobs.
>>>>     - After SPARK-33044, Scala 2.13 test will be
>>>>       a part of Jenkins jobs.
>>>> 02. SPARK-29909 Drop Python 2 and Python 3.4 and 3.5
>>>> 03. SPARK-32082 Project Zen: Improving Python usability
>>>>     - 7 of 16 issues are resolved.
>>>> 04. SPARK-32073 Drop R < 3.5 support
>>>>     - This is done for Spark 3.0.1 and 3.1.0.
>>>>
>>>> *Dependency*
>>>> 05. SPARK-32058 Use Apache Hadoop 3.2.0 dependency
>>>>     - This changes the default dist. for better cloud support
>>>> 06. SPARK-32981 Remove hive-1.2 distribution
>>>> 07. SPARK-20202 Remove references to org.spark-project.hive
>>>>     - This will remove Hive 1.2.1 from source code
>>>> 08. SPARK-29250 Upgrade to Hadoop 3.2.1 (WIP)
>>>>
>>>> *Core*
>>>> 09. SPARK-27495 Support Stage level resource conf and scheduling
>>>>     - 11 of 15 issues are resolved
>>>> 10. SPARK-25299 Use remote storage for persisting shuffle data
>>>>     - 8 of 14 issues are resolved
>>>>
>>>> *Resource Manager*
>>>> 11. SPARK-33005 Kubernetes GA preparation
>>>>     - It is on the way and we are waiting for more feedback.
>>>>
>>>> *SQL*
>>>> 12. SPARK-30648/SPARK-32346 Support filters pushdown
>>>>     to JSON/Avro
>>>> 13. SPARK-32948/SPARK-32958 Add Json expression optimizer
>>>> 14. SPARK-12312 Support JDBC Kerberos w/ keytab
>>>>     - 11 of 17 issues are resolved
>>>> 15. SPARK-27589 DSv2 was mostly completed in 3.0
>>>>     and added more features in 3.1 but still we missed
>>>>     - All built-in DataSource v2 write paths are disabled
>>>>       and v1 write is used instead.
>>>>     - Support partition pruning with subqueries
>>>>     - Support bucketing
>>>>
>>>> We still have one month before the feature freeze
>>>> and starting QA. If you are working on 3.1,
>>>> please consider the timeline and share your schedule
>>>> with the Apache Spark community. For the other stuff,
>>>> we can put it into the 3.2 release scheduled for June 2021.
>>>>
>>>> Last but not least, I want to emphasize (7) once again.
>>>> We need to remove the forked unofficial Hive eventually.
>>>> Please let us know your reasons if you need to build
>>>> from Apache Spark 3.1 source code for Hive 1.2.
>>>>
>>>> https://github.com/apache/spark/pull/29936
>>>>
>>>> As I wrote in the above PR description, for old releases,
>>>> Apache Spark 2.4(LTS) and 3.0 (~2021.12) will provide
>>>> Hive 1.2-based distribution.
>>>>
>>>> Bests,
>>>> Dongjoon.
>>>>
>>>