Should we also consider the shuffle service refactoring to support pluggable storage engines as a target for the 3.1 release?
On Mon, Jun 29, 2020 at 9:31 AM Maxim Gekk <maxim.g...@databricks.com> wrote:

> Hi Dongjoon,
>
> I would add:
> - Filter pushdown to JSON (https://github.com/apache/spark/pull/27366)
> - Filter pushdown to other datasources like Avro
> - Support for nested attributes in filters pushed down to JSON
>
> Maxim Gekk
> Software Engineer
> Databricks, Inc.
>
> On Mon, Jun 29, 2020 at 7:07 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>
>> Hi, All.
>>
>> After a short celebration of Apache Spark 3.0, I'd like to ask for the
>> community's opinion on Apache Spark 3.1 feature expectations.
>>
>> First of all, Apache Spark 3.1 is scheduled for December 2020.
>> - https://spark.apache.org/versioning-policy.html
>>
>> I'm expecting the following items:
>>
>> 1. Support Scala 2.13
>> 2. Use Apache Hadoop 3.2 by default for better cloud support
>> 3. Declare the Kubernetes scheduler GA
>>    From my perspective, the last major missing piece was dynamic allocation, and
>>    - Dynamic allocation with shuffle tracking already shipped in 3.0.
>>    - Dynamic allocation with worker decommission/data migration is targeting 3.1. (Thanks, Holden)
>> 4. DSv2 stabilization
>>
>> I'm aware of some more features that are currently on the way, but I would
>> love to hear opinions from the main developers and, even more, from the main
>> users who need those features.
>>
>> Thank you in advance. Any comments are welcome.
>>
>> Bests,
>> Dongjoon.
>>
--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
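
For reference, regarding the dynamic allocation item above: shuffle tracking in Spark 3.0 lets dynamic allocation run without the external shuffle service. A minimal sketch, assuming a standard SparkSession and illustrative executor bounds (the application name and min/max values are placeholders, not from the thread):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: enable dynamic allocation with shuffle tracking (Spark 3.0+),
// so executors holding shuffle data are tracked instead of relying on
// the external shuffle service. Config keys as documented for Spark 3.0.
val spark = SparkSession.builder()
  .appName("dynamic-allocation-shuffle-tracking")      // placeholder name
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
  .config("spark.dynamicAllocation.minExecutors", "1")  // illustrative values
  .config("spark.dynamicAllocation.maxExecutors", "10")
  .getOrCreate()
```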