Is this a list of items that might be focused on for the final 3.0
release? At least, Scala 2.13 support shouldn't be on that list. The
others look plausible, or are already done, but there are probably
more.

As for the 3.0 preview, I wouldn't necessarily block on any particular
feature, though yes, the more work that can go into important items
between now and then, the better.
I also wouldn't present any list of things that will or might be in
3.0 with that preview; just list the things that are done, like
JDK 11 support.

On Fri, Sep 20, 2019 at 2:46 AM Xingbo Jiang <jiangxb1...@gmail.com> wrote:
>
> Hi all,
>
> Let's start a new thread to discuss the ongoing features for the Spark 3.0
> preview release.
>
> Below is the feature list for the Spark 3.0 preview release. The list was
> collected from previous discussions on the dev list.
>
> - Followup of the shuffle+repartition correctness issue: support roll back
>   shuffle stages (https://issues.apache.org/jira/browse/SPARK-25341)
> - Upgrade the built-in Hive to 2.3.5 for hadoop-3.2
>   (https://issues.apache.org/jira/browse/SPARK-23710)
> - JDK 11 support (https://issues.apache.org/jira/browse/SPARK-28684)
> - Scala 2.13 support (https://issues.apache.org/jira/browse/SPARK-25075)
> - DataSourceV2 features
>   - Enable file source v2 writers
>     (https://issues.apache.org/jira/browse/SPARK-27589)
>   - CREATE TABLE USING with DataSourceV2
>   - New pushdown API for DataSourceV2
>   - Support DELETE/UPDATE/MERGE Operations in DataSourceV2
>     (https://issues.apache.org/jira/browse/SPARK-28303)
> - Correctness issue: Stream-stream joins - left outer join gives
>   inconsistent output (https://issues.apache.org/jira/browse/SPARK-26154)
> - Revisiting Python / pandas UDF
>   (https://issues.apache.org/jira/browse/SPARK-28264)
> - Spark Graph (https://issues.apache.org/jira/browse/SPARK-25994)
>
> Features that are nice to have:
>
> - Use remote storage for persisting shuffle data
>   (https://issues.apache.org/jira/browse/SPARK-25299)
> - Spark + Hadoop + Parquet + Avro compatibility problems
>   (https://issues.apache.org/jira/browse/SPARK-25588)
> - Introduce new option to Kafka source - specify timestamp to start and end
>   offset (https://issues.apache.org/jira/browse/SPARK-26848)
> - Delete files after processing in structured streaming
>   (https://issues.apache.org/jira/browse/SPARK-20568)
>
> I am proposing to cut the branch on October 15th. If a feature is targeting
> the 3.0 preview release, please prioritize the work and finish it before
> that date. Note that Oct. 15th is not the code freeze for Spark 3.0. That
> means the community will still work on features for the upcoming Spark 3.0
> release, even if they are not included in the preview release. The goal of
> the preview release is to collect more feedback from the community on the
> new 3.0 features and behavior changes.
>
> Thanks!
