Thank you for the summary, Xingbo. I also agree with Sean, because I don't think those items block the 3.0.0 preview release. In particular, correctness issues should not be there.

Instead, could you summarize what we have as of now for 3.0.0 preview? I believe JDK11 (SPARK-28684) and Hive 2.3.5 (SPARK-23710) will be in the what-we-have list for 3.0.0 preview.

Bests,
Dongjoon.

On Fri, Sep 20, 2019 at 6:22 AM Sean Owen <sro...@gmail.com> wrote:

> Is this a list of items that might be focused on for the final 3.0
> release? At least, Scala 2.13 support shouldn't be on that list. The
> others look plausible, or are already done, but there are probably
> more.
>
> As for the 3.0 preview, I wouldn't necessarily block on any particular
> feature, though, yes, the more work that can go into important items
> between now and then, the better.
> I wouldn't necessarily present any list of things that will or might
> be in 3.0 with that preview; just list the things that are done, like
> JDK 11 support.
>
> On Fri, Sep 20, 2019 at 2:46 AM Xingbo Jiang <jiangxb1...@gmail.com>
> wrote:
> >
> > Hi all,
> >
> > Let's start a new thread to discuss the on-going features for Spark 3.0
> > preview release.
> >
> > Below is the feature list for the Spark 3.0 preview release. The list is
> > collected from the previous discussions in the dev list.
> >
> > Followup of the shuffle+repartition correctness issue: support roll back shuffle stages (https://issues.apache.org/jira/browse/SPARK-25341)
> > Upgrade the built-in Hive to 2.3.5 for hadoop-3.2 (https://issues.apache.org/jira/browse/SPARK-23710)
> > JDK 11 support (https://issues.apache.org/jira/browse/SPARK-28684)
> > Scala 2.13 support (https://issues.apache.org/jira/browse/SPARK-25075)
> > DataSourceV2 features
> >
> > Enable file source v2 writers (https://issues.apache.org/jira/browse/SPARK-27589)
> > CREATE TABLE USING with DataSourceV2
> > New pushdown API for DataSourceV2
> > Support DELETE/UPDATE/MERGE Operations in DataSourceV2 (https://issues.apache.org/jira/browse/SPARK-28303)
> >
> > Correctness issue: Stream-stream joins - left outer join gives inconsistent output (https://issues.apache.org/jira/browse/SPARK-26154)
> > Revisiting Python / pandas UDF (https://issues.apache.org/jira/browse/SPARK-28264)
> > Spark Graph (https://issues.apache.org/jira/browse/SPARK-25994)
> >
> > Features that are nice to have:
> >
> > Use remote storage for persisting shuffle data (https://issues.apache.org/jira/browse/SPARK-25299)
> > Spark + Hadoop + Parquet + Avro compatibility problems (https://issues.apache.org/jira/browse/SPARK-25588)
> > Introduce new option to Kafka source - specify timestamp to start and end offset (https://issues.apache.org/jira/browse/SPARK-26848)
> > Delete files after processing in structured streaming (https://issues.apache.org/jira/browse/SPARK-20568)
> >
> > Here, I am proposing to cut the branch on October 15th. If the features
> > are targeting to 3.0 preview release, please prioritize the work and
> > finish it before the date. Note, Oct. 15th is not the code freeze of
> > Spark 3.0. That means, the community will still work on the features for
> > the upcoming Spark 3.0 release, even if they are not included in the
> > preview release.
> > The goal of preview release is to collect more feedback from the community
> > regarding the new 3.0 features/behavior changes.
> >
> > Thanks!