Re: Spark 3.0 preview release 2?

Tom Graves Tue, 10 Dec 2019 06:29:08 -0800

 +1 for another preview
Tom
    On Monday, December 9, 2019, 12:32:29 AM CST, Xiao Li 
<[email protected]> wrote:  
 
 
I got a lot of great feedback from the community about the recent 3.0 preview 
release. Since the last 3.0 preview release, we already have 353 commits 
[https://github.com/apache/spark/compare/v3.0.0-preview...master]. There are 
various important features and behavior changes we want the community to try 
before we enter the official release candidates of Spark 3.0.






Below are some selected items that are not part of the last 3.0 preview but 
are already available in the upstream master branch: 


   
   - Support JDK 11 with Hadoop 2.7
   - Spark SQL will respect its own default format (i.e., parquet) when users 
do CREATE TABLE without USING or STORED AS clauses
   - Enable Parquet nested schema pruning and nested pruning on expressions by 
default
   - Add observable Metrics for Streaming queries
   - Column pruning through nondeterministic expressions
   - RecordBinaryComparator should check endianness when compared by long 
   - Improve parallelism for local shuffle reader in adaptive query execution
   - Upgrade Apache Arrow to version 0.15.1
   - Various interval-related SQL support
   - Add a mode to pin Python threads to JVM threads
   - Provide option to clean up completed files in streaming query



I am wondering if we can have another preview release for Spark 3.0. This would 
help us find design/API defects as early as possible and avoid a 
significant delay of the upcoming Spark 3.0 release.




Also, is any committer willing to volunteer as the release manager for the next 
preview release of Spark 3.0, if we have such a release? 




Cheers,

Xiao
