+1 for another preview
Tom
On Monday, December 9, 2019, 12:32:29 AM CST, Xiao Li
<[email protected]> wrote:
I got a lot of great feedback from the community about the recent 3.0 preview
release. Since the last 3.0 preview release, we already have 353 commits
[https://github.com/apache/spark/compare/v3.0.0-preview...master]. There are
various important features and behavior changes we want the community to try
before entering the official release candidates of Spark 3.0.
Below are my selected items that are not part of the last 3.0 preview but are
already available in the upstream master branch:
- Support JDK 11 with Hadoop 2.7
- Spark SQL will respect its own default format (i.e., parquet) when users
do CREATE TABLE without USING or STORED AS clauses
- Enable Parquet nested schema pruning and nested pruning on expressions by
default
- Add observable metrics for streaming queries
- Column pruning through nondeterministic expressions
- RecordBinaryComparator should check endianness when compared by long
- Improve parallelism for local shuffle reader in adaptive query execution
- Upgrade Apache Arrow to version 0.15.1
- Various interval-related SQL support
- Add a mode to pin a Python thread to a JVM thread
- Provide option to clean up completed files in streaming query
I am wondering if we can have another preview release for Spark 3.0. This would
help us find design/API defects as early as possible and avoid a
significant delay of the upcoming Spark 3.0 release.
Also, is any committer willing to volunteer as the release manager for the next
preview release of Spark 3.0, if we have such a release?
Cheers,
Xiao