Hi dev-list, I'd like to start a discussion about the feasibility of a Spark 3.0.1 release, since 4 blocker issues have been found after Spark 3.0.0:

1. [SPARK-31990] <https://issues.apache.org/jira/browse/SPARK-31990> Broken state store compatibility causes a correctness issue when a streaming query with `dropDuplicates` uses a checkpoint written by an older Spark version.
2. [SPARK-32038] <https://issues.apache.org/jira/browse/SPARK-32038> A regression in handling NaN values in COUNT(DISTINCT).
3. [SPARK-31918] <https://issues.apache.org/jira/browse/SPARK-31918> [WIP] CRAN requires SparkR to work with the latest R 4.0. This makes the 3.0 release unavailable on CRAN, and the release only supports R [3.5, 4.0).
4. [SPARK-31967] <https://issues.apache.org/jira/browse/SPARK-31967> Downgrade vis.js to fix the Jobs UI loading time regression.

I also noticed that branch-3.0 already has 39 commits <https://issues.apache.org/jira/browse/SPARK-32038?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%203.0.1> after Spark 3.0.0. I think it would be great to have a Spark 3.0.1 release to deliver these critical fixes.

Any comments are appreciated.

Best,
Yuanjian