[DISCUSS] Apache Spark 3.0.1 Release

Yuanjian Li Tue, 23 Jun 2020 01:00:04 -0700

Hi dev-list,

I’d like to open a discussion on the feasibility of a Spark 3.0.1 release,
since four blocker issues were found after Spark 3.0.0:



   1. [SPARK-31990] <https://issues.apache.org/jira/browse/SPARK-31990>
      State store compatibility is broken, which causes a correctness issue
      when a streaming query with `dropDuplicates` uses a checkpoint written
      by an older Spark version.
   2. [SPARK-32038] <https://issues.apache.org/jira/browse/SPARK-32038>
      A regression in handling NaN values in COUNT(DISTINCT).
   3. [SPARK-31918] <https://issues.apache.org/jira/browse/SPARK-31918> [WIP]
      CRAN requires SparkR to work with the latest R 4.0. Until this is fixed,
      the 3.0 release is unavailable on CRAN and only supports R [3.5, 4.0).
   4. [SPARK-31967] <https://issues.apache.org/jira/browse/SPARK-31967>
      Downgrade vis.js to fix the Jobs UI loading time regression.


I also noticed that branch-3.0 already has 39 commits
<https://issues.apache.org/jira/browse/SPARK-32038?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%203.0.1>
after Spark 3.0.0. I think it would be great to have a Spark 3.0.1 release
to deliver these critical fixes.

Any comments are appreciated.

Best,

Yuanjian
