I think we need to go through them during the 3.0 QA period, and try to fix the valid ones.
For example, the first ticket should be fixed already in
https://issues.apache.org/jira/browse/SPARK-28344

On Mon, Jan 20, 2020 at 2:07 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Hi, All.
>
> According to our policy, "Correctness and data loss issues should be
> considered Blockers".
>
> - http://spark.apache.org/contributing.html
>
> Since we are close to branch-3.0 cut,
> I want to ask your opinions on the following correctness and data loss
> issues.
>
> SPARK-30218 Columns used in inequality conditions for joins not resolved correctly in case of common lineage
> SPARK-29701 Different answers when empty input given in GROUPING SETS
> SPARK-29699 Different answers in nested aggregates with window functions
> SPARK-29419 Seq.toDS / spark.createDataset(Seq) is not thread-safe
> SPARK-28125 dataframes created by randomSplit have overlapping rows
> SPARK-28067 Incorrect results in decimal aggregation with whole-stage code gen enabled
> SPARK-28024 Incorrect numeric values when out of range
> SPARK-27784 Alias ID reuse can break correctness when substituting foldable expressions
> SPARK-27619 MapType should be prohibited in hash expressions
> SPARK-27298 Dataset except operation gives different results(dataset count) on Spark 2.3.0 Windows and Spark 2.3.0 Linux environment
> SPARK-27282 Spark incorrect results when using UNION with GROUP BY clause
> SPARK-27213 Unexpected results when filter is used after distinct
> SPARK-26836 Columns get switched in Spark SQL using Avro backed Hive table if schema evolves
> SPARK-25150 Joining DataFrames derived from the same source yields confusing/incorrect results
> SPARK-21774 The rule PromoteStrings cast string to a wrong data type
> SPARK-19248 Regex_replace works in 1.6 but not in 2.0
>
> Some of them are targeted on 3.0.0, but the others are not.
> Although we will work on them until 3.0.0,
> I'm not sure we can reach a status with no known correctness and data loss
> issue.
>
> How do you think about the above issues?
>
> Bests,
> Dongjoon.