This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new ec47c3c  [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark 3.2 and Spark 3.1
ec47c3c is described below

commit ec47c3c4394b2410a277e7f7105cf896c28b2ed4
Author: Jiaan Geng <belie...@163.com>
AuthorDate: Wed Dec 1 16:19:50 2021 +0800

    [SPARK-37513][SQL][DOC] date +/- interval with only day-time fields returns different data type between Spark 3.2 and Spark 3.1

    ### What changes were proposed in this pull request?
    The SQL statements shown below previously returned the date type; they now return the timestamp type.
    `select date '2011-11-11' + interval 12 hours;`
    `select date '2011-11-11' - interval 12 hours;`

    The root cause is a change in the analyzer's resolution rules:
    In Spark 3.1: https://github.com/apache/spark/blob/75cac1fe0a46dbdf2ad5b741a3a49c9ab618cdce/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L338
    In Spark 3.2: https://github.com/apache/spark/blob/ceae41ba5cafb479cdcfc9a6a162945646a68f05/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L376

    Because Spark 3.2 has already been released, we add a migration guide entry for this change.

    ### Why are the changes needed?
    Provide a migration guide for the behavior difference between Spark 3.1 and Spark 3.2.

    ### Does this PR introduce _any_ user-facing change?
    'No'. Just modifies the docs.

    ### How was this patch tested?
    No tests needed; documentation-only change.

    Closes #34766 from beliefer/SPARK-37513.

    Authored-by: Jiaan Geng <belie...@163.com>
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 docs/sql-migration-guide.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 12d9cd4..c15f55d 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -133,6 +133,8 @@ license: |
   - In Spark 3.2, create/alter view will fail if the input query output columns contain auto-generated alias.
This is necessary to make sure the query output column names are stable across different Spark versions. To restore the behavior before Spark 3.2, set `spark.sql.legacy.allowAutoGeneratedAliasForView` to `true`.
 
+  - In Spark 3.2, date +/- interval with only day-time fields such as `date '2011-11-11' + interval 12 hours` returns timestamp. In Spark 3.1 and earlier, the same expression returns date. To restore the behavior before Spark 3.2, you can use `cast` to convert timestamp as date.
+
 ## Upgrading from Spark SQL 3.0 to 3.1
 
   - In Spark 3.1, statistical aggregation functions including `std`, `stddev`, `stddev_samp`, `variance`, `var_samp`, `skewness`, `kurtosis`, `covar_samp`, and `corr` will return `NULL` instead of `Double.NaN` when `DivideByZero` occurs during expression evaluation, for example, when `stddev_samp` is applied on a single-element set. In Spark version 3.0 and earlier, it will return `Double.NaN` in such cases. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.statisticalAggrega [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
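The type change documented above can be illustrated with a small analogy in plain Python (this is a hypothetical sketch using the standard-library `datetime` module, not Spark itself): Python's `date` arithmetic silently discards sub-day parts of a `timedelta`, which mirrors the Spark 3.1 result, while `datetime` arithmetic keeps them, which mirrors the timestamp result in Spark 3.2. The `cast(... as date)` workaround from the migration guide corresponds to truncating the timestamp back to its date part.

```python
from datetime import date, datetime, timedelta

d = date(2011, 11, 11)

# Like Spark 3.1 and earlier: the result stays a date, and the 12 hours
# are silently dropped (a timedelta's sub-day part is lost in date arithmetic).
as_date = d + timedelta(hours=12)
print(as_date)          # 2011-11-11

# Like Spark 3.2: promote to a timestamp-like value first, so the hours survive.
as_timestamp = datetime(d.year, d.month, d.day) + timedelta(hours=12)
print(as_timestamp)     # 2011-11-11 12:00:00

# The migration-guide workaround, cast(... as date), corresponds to
# truncating the timestamp back to its date part.
restored = as_timestamp.date()
print(restored)         # 2011-11-11
```

Note this only models the type semantics; in Spark SQL the actual workaround is `cast(date '2011-11-11' + interval 12 hours as date)`.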