spark git commit: [SPARK-22937][SQL] SQL elt output binary for binary inputs
Repository: spark
Updated Branches:
  refs/heads/master ea9568330 -> e8af7e8ae

[SPARK-22937][SQL] SQL elt output binary for binary inputs

## What changes were proposed in this pull request?
This pr modified `elt` to output binary for binary inputs.
`elt` in the current master always output data as a string. But, in some databases (e.g., MySQL), if all inputs are binary, `elt` also outputs binary (Also, this might be a small surprise).
This pr is related to #19977.

## How was this patch tested?
Added tests in `SQLQueryTestSuite` and `TypeCoercionSuite`.

Author: Takeshi Yamamuro

Closes #20135 from maropu/SPARK-22937.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e8af7e8a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e8af7e8a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e8af7e8a

Branch: refs/heads/master
Commit: e8af7e8aeca15a6107248f358d9514521ffdc6d3
Parents: ea95683
Author: Takeshi Yamamuro
Authored: Sat Jan 6 09:26:03 2018 +0800
Committer: gatorsmile
Committed: Sat Jan 6 09:26:03 2018 +0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md                   |   2 +
 .../sql/catalyst/analysis/TypeCoercion.scala    |  29 +
 .../expressions/stringExpressions.scala         |  46 +---
 .../org/apache/spark/sql/internal/SQLConf.scala |   8 ++
 .../catalyst/analysis/TypeCoercionSuite.scala   |  54 +
 .../inputs/typeCoercion/native/elt.sql          |  44 +++
 .../results/typeCoercion/native/elt.sql.out     | 115 +++
 7 files changed, 281 insertions(+), 17 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/e8af7e8a/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index dc3e384..b50f936 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1783,6 +1783,8 @@ options.
  - Since Spark 2.3, when all inputs are binary, `functions.concat()` returns an output as binary. Otherwise, it returns as a string. Until Spark 2.3, it always returns as a string despite of input types. To keep the old behavior, set `spark.sql.function.concatBinaryAsString` to `true`.
+ - Since Spark 2.3, when all inputs are binary, SQL `elt()` returns an output as binary. Otherwise, it returns as a string. Until Spark 2.3, it always returns as a string despite of input types. To keep the old behavior, set `spark.sql.function.eltOutputAsString` to `true`.
+
 ## Upgrading From Spark SQL 2.1 to 2.2
  - Spark 2.1.1 introduced a new configuration key: `spark.sql.hive.caseSensitiveInferenceMode`. It had a default setting of `NEVER_INFER`, which kept behavior identical to 2.1.0. However, Spark 2.2.0 changes this setting's default value to `INFER_AND_SAVE` to restore compatibility with reading Hive metastore tables whose underlying file schema have mixed-case column names. With the `INFER_AND_SAVE` configuration value, on first access Spark will perform schema inference on any Hive metastore table for which it has not already saved an inferred schema. Note that schema inference can be a very time consuming operation for tables with thousands of partitions. If compatibility with mixed-case column names is not a concern, you can safely set `spark.sql.hive.caseSensitiveInferenceMode` to `NEVER_INFER` to avoid the initial overhead of schema inference. Note that with the new default `INFER_AND_SAVE` setting, the results of the schema inference are saved as a metastore key for future use. Therefore, the initial schema inference occurs only at a table's first access.
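
Editor's illustration (not part of the commit): the migration note above describes a simple output-type rule for `elt()`, which can be sketched in plain Scala as follows. This is a minimal model under stated assumptions, not Spark's implementation: `SimpleType`, `BinaryT`, `StringT`, `IntT` and `EltOutputType.resultType` are hypothetical names invented for this sketch, and only the `eltOutputAsString` flag mirrors the `spark.sql.function.eltOutputAsString` setting named in the note.

```scala
// Hypothetical stand-ins for SQL data types; these are NOT Spark's Catalyst classes.
sealed trait SimpleType
case object BinaryT extends SimpleType
case object StringT extends SimpleType
case object IntT extends SimpleType

object EltOutputType {
  /**
   * Output type of elt(index, inputs...) under the rule in the migration note:
   * binary only when the legacy flag is off and every input is binary,
   * otherwise string. `inputs` excludes the leading index argument.
   */
  def resultType(inputs: Seq[SimpleType], eltOutputAsString: Boolean): SimpleType = {
    // Assumption of this sketch: at least one input is supplied.
    require(inputs.nonEmpty, "elt needs at least one input besides the index")
    if (!eltOutputAsString && inputs.forall(_ == BinaryT)) BinaryT
    else StringT
  }
}
```

With the flag left at `false`, `EltOutputType.resultType(Seq(BinaryT, BinaryT), eltOutputAsString = false)` evaluates to `BinaryT`; mixing in a string input, or setting the flag to `true`, gives `StringT`, which corresponds to the pre-2.3 behavior described above.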
http://git-wip-us.apache.org/repos/asf/spark/blob/e8af7e8a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
index e943636..e8669c4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
@@ -54,6 +54,7 @@ object TypeCoercion {
       BooleanEquality ::
       FunctionArgumentConversion ::
       ConcatCoercion(conf) ::
+      EltCoercion(conf) ::
       CaseWhenCoercion ::
       IfCoercion ::
       StackCoercion ::
@@ -685,6 +686,34 @@ object TypeCoercion {
   }
   /**
+   * Coerces the types of [[Elt]] children to expected ones.
+   *
+   * If `spark.sql.function.eltOutputAsString` is false and all children types are binary,
+   * the expected
spark git commit: [SPARK-22937][SQL] SQL elt output binary for binary inputs
Repository: spark
Updated Branches:
  refs/heads/branch-2.3 55afac4e7 -> bf853018c

[SPARK-22937][SQL] SQL elt output binary for binary inputs

## What changes were proposed in this pull request?
This pr modified `elt` to output binary for binary inputs.
`elt` in the current master always output data as a string. But, in some databases (e.g., MySQL), if all inputs are binary, `elt` also outputs binary (Also, this might be a small surprise).
This pr is related to #19977.

## How was this patch tested?
Added tests in `SQLQueryTestSuite` and `TypeCoercionSuite`.

Author: Takeshi Yamamuro

Closes #20135 from maropu/SPARK-22937.

(cherry picked from commit e8af7e8aeca15a6107248f358d9514521ffdc6d3)
Signed-off-by: gatorsmile

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bf853018
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bf853018
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bf853018

Branch: refs/heads/branch-2.3
Commit: bf853018cabcd3b3abf84bfe534d2981020b4a71
Parents: 55afac4
Author: Takeshi Yamamuro
Authored: Sat Jan 6 09:26:03 2018 +0800
Committer: gatorsmile
Committed: Sat Jan 6 09:26:21 2018 +0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md                   |   2 +
 .../sql/catalyst/analysis/TypeCoercion.scala    |  29 +
 .../expressions/stringExpressions.scala         |  46 +---
 .../org/apache/spark/sql/internal/SQLConf.scala |   8 ++
 .../catalyst/analysis/TypeCoercionSuite.scala   |  54 +
 .../inputs/typeCoercion/native/elt.sql          |  44 +++
 .../results/typeCoercion/native/elt.sql.out     | 115 +++
 7 files changed, 281 insertions(+), 17 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/bf853018/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index dc3e384..b50f936 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1783,6 +1783,8 @@ options.
  - Since Spark 2.3, when all inputs are binary, `functions.concat()` returns an output as binary. Otherwise, it returns as a string. Until Spark 2.3, it always returns as a string despite of input types. To keep the old behavior, set `spark.sql.function.concatBinaryAsString` to `true`.
+ - Since Spark 2.3, when all inputs are binary, SQL `elt()` returns an output as binary. Otherwise, it returns as a string. Until Spark 2.3, it always returns as a string despite of input types. To keep the old behavior, set `spark.sql.function.eltOutputAsString` to `true`.
+
 ## Upgrading From Spark SQL 2.1 to 2.2
  - Spark 2.1.1 introduced a new configuration key: `spark.sql.hive.caseSensitiveInferenceMode`. It had a default setting of `NEVER_INFER`, which kept behavior identical to 2.1.0. However, Spark 2.2.0 changes this setting's default value to `INFER_AND_SAVE` to restore compatibility with reading Hive metastore tables whose underlying file schema have mixed-case column names. With the `INFER_AND_SAVE` configuration value, on first access Spark will perform schema inference on any Hive metastore table for which it has not already saved an inferred schema. Note that schema inference can be a very time consuming operation for tables with thousands of partitions. If compatibility with mixed-case column names is not a concern, you can safely set `spark.sql.hive.caseSensitiveInferenceMode` to `NEVER_INFER` to avoid the initial overhead of schema inference. Note that with the new default `INFER_AND_SAVE` setting, the results of the schema inference are saved as a metastore key for future use. Therefore, the initial schema inference occurs only at a table's first access.
http://git-wip-us.apache.org/repos/asf/spark/blob/bf853018/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
index e943636..e8669c4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
@@ -54,6 +54,7 @@ object TypeCoercion {
       BooleanEquality ::
       FunctionArgumentConversion ::
       ConcatCoercion(conf) ::
+      EltCoercion(conf) ::
       CaseWhenCoercion ::
       IfCoercion ::
       StackCoercion ::
@@ -685,6 +686,34 @@ object TypeCoercion {
   }
   /**
+   * Coerces the types of [[Elt]] children to expected