spark git commit: [SPARK-22937][SQL] SQL elt output binary for binary inputs

2018-01-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master ea9568330 -> e8af7e8ae


[SPARK-22937][SQL] SQL elt output binary for binary inputs

## What changes were proposed in this pull request?
This PR modifies `elt` to output binary for binary inputs.
On the current master, `elt` always outputs data as a string. However, in some 
databases (e.g., MySQL), `elt` outputs binary when all inputs are binary, so 
the current behavior might be a small surprise.
This PR is related to #19977.
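
As a rough illustration of the rule described above (not the code in this patch), the following Spark-free Scala sketch models how the result type could be decided. The names `EltTypeSketch`, `SimpleType`, and `resultType` are hypothetical and exist only for this example; the `eltOutputAsString` parameter stands in for the `spark.sql.function.eltOutputAsString` setting.

```scala
// Hypothetical, Spark-free model of the output-type rule described in this
// commit message. None of these names come from the Spark code base.
object EltTypeSketch {
  sealed trait SimpleType
  case object StringT extends SimpleType
  case object BinaryT extends SimpleType
  case object IntT extends SimpleType

  /**
   * Result type of `elt(n, input1, input2, ...)` for the given input types
   * (the index argument `n` is excluded).
   *
   * Binary is returned only when the legacy flag is off and every input is
   * binary; in all other cases the result is a string.
   */
  def resultType(inputTypes: Seq[SimpleType], eltOutputAsString: Boolean): SimpleType =
    if (!eltOutputAsString && inputTypes.nonEmpty && inputTypes.forall(_ == BinaryT)) BinaryT
    else StringT
}
```

Under this model, two binary inputs give a binary result by default, while mixing in a string input, or enabling the legacy flag, gives a string result.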

## How was this patch tested?
Added tests in `SQLQueryTestSuite` and `TypeCoercionSuite`.

Author: Takeshi Yamamuro 

Closes #20135 from maropu/SPARK-22937.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e8af7e8a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e8af7e8a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e8af7e8a

Branch: refs/heads/master
Commit: e8af7e8aeca15a6107248f358d9514521ffdc6d3
Parents: ea95683
Author: Takeshi Yamamuro 
Authored: Sat Jan 6 09:26:03 2018 +0800
Committer: gatorsmile 
Committed: Sat Jan 6 09:26:03 2018 +0800

--
 docs/sql-programming-guide.md                   |   2 +
 .../sql/catalyst/analysis/TypeCoercion.scala    |  29 +
 .../expressions/stringExpressions.scala         |  46 +---
 .../org/apache/spark/sql/internal/SQLConf.scala |   8 ++
 .../catalyst/analysis/TypeCoercionSuite.scala   |  54 +
 .../inputs/typeCoercion/native/elt.sql          |  44 +++
 .../results/typeCoercion/native/elt.sql.out     | 115 +++
 7 files changed, 281 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/e8af7e8a/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index dc3e384..b50f936 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1783,6 +1783,8 @@ options.
 
  - Since Spark 2.3, when all inputs are binary, `functions.concat()` returns an output as binary. Otherwise, it returns as a string. Until Spark 2.3, it always returns as a string despite of input types. To keep the old behavior, set `spark.sql.function.concatBinaryAsString` to `true`.
 
+ - Since Spark 2.3, when all inputs are binary, SQL `elt()` returns an output as binary. Otherwise, it returns as a string. Until Spark 2.3, it always returns as a string despite of input types. To keep the old behavior, set `spark.sql.function.eltOutputAsString` to `true`.
+
 ## Upgrading From Spark SQL 2.1 to 2.2
 
   - Spark 2.1.1 introduced a new configuration key: `spark.sql.hive.caseSensitiveInferenceMode`. It had a default setting of `NEVER_INFER`, which kept behavior identical to 2.1.0. However, Spark 2.2.0 changes this setting's default value to `INFER_AND_SAVE` to restore compatibility with reading Hive metastore tables whose underlying file schema have mixed-case column names. With the `INFER_AND_SAVE` configuration value, on first access Spark will perform schema inference on any Hive metastore table for which it has not already saved an inferred schema. Note that schema inference can be a very time consuming operation for tables with thousands of partitions. If compatibility with mixed-case column names is not a concern, you can safely set `spark.sql.hive.caseSensitiveInferenceMode` to `NEVER_INFER` to avoid the initial overhead of schema inference. Note that with the new default `INFER_AND_SAVE` setting, the results of the schema inference are saved as a metastore key for future use. Therefore, the initial schema inference occurs only at a table's first access.

http://git-wip-us.apache.org/repos/asf/spark/blob/e8af7e8a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
index e943636..e8669c4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
@@ -54,6 +54,7 @@ object TypeCoercion {
   BooleanEquality ::
   FunctionArgumentConversion ::
   ConcatCoercion(conf) ::
+  EltCoercion(conf) ::
   CaseWhenCoercion ::
   IfCoercion ::
   StackCoercion ::
@@ -685,6 +686,34 @@ object TypeCoercion {
   }
 
   /**
+   * Coerces the types of [[Elt]] children to expected ones.
+   *
+   * If `spark.sql.function.eltOutputAsString` is false and all children types are binary,
+   * the expected 

spark git commit: [SPARK-22937][SQL] SQL elt output binary for binary inputs

2018-01-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/branch-2.3 55afac4e7 -> bf853018c


[SPARK-22937][SQL] SQL elt output binary for binary inputs

## What changes were proposed in this pull request?
This PR modifies `elt` to output binary for binary inputs.
On the current master, `elt` always outputs data as a string. However, in some 
databases (e.g., MySQL), `elt` outputs binary when all inputs are binary, so 
the current behavior might be a small surprise.
This PR is related to #19977.

## How was this patch tested?
Added tests in `SQLQueryTestSuite` and `TypeCoercionSuite`.

Author: Takeshi Yamamuro 

Closes #20135 from maropu/SPARK-22937.

(cherry picked from commit e8af7e8aeca15a6107248f358d9514521ffdc6d3)
Signed-off-by: gatorsmile 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bf853018
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bf853018
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bf853018

Branch: refs/heads/branch-2.3
Commit: bf853018cabcd3b3abf84bfe534d2981020b4a71
Parents: 55afac4
Author: Takeshi Yamamuro 
Authored: Sat Jan 6 09:26:03 2018 +0800
Committer: gatorsmile 
Committed: Sat Jan 6 09:26:21 2018 +0800

--
 docs/sql-programming-guide.md                   |   2 +
 .../sql/catalyst/analysis/TypeCoercion.scala    |  29 +
 .../expressions/stringExpressions.scala         |  46 +---
 .../org/apache/spark/sql/internal/SQLConf.scala |   8 ++
 .../catalyst/analysis/TypeCoercionSuite.scala   |  54 +
 .../inputs/typeCoercion/native/elt.sql          |  44 +++
 .../results/typeCoercion/native/elt.sql.out     | 115 +++
 7 files changed, 281 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bf853018/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index dc3e384..b50f936 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1783,6 +1783,8 @@ options.
 
  - Since Spark 2.3, when all inputs are binary, `functions.concat()` returns an output as binary. Otherwise, it returns as a string. Until Spark 2.3, it always returns as a string despite of input types. To keep the old behavior, set `spark.sql.function.concatBinaryAsString` to `true`.
 
+ - Since Spark 2.3, when all inputs are binary, SQL `elt()` returns an output as binary. Otherwise, it returns as a string. Until Spark 2.3, it always returns as a string despite of input types. To keep the old behavior, set `spark.sql.function.eltOutputAsString` to `true`.
+
 ## Upgrading From Spark SQL 2.1 to 2.2
 
   - Spark 2.1.1 introduced a new configuration key: `spark.sql.hive.caseSensitiveInferenceMode`. It had a default setting of `NEVER_INFER`, which kept behavior identical to 2.1.0. However, Spark 2.2.0 changes this setting's default value to `INFER_AND_SAVE` to restore compatibility with reading Hive metastore tables whose underlying file schema have mixed-case column names. With the `INFER_AND_SAVE` configuration value, on first access Spark will perform schema inference on any Hive metastore table for which it has not already saved an inferred schema. Note that schema inference can be a very time consuming operation for tables with thousands of partitions. If compatibility with mixed-case column names is not a concern, you can safely set `spark.sql.hive.caseSensitiveInferenceMode` to `NEVER_INFER` to avoid the initial overhead of schema inference. Note that with the new default `INFER_AND_SAVE` setting, the results of the schema inference are saved as a metastore key for future use. Therefore, the initial schema inference occurs only at a table's first access.

http://git-wip-us.apache.org/repos/asf/spark/blob/bf853018/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
index e943636..e8669c4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
@@ -54,6 +54,7 @@ object TypeCoercion {
   BooleanEquality ::
   FunctionArgumentConversion ::
   ConcatCoercion(conf) ::
+  EltCoercion(conf) ::
   CaseWhenCoercion ::
   IfCoercion ::
   StackCoercion ::
@@ -685,6 +686,34 @@ object TypeCoercion {
   }
 
   /**
+   * Coerces the types of [[Elt]] children to expected