[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/16777

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100898554

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -117,49 +115,67 @@ object TypeCoercion {
    * loss of precision when widening decimal and double, and promotion to string.
    */
   private[analysis] def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
-    (t1, t2) match {
-      case (t1: DecimalType, t2: DecimalType) =>
-        Some(DecimalPrecision.widerDecimalType(t1, t2))
-      case (t: IntegralType, d: DecimalType) =>
-        Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-      case (d: DecimalType, t: IntegralType) =>
-        Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-      case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-        Some(DoubleType)
-      case _ =>
-        findTightestCommonTypeToString(t1, t2)
-    }
+    findTightestCommonType(t1, t2)
+      .orElse(findWiderTypeForDecimal(t1, t2))
--- End diff --

Yes. Integer will be promoted to a wider Decimal anyway.
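gatorsmile's point above ("Integer will be promoted to a wider Decimal anyway") can be checked against the precision formulas quoted elsewhere in this thread. The sketch below is a hypothetical Python model, not Spark code: `for_type` and `wider_decimal` are stand-ins for `DecimalType.forType` and `DecimalPrecision.widerDecimalType`, with `(precision, scale)` tuples modeling `DecimalType`.

```python
# Hypothetical model of the decimal-widening formulas discussed in this thread.
# for_type / wider_decimal are stand-ins for DecimalType.forType and
# DecimalPrecision.widerDecimalType; (p, s) tuples model DecimalType(p, s).

def for_type(integral):
    # Integral types map to an exact decimal: Int -> (10, 0), Long -> (20, 0).
    return {"int": (10, 0), "long": (20, 0)}[integral]

def wider_decimal(d1, d2):
    # max(s1, s2) + max(p1 - s1, p2 - s2) digits of precision, max(s1, s2) scale.
    (p1, s1), (p2, s2) = d1, d2
    scale = max(s1, s2)
    return (scale + max(p1 - s1, p2 - s2), scale)

# An integer widened against Decimal(2, 1) becomes the wider Decimal(11, 1).
print(wider_decimal(for_type("int"), (2, 1)))   # (11, 1)
print(wider_decimal(for_type("long"), (2, 1)))  # (21, 1)
```

These results match the expectations in the test cases quoted later in the thread, e.g. `DecimalType(2, 1)` with `IntegerType` giving `DecimalType(11, 1)` and with `LongType` giving `DecimalType(21, 1)`.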
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100861589

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -117,49 +115,67 @@ object TypeCoercion {
    * loss of precision when widening decimal and double, and promotion to string.
    */
   private[analysis] def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
-    (t1, t2) match {
-      case (t1: DecimalType, t2: DecimalType) =>
-        Some(DecimalPrecision.widerDecimalType(t1, t2))
-      case (t: IntegralType, d: DecimalType) =>
-        Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-      case (d: DecimalType, t: IntegralType) =>
-        Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-      case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-        Some(DoubleType)
-      case _ =>
-        findTightestCommonTypeToString(t1, t2)
-    }
+    findTightestCommonType(t1, t2)
+      .orElse(findWiderTypeForDecimal(t1, t2))
--- End diff --

Yeah, we changed the order, but it looks like it won't change the result. `findWiderTypeForDecimal` will always return a result for a decimal type and a numeric type, and if `findTightestCommonType` can return a result, `findWiderTypeForDecimal` will return the same result. So it doesn't matter whether we run `findTightestCommonType` before or after it.
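cloud-fan's argument above — that swapping the order of the two lookups cannot change the result — can be illustrated with a toy model of the `orElse` composition. Everything below is a simplified assumption-laden sketch, not Spark's implementation: `tightest` and `wider_for_decimal` only approximate `findTightestCommonType` and `findWiderTypeForDecimal`, with types modeled as strings or `(p, s)` tuples.

```python
# Toy model of findTightestCommonType(t1, t2).orElse(findWiderTypeForDecimal(t1, t2)).
# Simplified stand-ins: strings model integral types, (p, s) tuples model decimals.

def tightest(t1, t2):
    # Only identical types have a "tightest" common type in this toy model.
    return t1 if t1 == t2 else None

def wider_for_decimal(t1, t2):
    # Answers whenever at least one side is a decimal and both are numeric;
    # simplified from DecimalPrecision.widerDecimalType.
    def to_dec(t):
        if isinstance(t, tuple):
            return t
        return {"int": (10, 0), "long": (20, 0)}.get(t)
    d1, d2 = to_dec(t1), to_dec(t2)
    if d1 is None or d2 is None:
        return None
    if not (isinstance(t1, tuple) or isinstance(t2, tuple)):
        return None
    (p1, s1), (p2, s2) = d1, d2
    s = max(s1, s2)
    return (s + max(p1 - s1, p2 - s2), s)

def find_wider(t1, t2):
    return tightest(t1, t2) or wider_for_decimal(t1, t2)

# Where both branches can answer (identical decimal types) they agree, so
# running tightest() first or last yields the same result for these pairs.
for pair in [((3, 2), (3, 2)), ((2, 1), "int"), ("long", (2, 1))]:
    assert find_wider(*pair) == (wider_for_decimal(*pair) or tightest(*pair))
```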
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100730262

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -116,48 +114,66 @@ object TypeCoercion {
    * i.e. the main difference with [[findTightestCommonType]] is that here we allow some
    * loss of precision when widening decimal and double, and promotion to string.
    */
-  private def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
-    case (t1: DecimalType, t2: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(t1, t2))
-    case (t: IntegralType, d: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (d: DecimalType, t: IntegralType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-      Some(DoubleType)
-    case _ =>
-      findTightestCommonTypeToString(t1, t2)
+  def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
+    findTightestCommonType(t1, t2)
--- End diff --

I see. Thank you for catching it.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100727682

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -116,48 +114,66 @@ object TypeCoercion {
    * i.e. the main difference with [[findTightestCommonType]] is that here we allow some
    * loss of precision when widening decimal and double, and promotion to string.
    */
-  private def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
-    case (t1: DecimalType, t2: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(t1, t2))
-    case (t: IntegralType, d: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (d: DecimalType, t: IntegralType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-      Some(DoubleType)
-    case _ =>
-      findTightestCommonTypeToString(t1, t2)
+  def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
+    findTightestCommonType(t1, t2)
--- End diff --

We should make them consistent. That is why I think it is right to make this change, even if it causes behavior changes.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100723132

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -116,48 +114,66 @@ object TypeCoercion {
    * i.e. the main difference with [[findTightestCommonType]] is that here we allow some
    * loss of precision when widening decimal and double, and promotion to string.
    */
-  private def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
-    case (t1: DecimalType, t2: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(t1, t2))
-    case (t: IntegralType, d: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (d: DecimalType, t: IntegralType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-      Some(DoubleType)
-    case _ =>
-      findTightestCommonTypeToString(t1, t2)
+  def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
+    findTightestCommonType(t1, t2)
--- End diff --

Aha, thank you for correcting me. I overlooked that, but the result should still be the same, shouldn't it?

- `DecimalType.isWiderThan`

  ```
  (p1 - s1) >= (p2 - s2) && s1 >= s2
  ```

- `DecimalPrecision.widerDecimalType`

  ```
  max(s1, s2) + max(p1-s1, p2-s2), max(s1, s2)
  ```

If the two are different, then we were already applying different type coercion rules between `findWiderTypeWithoutStringPromotion` and `findWiderTypeForTwo`. I guess we should make them match, given https://github.com/apache/spark/pull/14439?
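The claim in the message above — that where `findTightestCommonType` succeeds, the decimal path would give the same answer — follows from the two formulas it quotes: if `d1.isWiderThan(d2)` holds, then `widerDecimalType(d1, d2)` is exactly `d1`. A small Python check of just those two formulas (with `(p, s)` tuples standing in for `DecimalType`; this is a model, not Spark code):

```python
# Check that isWiderThan implies widerDecimalType returns the wider operand,
# using the two formulas quoted in the thread. (p, s) tuples model DecimalType.

def is_wider_than(d1, d2):
    # (p1 - s1) >= (p2 - s2) && s1 >= s2
    (p1, s1), (p2, s2) = d1, d2
    return (p1 - s1) >= (p2 - s2) and s1 >= s2

def wider_decimal(d1, d2):
    # max(s1, s2) + max(p1 - s1, p2 - s2) precision, max(s1, s2) scale
    (p1, s1), (p2, s2) = d1, d2
    s = max(s1, s2)
    return (s + max(p1 - s1, p2 - s2), s)

# Exhaustive over small precisions/scales: whenever d1 is wider than d2,
# the widened type is d1 itself.
decimals = [(p, s) for p in range(1, 10) for s in range(0, p + 1)]
for d1 in decimals:
    for d2 in decimals:
        if is_wider_than(d1, d2):
            assert wider_decimal(d1, d2) == d1
```

Algebraically: if `s1 >= s2` then the result scale is `s1`, and if `(p1 - s1) >= (p2 - s2)` the result precision is `s1 + (p1 - s1) = p1`, i.e. `(p1, s1)` unchanged.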
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100719964

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -116,48 +114,66 @@ object TypeCoercion {
    * i.e. the main difference with [[findTightestCommonType]] is that here we allow some
    * loss of precision when widening decimal and double, and promotion to string.
    */
-  private def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
-    case (t1: DecimalType, t2: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(t1, t2))
-    case (t: IntegralType, d: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (d: DecimalType, t: IntegralType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-      Some(DoubleType)
-    case _ =>
-      findTightestCommonTypeToString(t1, t2)
+  def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
+    findTightestCommonType(t1, t2)
--- End diff --

The original `findTightestCommonTypeToString` does not handle `DecimalType`. However, this PR calls `findTightestCommonType` first.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100719832

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -116,48 +114,66 @@ object TypeCoercion {
    * i.e. the main difference with [[findTightestCommonType]] is that here we allow some
    * loss of precision when widening decimal and double, and promotion to string.
    */
-  private def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
-    case (t1: DecimalType, t2: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(t1, t2))
-    case (t: IntegralType, d: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (d: DecimalType, t: IntegralType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-      Some(DoubleType)
-    case _ =>
-      findTightestCommonTypeToString(t1, t2)
+  def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
+    findTightestCommonType(t1, t2)
--- End diff --

The result is the same? See the cases in `findTightestCommonType`:
https://github.com/HyukjinKwon/spark/blob/510a0eee43030abbf37ef922684e6165d6f1e1c8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala#L87-L90
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100716751

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -116,48 +114,66 @@ object TypeCoercion {
    * i.e. the main difference with [[findTightestCommonType]] is that here we allow some
    * loss of precision when widening decimal and double, and promotion to string.
    */
-  private def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
-    case (t1: DecimalType, t2: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(t1, t2))
-    case (t: IntegralType, d: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (d: DecimalType, t: IntegralType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-      Some(DoubleType)
-    case _ =>
-      findTightestCommonTypeToString(t1, t2)
+  def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
+    findTightestCommonType(t1, t2)
--- End diff --

@cloud-fan refactored this logic recently and I believe he didn't miss this part.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100716252

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -116,48 +114,66 @@ object TypeCoercion {
    * i.e. the main difference with [[findTightestCommonType]] is that here we allow some
    * loss of precision when widening decimal and double, and promotion to string.
    */
-  private def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
-    case (t1: DecimalType, t2: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(t1, t2))
-    case (t: IntegralType, d: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (d: DecimalType, t: IntegralType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-      Some(DoubleType)
-    case _ =>
-      findTightestCommonTypeToString(t1, t2)
+  def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
+    findTightestCommonType(t1, t2)
--- End diff --

Yes, it is true that the type dispatch order was changed, but `findTightestCommonType` does not take care of `DecimalType`, so the results would be the same.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100715180

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -116,48 +114,66 @@ object TypeCoercion {
    * i.e. the main difference with [[findTightestCommonType]] is that here we allow some
    * loss of precision when widening decimal and double, and promotion to string.
    */
-  private def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = (t1, t2) match {
-    case (t1: DecimalType, t2: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(t1, t2))
-    case (t: IntegralType, d: DecimalType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (d: DecimalType, t: IntegralType) =>
-      Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
-    case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
-      Some(DoubleType)
-    case _ =>
-      findTightestCommonTypeToString(t1, t2)
+  def findWiderTypeForTwo(t1: DataType, t2: DataType): Option[DataType] = {
+    findTightestCommonType(t1, t2)
--- End diff --

Previously, the `findWiderTypeForDecimal` handling came before `findTightestCommonTypeToString`, so the results could be different. cc @cloud-fan, you changed the order. I am not sure whether this should be documented in the release notes.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100690775

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -379,6 +386,67 @@ class TypeCoercionSuite extends PlanTest {
     widenTest(ArrayType(IntegerType), StructType(Seq()), None)
   }

+  test("wider common type for decimal and array") {
+    def widenTestWithStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        expected: Option[DataType]): Unit = {
+      checkWidenType(TypeCoercion.findWiderTypeForTwo, t1, t2, expected)
+    }
+
+    def widenTestWithoutStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        expected: Option[DataType]): Unit = {
+      checkWidenType(TypeCoercion.findWiderTypeWithoutStringPromotionForTwo, t1, t2, expected)
+    }
+
+    // Decimal
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), DecimalType(3, 2), Some(DecimalType(3, 2)))
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), DoubleType, Some(DoubleType))
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), IntegerType, Some(DecimalType(11, 1)))
+    widenTestWithStringPromotion(
+      DoubleType, DecimalType(2, 1), Some(DoubleType))
+    widenTestWithStringPromotion(
+      LongType, DecimalType(2, 1), Some(DecimalType(21, 1)))
+
+    // ArrayType
+    widenTestWithStringPromotion(
+      ArrayType(ShortType, containsNull = true),
+      ArrayType(DoubleType, containsNull = false),
+      Some(ArrayType(DoubleType, containsNull = true)))
+    widenTestWithStringPromotion(
+      ArrayType(TimestampType, containsNull = false),
+      ArrayType(StringType, containsNull = true),
+      Some(ArrayType(StringType, containsNull = true)))
+    widenTestWithStringPromotion(
+      ArrayType(ArrayType(IntegerType), containsNull = false),
+      ArrayType(ArrayType(LongType), containsNull = false),
+      Some(ArrayType(ArrayType(LongType), containsNull = false)))
+
+    // Without string promotion
+    widenTestWithoutStringPromotion(IntegerType, StringType, None)
--- End diff --

The `LongType` test was removed.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100687325

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -379,6 +386,67 @@ class TypeCoercionSuite extends PlanTest {
     widenTest(ArrayType(IntegerType), StructType(Seq()), None)
   }

+  test("wider common type for decimal and array") {
+    def widenTestWithStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        expected: Option[DataType]): Unit = {
+      checkWidenType(TypeCoercion.findWiderTypeForTwo, t1, t2, expected)
+    }
+
+    def widenTestWithoutStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        expected: Option[DataType]): Unit = {
+      checkWidenType(TypeCoercion.findWiderTypeWithoutStringPromotionForTwo, t1, t2, expected)
+    }
+
+    // Decimal
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), DecimalType(3, 2), Some(DecimalType(3, 2)))
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), DoubleType, Some(DoubleType))
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), IntegerType, Some(DecimalType(11, 1)))
+    widenTestWithStringPromotion(
+      DoubleType, DecimalType(2, 1), Some(DoubleType))
+    widenTestWithStringPromotion(
+      LongType, DecimalType(2, 1), Some(DecimalType(21, 1)))
+
+    // ArrayType
+    widenTestWithStringPromotion(
+      ArrayType(ShortType, containsNull = true),
+      ArrayType(DoubleType, containsNull = false),
+      Some(ArrayType(DoubleType, containsNull = true)))
+    widenTestWithStringPromotion(
+      ArrayType(TimestampType, containsNull = false),
+      ArrayType(StringType, containsNull = true),
+      Some(ArrayType(StringType, containsNull = true)))
+    widenTestWithStringPromotion(
+      ArrayType(ArrayType(IntegerType), containsNull = false),
+      ArrayType(ArrayType(LongType), containsNull = false),
+      Some(ArrayType(ArrayType(LongType), containsNull = false)))
+
+    // Without string promotion
+    widenTestWithoutStringPromotion(IntegerType, StringType, None)
--- End diff --

I think we can just test int or long, not both.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100687309

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -379,6 +386,67 @@ class TypeCoercionSuite extends PlanTest {
     widenTest(ArrayType(IntegerType), StructType(Seq()), None)
   }

+  test("wider common type for decimal and array") {
+    def widenTestWithStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        expected: Option[DataType]): Unit = {
+      checkWidenType(TypeCoercion.findWiderTypeForTwo, t1, t2, expected)
+    }
+
+    def widenTestWithoutStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        expected: Option[DataType]): Unit = {
+      checkWidenType(TypeCoercion.findWiderTypeWithoutStringPromotionForTwo, t1, t2, expected)
+    }
+
+    // Decimal
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), DecimalType(3, 2), Some(DecimalType(3, 2)))
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), DoubleType, Some(DoubleType))
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), IntegerType, Some(DecimalType(11, 1)))
+    widenTestWithStringPromotion(
+      DoubleType, DecimalType(2, 1), Some(DoubleType))
--- End diff --

Can we add a test for string? (not array(string))
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100687275

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -379,6 +386,67 @@ class TypeCoercionSuite extends PlanTest {
     widenTest(ArrayType(IntegerType), StructType(Seq()), None)
   }

+  test("wider common type for decimal and array") {
+    def widenTestWithStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        expected: Option[DataType]): Unit = {
+      checkWidenType(TypeCoercion.findWiderTypeForTwo, t1, t2, expected)
+    }
+
+    def widenTestWithoutStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        expected: Option[DataType]): Unit = {
+      checkWidenType(TypeCoercion.findWiderTypeWithoutStringPromotionForTwo, t1, t2, expected)
+    }
+
+    // Decimal
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), DecimalType(3, 2), Some(DecimalType(3, 2)))
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), DoubleType, Some(DoubleType))
+    widenTestWithStringPromotion(
+      DecimalType(2, 1), IntegerType, Some(DecimalType(11, 1)))
+    widenTestWithStringPromotion(
+      DoubleType, DecimalType(2, 1), Some(DoubleType))
--- End diff --

Nit: how about we always put the decimal type on the left? Then we won't mistakenly add symmetric tests.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16777#discussion_r100678810

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -379,6 +380,75 @@ class TypeCoercionSuite extends PlanTest {
     widenTest(ArrayType(IntegerType), StructType(Seq()), None)
   }

+  test("wider common type for decimal and array") {
+    def widenTest(t1: DataType, t2: DataType, tightestCommon: Option[DataType]) {
+      var found = TypeCoercion.findWiderTypeForTwo(t1, t2)
+      assert(found == tightestCommon,
+        s"Expected $tightestCommon as wider common type for $t1 and $t2, found $found")
+      // Test both directions to make sure the widening is symmetric.
+      found = TypeCoercion.findWiderTypeForTwo(t2, t1)
+      assert(found == tightestCommon,
+        s"Expected $tightestCommon as wider common type for $t2 and $t1, found $found")
+    }
+
+    def widenTestWithoutStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        tightestCommon: Option[DataType]) {
+      var found = TypeCoercion.findWiderTypeWithoutStringPromotionForTwo(t1, t2)
+      assert(found == tightestCommon,
+        s"Expected $tightestCommon as wider common type for $t1 and $t2, found $found")
+      // Test both directions to make sure the widening is symmetric.
+      found = TypeCoercion.findWiderTypeWithoutStringPromotionForTwo(t2, t1)
+      assert(found == tightestCommon,
+        s"Expected $tightestCommon as wider common type for $t2 and $t1, found $found")
+    }
+
+    // Decimal
+    widenTest(DecimalType(2, 1), DecimalType(3, 2), Some(DecimalType(3, 2)))
+    widenTest(DecimalType(2, 1), DoubleType, Some(DoubleType))
+    widenTest(DecimalType(2, 1), IntegerType, Some(DecimalType(11, 1)))
+    widenTest(DoubleType, DecimalType(2, 1), Some(DoubleType))
+    widenTest(IntegerType, DecimalType(2, 1), Some(DecimalType(11, 1)))
+    widenTest(LongType, DecimalType(2, 1), Some(DecimalType(21, 1)))
+
+    // ArrayType
+    widenTest(ArrayType(NullType), ArrayType(ByteType), Some(ArrayType(ByteType)))
--- End diff --

These tests are too similar. We just need one test case for numeric, one for string, and one for nested array.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16777#discussion_r100678782

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -379,6 +380,75 @@ class TypeCoercionSuite extends PlanTest {

```diff
     widenTest(ArrayType(IntegerType), StructType(Seq()), None)
   }

+  test("wider common type for decimal and array") {
+    def widenTest(t1: DataType, t2: DataType, tightestCommon: Option[DataType]) {
+      var found = TypeCoercion.findWiderTypeForTwo(t1, t2)
+      assert(found == tightestCommon,
+        s"Expected $tightestCommon as wider common type for $t1 and $t2, found $found")
+      // Test both directions to make sure the widening is symmetric.
+      found = TypeCoercion.findWiderTypeForTwo(t2, t1)
+      assert(found == tightestCommon,
+        s"Expected $tightestCommon as wider common type for $t2 and $t1, found $found")
+    }
+
+    def widenTestWithoutStringPromotion(
+        t1: DataType,
+        t2: DataType,
+        tightestCommon: Option[DataType]) {
+      var found = TypeCoercion.findWiderTypeWithoutStringPromotionForTwo(t1, t2)
+      assert(found == tightestCommon,
+        s"Expected $tightestCommon as wider common type for $t1 and $t2, found $found")
+      // Test both directions to make sure the widening is symmetric.
+      found = TypeCoercion.findWiderTypeWithoutStringPromotionForTwo(t2, t1)
+      assert(found == tightestCommon,
+        s"Expected $tightestCommon as wider common type for $t2 and $t1, found $found")
+    }
+
+    // Decimal
+    widenTest(DecimalType(2, 1), DecimalType(3, 2), Some(DecimalType(3, 2)))
+    widenTest(DecimalType(2, 1), DoubleType, Some(DoubleType))
+    widenTest(DecimalType(2, 1), IntegerType, Some(DecimalType(11, 1)))
+    widenTest(DoubleType, DecimalType(2, 1), Some(DoubleType))
+    widenTest(IntegerType, DecimalType(2, 1), Some(DecimalType(11, 1)))
```

--- End diff --

`widenTest` already tests symmetric
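The decimal expectations in that test follow Catalyst's widening rule: keep the larger scale, keep enough digits for the larger integral part, and promote integral types to their exact decimal form first (e.g. `IntegerType` becomes `DecimalType(10, 0)`). A minimal Python model — not Spark code; `Decimal`, `wider_decimal`, and `INT_AS_DECIMAL` are names made up for illustration — reproduces the expected results:

```python
# Toy model of DecimalPrecision.widerDecimalType, for illustration only.
from collections import namedtuple

Decimal = namedtuple("Decimal", ["precision", "scale"])

# DecimalType.forType(IntegerType): the smallest decimal holding any Int exactly.
INT_AS_DECIMAL = Decimal(10, 0)

def wider_decimal(d1, d2):
    """Widened type: max scale, plus enough digits for the larger integral part."""
    scale = max(d1.scale, d2.scale)
    int_digits = max(d1.precision - d1.scale, d2.precision - d2.scale)
    return Decimal(int_digits + scale, scale)

# Matches the test cases above:
print(wider_decimal(Decimal(2, 1), Decimal(3, 2)))   # -> Decimal(precision=3, scale=2)
print(wider_decimal(INT_AS_DECIMAL, Decimal(2, 1)))  # -> Decimal(precision=11, scale=1)
```

This is why `widenTest(DecimalType(2, 1), IntegerType, ...)` expects `DecimalType(11, 1)`: the promoted integer contributes 10 integral digits, and the decimal contributes one fractional digit.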
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16777#discussion_r100678739

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -379,6 +380,75 @@ class TypeCoercionSuite extends PlanTest {

```diff
+  test("wider common type for decimal and array") {
+    def widenTest(t1: DataType, t2: DataType, tightestCommon: Option[DataType]) {
```

--- End diff --

`tightestCommon` -> `expected`?
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16777#discussion_r100668061

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -101,13 +101,13 @@ object TypeCoercion {

```diff
-  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
-  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
-    findTightestCommonType(left, right).orElse((left, right) match {
-      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
-      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
-      case _ => None
-    })
+  /**
+   * Promotes all the way to StringType.
+   */
+  private def stringPromotion: (DataType, DataType) => Option[DataType] = {
```

--- End diff --

Oh, I will fix it.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16777#discussion_r100598284

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -101,13 +101,13 @@ object TypeCoercion {

```diff
-  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
-  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
-    findTightestCommonType(left, right).orElse((left, right) match {
-      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
-      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
-      case _ => None
-    })
+  /**
+   * Promotes all the way to StringType.
+   */
+  private def stringPromotion: (DataType, DataType) => Option[DataType] = {
```

--- End diff --

this becomes a method that returns a function, is it what we need?
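For context on what the `stringPromotion` rule under discussion computes: a `StringType` paired with any atomic type other than `BinaryType` or `BooleanType` widens to `StringType`. A rough Python sketch of that predicate — hypothetical names, with Catalyst types modeled as plain strings rather than `DataType` objects:

```python
# Simplified model of the stringPromotion rule; not Spark's implementation.
ATOMIC_TYPES = {"string", "byte", "short", "int", "long", "float", "double",
                "decimal", "date", "timestamp", "binary", "boolean"}

def string_promotion(t1, t2):
    """Return "string" when one side is a string and the other is an atomic
    type that is not binary or boolean; otherwise no promotion applies."""
    for a, b in ((t1, t2), (t2, t1)):
        if a == "string" and b in ATOMIC_TYPES and b not in ("binary", "boolean"):
            return "string"
    return None

print(string_promotion("string", "int"))     # -> string
print(string_promotion("string", "binary"))  # -> None
```

The guards against binary and boolean mirror the two `case` patterns in the original `findTightestCommonTypeToString`.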
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16777#discussion_r99329662

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -101,24 +101,13 @@ object TypeCoercion {

```diff
-  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
-  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
-    findTightestCommonTypeOfTwo(left, right).orElse((left, right) match {
-      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
-      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
-      case _ => None
-    })
-  }
   /**
-   * Find the tightest common type of a set of types by continuously applying
-   * `findTightestCommonTypeOfTwo` on these types.
+   * Promotes all the way to StringType.
    */
-  private def findTightestCommonType(types: Seq[DataType]): Option[DataType] = {
```

--- End diff --

Sure, I submitted for this in https://github.com/apache/spark/pull/16786.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16777#discussion_r99283834

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -101,24 +101,13 @@ object TypeCoercion {

```diff
-  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
-  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
-    findTightestCommonTypeOfTwo(left, right).orElse((left, right) match {
-      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
-      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
-      case _ => None
-    })
-  }
   /**
-   * Find the tightest common type of a set of types by continuously applying
-   * `findTightestCommonTypeOfTwo` on these types.
+   * Promotes all the way to StringType.
    */
-  private def findTightestCommonType(types: Seq[DataType]): Option[DataType] = {
```

--- End diff --

It becomes harder for reviewers to read this PR. Could you submit a separate PR for code cleaning? Thanks!
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16777#discussion_r99093765

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -53,7 +53,8 @@ class TypeCoercionSuite extends PlanTest {

(The surrounding context is the suite's implicit-cast compatibility table, whose wide comment rows do not survive extraction; only the changed note lines are shown.)

```diff
-  // Note: ArrayType*, MapType*, StructType* are castable only when the internal child types also match; otherwise, not castable
+  // Note: MapType*, StructType* are castable only when the internal child types also match; otherwise, not castable.
+  // Note: ArrayType* is castable when the element type is castable according to the table.
```

--- End diff --

It seems we now support implicit cast of `ArrayType` via https://issues.apache.org/jira/browse/SPARK-18624.
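The behavior this PR adds can be pictured as the widening function recursing into array element types: two arrays have a common type exactly when their elements do. A heavily simplified Python sketch — `find_wider` is a hypothetical stand-in, and the toy numeric precedence chain replaces Catalyst's full rules:

```python
# Toy model of recursive array widening (not Spark's implementation): an array
# type is an ("array", element) tuple, scalars are strings, and numeric scalars
# widen along a fixed precedence chain.
NUMERIC_CHAIN = ["int", "long", "double"]

def find_wider(t1, t2):
    # Arrays are compatible when their element types are: recurse on elements.
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        elem = find_wider(t1[1], t2[1])
        return ("array", elem) if elem is not None else None
    if t1 == t2:
        return t1
    if t1 in NUMERIC_CHAIN and t2 in NUMERIC_CHAIN:
        return NUMERIC_CHAIN[max(NUMERIC_CHAIN.index(t1), NUMERIC_CHAIN.index(t2))]
    return None

print(find_wider(("array", "int"), ("array", "double")))  # -> ('array', 'double')
```

Because the recursion bottoms out in the scalar rules, nested arrays such as `array<array<int>>` vs `array<array<long>>` widen for free, while an array against a non-array still yields no common type.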
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16777#discussion_r99093557

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -101,24 +101,13 @@ object TypeCoercion {

```diff
-  /** Similar to [[findTightestCommonType]], but can promote all the way to StringType. */
-  def findTightestCommonTypeToString(left: DataType, right: DataType): Option[DataType] = {
-    findTightestCommonTypeOfTwo(left, right).orElse((left, right) match {
-      case (StringType, t2: AtomicType) if t2 != BinaryType && t2 != BooleanType => Some(StringType)
-      case (t1: AtomicType, StringType) if t1 != BinaryType && t1 != BooleanType => Some(StringType)
-      case _ => None
-    })
-  }
   /**
-   * Find the tightest common type of a set of types by continuously applying
-   * `findTightestCommonTypeOfTwo` on these types.
+   * Promotes all the way to StringType.
    */
-  private def findTightestCommonType(types: Seq[DataType]): Option[DataType] = {
```

--- End diff --

`findTightestCommonType` was removed as it seems used nowhere.
[GitHub] spark pull request #16777: [SPARK-19435][SQL] Type coercion between ArrayTyp...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/16777

[SPARK-19435][SQL] Type coercion between ArrayTypes

## What changes were proposed in this pull request?

This PR proposes to support type coercion between `ArrayType`s where the element types are compatible.

**Before**

```
Seq(Array(1)).toDF("a").selectExpr("greatest(a, array(1D))")
org.apache.spark.sql.AnalysisException: cannot resolve 'greatest(`a`, array(1.0D))' due to data type mismatch: The expressions should all have the same type, got GREATEST(array, array).; line 1 pos 0;

Seq(Array(1)).toDF("a").selectExpr("least(a, array(1D))")
org.apache.spark.sql.AnalysisException: cannot resolve 'least(`a`, array(1.0D))' due to data type mismatch: The expressions should all have the same type, got LEAST(array, array).; line 1 pos 0;

sql("SELECT * FROM values (array(0)), (array(1D)) as data(a)")
org.apache.spark.sql.AnalysisException: incompatible types found in column a for inline table; line 1 pos 14

Seq(Array(1)).toDF("a").union(Seq(Array(1D)).toDF("b"))
org.apache.spark.sql.AnalysisException: Union can only be performed on tables with the compatible column types. ArrayType(DoubleType,false) <> ArrayType(IntegerType,false) at the first column of the second table;;

sql("SELECT IF(1=1, array(1), array(1D))")
org.apache.spark.sql.AnalysisException: cannot resolve '(IF((1 = 1), array(1), array(1.0D)))' due to data type mismatch: differing types in '(IF((1 = 1), array(1), array(1.0D)))' (array and array).; line 1 pos 7;
```

**After**

```scala
Seq(Array(1)).toDF("a").selectExpr("greatest(a, array(1D))")
res5: org.apache.spark.sql.DataFrame = [greatest(a, array(1.0)): array]

Seq(Array(1)).toDF("a").selectExpr("least(a, array(1D))")
res6: org.apache.spark.sql.DataFrame = [least(a, array(1.0)): array]

sql("SELECT * FROM values (array(0)), (array(1D)) as data(a)")
res8: org.apache.spark.sql.DataFrame = [a: array]

Seq(Array(1)).toDF("a").union(Seq(Array(1D)).toDF("b"))
res10: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [a: array]

sql("SELECT IF(1=1, array(1), array(1D))")
res15: org.apache.spark.sql.DataFrame = [(IF((1 = 1), array(1), array(1.0))): array]
```

## How was this patch tested?

Unit tests in `TypeCoercion`, Jenkins tests, and building with Scala 2.10:

```
./dev/change-scala-version.sh 2.10
./build/mvn -Pyarn -Phadoop-2.4 -Dscala-2.10 -DskipTests clean package
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-19435

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16777.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16777

commit bd0d9f790821d8cb9caf77508e6c658fb56222ab
Author: hyukjinkwon
Date: 2017-02-02T10:21:07Z

    Type coercion between ArrayTypes