[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173473258
  
Really appreciate your reviews!!! : ) 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173293382
  
The latest merge is for resolving the conflicts.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-17298
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173332908
  
**[Test build #49792 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49792/consoleFull)**
 for PR 10577 at commit 
[`c18381e`](https://github.com/apache/spark/commit/c18381e53c41666acc3149072a0c56a62fa556fe).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Union(children: Seq[LogicalPlan]) extends LogicalPlan `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-17301
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49792/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173362144
  
@rxin @marmbrus , could you please review the latest changes? Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173391875
  
Thanks - I'm going to merge this.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10577


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173384017
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173300904
  
**[Test build #49792 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49792/consoleFull)**
 for PR 10577 at commit 
[`c18381e`](https://github.com/apache/spark/commit/c18381e53c41666acc3149072a0c56a62fa556fe).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r50175266
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -219,34 +222,62 @@ object HiveTypeCoercion {
 def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
   case p if p.analyzed => p
 
-  case s @ SetOperation(left, right) if s.childrenResolved
-  && left.output.length == right.output.length && !s.resolved =>
+  case s @ SetOperation(left, right) if s.childrenResolved &&
+  left.output.length == right.output.length && !s.resolved =>
+val newChildren: Seq[LogicalPlan] = 
buildNewChildrenWithWiderTypes(left :: right :: Nil)
+assert(newChildren.length == 2)
+s.makeCopy(Array(newChildren.head, newChildren.last))
 
-// Tracks the list of data types to widen.
-// Some(dataType) means the right-hand side and the left-hand side 
have different types,
-// and there is a target type to widen both sides to.
-val targetTypes: Seq[Option[DataType]] = 
left.output.zip(right.output).map {
-  case (lhs, rhs) if lhs.dataType != rhs.dataType =>
-findWiderTypeForTwo(lhs.dataType, rhs.dataType)
-  case other => None
-}
+  case s: Union if s.childrenResolved &&
+  s.children.forall(_.output.length == 
s.children.head.output.length) && !s.resolved =>
+val newChildren: Seq[LogicalPlan] = 
buildNewChildrenWithWiderTypes(s.children)
+s.makeCopy(Array(newChildren))
+}
 
-if (targetTypes.exists(_.isDefined)) {
-  // There is at least one column to widen.
-  s.makeCopy(Array(widenTypes(left, targetTypes), 
widenTypes(right, targetTypes)))
-} else {
-  // If we cannot find any column to widen, then just return the 
original set.
-  s
-}
+/** Build new children with the widest types for each attribute among 
all the children */
+private def buildNewChildrenWithWiderTypes(children: 
Seq[LogicalPlan]): Seq[LogicalPlan] = {
+  require(children.forall(_.output.length == 
children.head.output.length))
+
+  // Get a sequence of data types, each of which is the widest type of 
this specific attribute
+  // in all the children
+  val targetTypes: Seq[DataType] =
+getWidestTypes(children, attrIndex = 0, mutable.Queue[DataType]())
+
+  if (targetTypes.nonEmpty) {
+// Add an extra Project if the targetTypes are different from the 
original types.
+children.map(widenTypes(_, targetTypes))
+  } else {
+// Unable to find a target type to widen, then just return the 
original set.
+children
+  }
+}
+
+/** Get the widest type for each attribute in all the children */
+@tailrec private def getWidestTypes(
+children: Seq[LogicalPlan],
+attrIndex: Int,
+castedTypes: mutable.Queue[DataType]): Seq[DataType] = {
+  // Return the result after the widen data types have been found for 
all the children
+  if (attrIndex >= children.head.output.length) return 
castedTypes.toSeq
+
+  // For the attrIndex-th attribute, find the widest type
+  findWiderCommonType(children.map(_.output(attrIndex).dataType)) 
match {
+// If unable to find an appropriate widen type for this column, 
return an empty Seq
+case None => Seq.empty[DataType]
+// Otherwise, record the result in the queue and find the type for 
the next column
+case Some(widenType) =>
+  castedTypes.enqueue(widenType)
+  getWidestTypes(children, attrIndex + 1, castedTypes)
+  }
 }
 
 /** Given a plan, add an extra project on top to widen some columns' 
data types. */
-private def widenTypes(plan: LogicalPlan, targetTypes: 
Seq[Option[DataType]]): LogicalPlan = {
+private def widenTypes(plan: LogicalPlan, targetTypes: Seq[DataType]): 
LogicalPlan = {
   val casted = plan.output.zip(targetTypes).map {
-case (e, Some(dt)) if e.dataType != dt => Alias(Cast(e, dt), 
e.name)()
+case (e, dt) if e.dataType != dt => Alias(Cast(e, dt), e.name)()
 case (e, _) => e
   }
-  Project(casted, plan)
+  if (casted.exists(_.isInstanceOf[Alias])) Project(casted, plan) else 
plan
--- End diff --

Sure, let me remove it. : )


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please

[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172940341
  
LGTM except one minor comment


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r50150694
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -219,34 +222,62 @@ object HiveTypeCoercion {
 def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
   case p if p.analyzed => p
 
-  case s @ SetOperation(left, right) if s.childrenResolved
-  && left.output.length == right.output.length && !s.resolved =>
+  case s @ SetOperation(left, right) if s.childrenResolved &&
+  left.output.length == right.output.length && !s.resolved =>
+val newChildren: Seq[LogicalPlan] = 
buildNewChildrenWithWiderTypes(left :: right :: Nil)
+assert(newChildren.length == 2)
+s.makeCopy(Array(newChildren.head, newChildren.last))
 
-// Tracks the list of data types to widen.
-// Some(dataType) means the right-hand side and the left-hand side 
have different types,
-// and there is a target type to widen both sides to.
-val targetTypes: Seq[Option[DataType]] = 
left.output.zip(right.output).map {
-  case (lhs, rhs) if lhs.dataType != rhs.dataType =>
-findWiderTypeForTwo(lhs.dataType, rhs.dataType)
-  case other => None
-}
+  case s: Union if s.childrenResolved &&
+  s.children.forall(_.output.length == 
s.children.head.output.length) && !s.resolved =>
+val newChildren: Seq[LogicalPlan] = 
buildNewChildrenWithWiderTypes(s.children)
+s.makeCopy(Array(newChildren))
+}
 
-if (targetTypes.exists(_.isDefined)) {
-  // There is at least one column to widen.
-  s.makeCopy(Array(widenTypes(left, targetTypes), 
widenTypes(right, targetTypes)))
-} else {
-  // If we cannot find any column to widen, then just return the 
original set.
-  s
-}
+/** Build new children with the widest types for each attribute among 
all the children */
+private def buildNewChildrenWithWiderTypes(children: 
Seq[LogicalPlan]): Seq[LogicalPlan] = {
+  require(children.forall(_.output.length == 
children.head.output.length))
+
+  // Get a sequence of data types, each of which is the widest type of 
this specific attribute
+  // in all the children
+  val targetTypes: Seq[DataType] =
+getWidestTypes(children, attrIndex = 0, mutable.Queue[DataType]())
+
+  if (targetTypes.nonEmpty) {
+// Add an extra Project if the targetTypes are different from the 
original types.
+children.map(widenTypes(_, targetTypes))
+  } else {
+// Unable to find a target type to widen, then just return the 
original set.
+children
+  }
+}
+
+/** Get the widest type for each attribute in all the children */
+@tailrec private def getWidestTypes(
+children: Seq[LogicalPlan],
+attrIndex: Int,
+castedTypes: mutable.Queue[DataType]): Seq[DataType] = {
+  // Return the result after the widen data types have been found for 
all the children
+  if (attrIndex >= children.head.output.length) return 
castedTypes.toSeq
+
+  // For the attrIndex-th attribute, find the widest type
+  findWiderCommonType(children.map(_.output(attrIndex).dataType)) 
match {
+// If unable to find an appropriate widen type for this column, 
return an empty Seq
+case None => Seq.empty[DataType]
+// Otherwise, record the result in the queue and find the type for 
the next column
+case Some(widenType) =>
+  castedTypes.enqueue(widenType)
+  getWidestTypes(children, attrIndex + 1, castedTypes)
+  }
 }
 
 /** Given a plan, add an extra project on top to widen some columns' 
data types. */
-private def widenTypes(plan: LogicalPlan, targetTypes: 
Seq[Option[DataType]]): LogicalPlan = {
+private def widenTypes(plan: LogicalPlan, targetTypes: Seq[DataType]): 
LogicalPlan = {
   val casted = plan.output.zip(targetTypes).map {
-case (e, Some(dt)) if e.dataType != dt => Alias(Cast(e, dt), 
e.name)()
+case (e, dt) if e.dataType != dt => Alias(Cast(e, dt), e.name)()
 case (e, _) => e
   }
-  Project(casted, plan)
+  if (casted.exists(_.isInstanceOf[Alias])) Project(casted, plan) else 
plan
--- End diff --

no need to do this optimization, the `Optimizer` is smart enough to remove 
this unnecessary `Projection`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173021608
  
**[Test build #49722 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49722/consoleFull)**
 for PR 10577 at commit 
[`c63f237`](https://github.com/apache/spark/commit/c63f237b6de61c700cec3544820674c1dcb94334).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173036085
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49722/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173036081
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173073042
  
The test failure was caused by a PR, which has been reverted. 

retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173035940
  
**[Test build #49722 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49722/consoleFull)**
 for PR 10577 at commit 
[`c63f237`](https://github.com/apache/spark/commit/c63f237b6de61c700cec3544820674c1dcb94334).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173093961
  
retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173094840
  
**[Test build #49760 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49760/consoleFull)**
 for PR 10577 at commit 
[`c63f237`](https://github.com/apache/spark/commit/c63f237b6de61c700cec3544820674c1dcb94334).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173116680
  
**[Test build #49760 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49760/consoleFull)**
 for PR 10577 at commit 
[`c63f237`](https://github.com/apache/spark/commit/c63f237b6de61c700cec3544820674c1dcb94334).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173116859
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49760/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-173116857
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r50055366
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -27,6 +30,7 @@ import org.apache.spark.sql.types._
 
 
 /**
+<<< HEAD
--- End diff --

need to resolve this conflict


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r50059783
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -27,6 +30,7 @@ import org.apache.spark.sql.types._
 
 
 /**
+<<< HEAD
--- End diff --

Yeah... Forgot it. Done! : )


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172696176
  
**[Test build #49634 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49634/consoleFull)**
 for PR 10577 at commit 
[`f112026`](https://github.com/apache/spark/commit/f11202646e2686ce1b60608d941c2baf4ac12572).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Union(children: Seq[LogicalPlan]) extends LogicalPlan `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172709381
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49644/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172678732
  
@rxin Thank you for your trust! I am not sure if my new changes follow the 
same way you did. It is a little bit different if we have more than two 
children. Please let me know if I can simplify it further. Thanks! 

@cloud-fan Maybe you need to review it again. Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172680219
  
**[Test build #49634 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49634/consoleFull)**
 for PR 10577 at commit 
[`f112026`](https://github.com/apache/spark/commit/f11202646e2686ce1b60608d941c2baf4ac12572).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172691276
  
**[Test build #49644 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49644/consoleFull)**
 for PR 10577 at commit 
[`4f71741`](https://github.com/apache/spark/commit/4f717410527094cfe028492e82bcad2fe749ee8d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172696361
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172696362
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49634/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172708908
  
**[Test build #49644 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49644/consoleFull)**
 for PR 10577 at commit 
[`4f71741`](https://github.com/apache/spark/commit/4f717410527094cfe028492e82bcad2fe749ee8d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172709376
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-17 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172453483
  
@rxin @marmbrus Could you check if this fix is OK? Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172176512
  
**[Test build #49525 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49525/consoleFull)**
 for PR 10577 at commit 
[`3041864`](https://github.com/apache/spark/commit/3041864da350b862bd9dc04565572e283d295a5b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172173700
  
Sorry, just replied the kind of `hidden` comment. See the answer just after 
the comment. : ) 

Will first update the codes to address your latest two comments. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172182176
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172182177
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49525/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172234334
  
**[Test build #49528 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49528/consoleFull)**
 for PR 10577 at commit 
[`3041864`](https://github.com/apache/spark/commit/3041864da350b862bd9dc04565572e283d295a5b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172217405
  
retest this please. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172240480
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49934992
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
@@ -393,8 +393,8 @@ 
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
 overwrite)
   }
 
-  // If there are multiple INSERTS just UNION them together into on 
query.
-  val query = queries.reduceLeft(Union)
+  // If there are multiple INSERTS just UNION them together into one 
query.
+  val query = if (queries.length == 1) queries.head else Union(queries)
--- End diff --

ah makes sense, let's keep it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172293472
  
Thank you! @cloud-fan Have a good long weekend! 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172218228
  
**[Test build #49528 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49528/consoleFull)**
 for PR 10577 at commit 
[`3041864`](https://github.com/apache/spark/commit/3041864da350b862bd9dc04565572e283d295a5b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172234459
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49528/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172234454
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49912654
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,28 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(mutable.Stack(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plans: mutable.Stack[LogicalPlan],
+  children: Seq[LogicalPlan]): Seq[LogicalPlan] = {
+if (plans.isEmpty) children
+else {
+  plans.pop match {
+case Union(grandchildren) =>
+  grandchildren.reverseMap(plans.push(_))
--- End diff --

Using a stack is for doing a depth-first tree traversal. For example, the 
users might expect the order of unions in the following two cases should be the 
same? Or they might not care it? 
```
val unionQuery1 = Union(Union(testRelation, testRelation2), 
testRelation)
val unionQuery2 = Union(testRelation, Union(testRelation2, 
testRelation))
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49912748
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -200,41 +203,70 @@ object HiveTypeCoercion {
*/
   object WidenSetOperationTypes extends Rule[LogicalPlan] {
 
-private[this] def widenOutputTypes(
-planName: String,
-left: LogicalPlan,
-right: LogicalPlan): (LogicalPlan, LogicalPlan) = {
-  require(left.output.length == right.output.length)
+private def widenOutputTypes(children: Seq[LogicalPlan]): 
Seq[LogicalPlan] = {
+  require(children.forall(_.output.length == 
children.head.output.length))
 
-  val castedTypes = left.output.zip(right.output).map {
-case (lhs, rhs) if lhs.dataType != rhs.dataType =>
-  findWiderTypeForTwo(lhs.dataType, rhs.dataType)
-case other => None
-  }
+  // Get a sequence of data types, each of which is the widest type of 
this specific attribute
+  // in all the children
+  val castedTypes: Seq[DataType] =
+getCastedTypes(children, attrIndex = 0, mutable.Queue[DataType]())
 
-  def castOutput(plan: LogicalPlan): LogicalPlan = {
-val casted = plan.output.zip(castedTypes).map {
-  case (e, Some(dt)) if e.dataType != dt =>
-Alias(Cast(e, dt), e.name)()
-  case (e, _) => e
-}
-Project(casted, plan)
+  // Add extra Project for type promotion if necessary
+  if (castedTypes.isEmpty) children else children.map(castOutput(_, 
castedTypes))
+}
+
+// Add Project if the data types do not match
+private def castOutput(
+plan: LogicalPlan,
+castedTypes: Seq[DataType]): LogicalPlan = {
+  val casted = plan.output.zip(castedTypes).map {
+case (e, dt) if e.dataType != dt =>
+  Alias(Cast(e, dt), e.name)()
+case (e, _) => e
   }
+  if (casted.exists(_.isInstanceOf[Alias])) Project(casted, plan) else 
plan
+}
 
-  if (castedTypes.exists(_.isDefined)) {
-(castOutput(left), castOutput(right))
-  } else {
-(left, right)
+// Get the widest type for each attribute in all the children
+@tailrec private def getCastedTypes(
+children: Seq[LogicalPlan],
+attrIndex: Int,
+castedTypes: mutable.Queue[DataType]): Seq[DataType] = {
+  // Return the result after the widen data types have been found for 
all the children
+  if (attrIndex >= children.head.output.length) return 
castedTypes.toSeq
+
+  // For the attrIndex-th attribute, find the widest type
+  val initialType = Option(children.head.output(attrIndex).dataType)
+  children.foldLeft(initialType) { (currentOutputDataTypes, child) =>
--- End diff --

Yeah, that is a good idea. Now the code becomes 
`indWiderCommonType(children.map(_.output(attrIndex).dataType))`. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49912939
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -129,10 +129,13 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 conditionSQL = condition.sql
   } yield s"$childSQL $whereOrHaving $conditionSQL"
 
-case Union(left, right) =>
+case Union(children) if children.length == 1 =>
+  toSQL(children.head)
--- End diff --

We still need it for handling the last child of union since another case 
for union is dependent on it. Not sure if this is the best way to handle it. I 
added a comment in the code to explain this issue. Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49912958
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
@@ -123,6 +115,40 @@ case class Except(left: LogicalPlan, right: 
LogicalPlan) extends SetOperation(le
   override def output: Seq[Attribute] = left.output
 }
 
+/** Factory for constructing new `Union` nodes. */
+object Union {
+  def apply(left: LogicalPlan, right: LogicalPlan): Union = {
+Union (left :: right :: Nil)
+  }
+}
+
+case class Union(children: Seq[LogicalPlan]) extends LogicalPlan {
+
+  // updating nullability to make all the children consistent
+  override def output: Seq[Attribute] =
+children.map(_.output).transpose.map(attrs =>
+  attrs.head.withNullability(attrs.exists(_.nullable)))
+
+  override lazy val resolved: Boolean = {
+// allChildrenCompatible needs to be evaluated after childrenResolved
+lazy val allChildrenCompatible: Boolean =
--- End diff --

Done. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49912966
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala 
---
@@ -98,6 +98,20 @@ class DataFrameSuite extends QueryTest with 
SharedSQLContext {
   testData.collect().toSeq)
   }
 
+  test ("union all") {
--- End diff --

Done. : )


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172113839
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172113840
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49490/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172114207
  
**[Test build #49492 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49492/consoleFull)**
 for PR 10577 at commit 
[`b88bdeb`](https://github.com/apache/spark/commit/b88bdeb70e3f9a24f1b9136801ebf59a61093dd2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49915485
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -129,12 +129,16 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 conditionSQL = condition.sql
   } yield s"$childSQL $whereOrHaving $conditionSQL"
 
-case Union(left, right) =>
+case Union(children) if children.length > 1 =>
--- End diff --

`SQLBuilder` is run for analyzed plan, so it's guaranteed `Union` has more 
than 1 children right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172134262
  
**[Test build #49492 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49492/consoleFull)**
 for PR 10577 at commit 
[`b88bdeb`](https://github.com/apache/spark/commit/b88bdeb70e3f9a24f1b9136801ebf59a61093dd2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172122102
  
a kind of hidden comment: 
https://github.com/apache/spark/pull/10577/files#r49784836


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172134415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-172134417
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49492/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49915364
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,28 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(mutable.Stack(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plans: mutable.Stack[LogicalPlan],
+  children: Seq[LogicalPlan]): Seq[LogicalPlan] = {
+if (plans.isEmpty) children
+else {
+  plans.pop match {
+case Union(grandchildren) =>
+  grandchildren.reverseMap(plans.push(_))
--- End diff --

order do matters, you are right


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49915984
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -129,12 +129,16 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 conditionSQL = condition.sql
   } yield s"$childSQL $whereOrHaving $conditionSQL"
 
-case Union(left, right) =>
+case Union(children) if children.length > 1 =>
   for {
-leftSQL <- toSQL(left)
-rightSQL <- toSQL(right)
+leftSQL <- toSQL(children.head)
+// When children.tail only has one child, we will go to the next 
case to get rid of Union.
+rightSQL <- toSQL(Union(children.tail))
   } yield s"$leftSQL UNION ALL $rightSQL"
--- End diff --

how about
```
val childrenSql = children.map(toSQL)
if (childrenSql.exists(_.isEmpty)) {
  None
} else {
  childrenSql.map(_.get).mkString("UNION ALL")
}
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49915065
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -200,41 +203,62 @@ object HiveTypeCoercion {
*/
   object WidenSetOperationTypes extends Rule[LogicalPlan] {
 
-private[this] def widenOutputTypes(
-planName: String,
-left: LogicalPlan,
-right: LogicalPlan): (LogicalPlan, LogicalPlan) = {
-  require(left.output.length == right.output.length)
+private def widenOutputTypes(children: Seq[LogicalPlan]): 
Seq[LogicalPlan] = {
+  require(children.forall(_.output.length == 
children.head.output.length))
 
-  val castedTypes = left.output.zip(right.output).map {
-case (lhs, rhs) if lhs.dataType != rhs.dataType =>
-  findWiderTypeForTwo(lhs.dataType, rhs.dataType)
-case other => None
-  }
+  // Get a sequence of data types, each of which is the widest type of 
this specific attribute
+  // in all the children
+  val castedTypes: Seq[DataType] =
+getCastedTypes(children, attrIndex = 0, mutable.Queue[DataType]())
 
-  def castOutput(plan: LogicalPlan): LogicalPlan = {
-val casted = plan.output.zip(castedTypes).map {
-  case (e, Some(dt)) if e.dataType != dt =>
-Alias(Cast(e, dt), e.name)()
-  case (e, _) => e
-}
-Project(casted, plan)
+  // Add extra Project for type promotion if necessary
+  if (castedTypes.isEmpty) children else children.map(castOutput(_, 
castedTypes))
+}
+
+// Add Project if the data types do not match
+private def castOutput(
+plan: LogicalPlan,
+castedTypes: Seq[DataType]): LogicalPlan = {
+  val casted = plan.output.zip(castedTypes).map {
+case (e, dt) if e.dataType != dt =>
+  Alias(Cast(e, dt), e.name)()
+case (e, _) => e
   }
+  if (casted.exists(_.isInstanceOf[Alias])) Project(casted, plan) else 
plan
+}
 
-  if (castedTypes.exists(_.isDefined)) {
-(castOutput(left), castOutput(right))
-  } else {
-(left, right)
+// Get the widest type for each attribute in all the children
+@tailrec private def getCastedTypes(
+children: Seq[LogicalPlan],
+attrIndex: Int,
+castedTypes: mutable.Queue[DataType]): Seq[DataType] = {
+  // Return the result after the widen data types have been found for 
all the children
+  if (attrIndex >= children.head.output.length) return 
castedTypes.toSeq
+
+  // For the attrIndex-th attribute, find the widest type
+  findWiderCommonType(children.map(_.output(attrIndex).dataType)) 
match {
+// If unable to find an appropriate widen type for this column, 
return an empty Seq
+case None => Seq.empty[DataType]
+// Otherwise, record the result in the queue and find the type for 
the next column
+case Some(widenType) =>
+  castedTypes.enqueue(widenType)
+  getCastedTypes (children, attrIndex + 1, castedTypes)
--- End diff --

nit: no space before `(`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-15 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171913477
  
@cloud-fan Really THANK YOU for your review! Will do the changes. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171575054
  
**[Test build #49391 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49391/consoleFull)**
 for PR 10577 at commit 
[`e8e19a1`](https://github.com/apache/spark/commit/e8e19a16eea93806036178746f704573a23ce8a7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Union(children: Seq[LogicalPlan]) extends LogicalPlan `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171575322
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49391/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171575320
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49798629
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,28 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(mutable.Stack(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plans: mutable.Stack[LogicalPlan],
+  children: Seq[LogicalPlan]): Seq[LogicalPlan] = {
+if (plans.isEmpty) children
+else {
+  plans.pop match {
+case Union(grandchildren) =>
+  grandchildren.reverseMap(plans.push(_))
--- End diff --

why not just `foreach`? Does the order of `push` matter here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49799520
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala 
---
@@ -98,6 +98,20 @@ class DataFrameSuite extends QueryTest with 
SharedSQLContext {
   testData.collect().toSeq)
   }
 
+  test ("union all") {
--- End diff --

no space after `test`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49798849
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,28 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(mutable.Stack(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plans: mutable.Stack[LogicalPlan],
+  children: Seq[LogicalPlan]): Seq[LogicalPlan] = {
+if (plans.isEmpty) children
+else {
+  plans.pop match {
+case Union(grandchildren) =>
+  grandchildren.reverseMap(plans.push(_))
--- End diff --

instead of reverse here, why not just use `Queue`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49798994
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
@@ -123,6 +115,40 @@ case class Except(left: LogicalPlan, right: 
LogicalPlan) extends SetOperation(le
   override def output: Seq[Attribute] = left.output
 }
 
+/** Factory for constructing new `Union` nodes. */
+object Union {
+  def apply(left: LogicalPlan, right: LogicalPlan): Union = {
+Union (left :: right :: Nil)
+  }
+}
+
+case class Union(children: Seq[LogicalPlan]) extends LogicalPlan {
+
+  // updating nullability to make all the children consistent
+  override def output: Seq[Attribute] =
+children.map(_.output).transpose.map(attrs =>
+  attrs.head.withNullability(attrs.exists(_.nullable)))
+
+  override lazy val resolved: Boolean = {
+// allChildrenCompatible needs to be evaluated after childrenResolved
+lazy val allChildrenCompatible: Boolean =
--- End diff --

oh, then let's use `def` here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171810333
  
LGTM except some minor comments, thanks for working on it!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49799631
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -129,10 +129,13 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 conditionSQL = condition.sql
   } yield s"$childSQL $whereOrHaving $conditionSQL"
 
-case Union(left, right) =>
+case Union(children) if children.length == 1 =>
+  toSQL(children.head)
--- End diff --

this branch is not needed, `Union` will always have more than one child


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49789952
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1171,6 +1172,15 @@ object EliminateSubQueries extends Rule[LogicalPlan] 
{
 }
 
 /**
+  * Removes [[Union]] operators from the plan if it just has one child.
+  */
+object EliminateUnions extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case Union(children) if children.size == 1 => children.head
--- End diff --

Thats actually only going to match when the `Seq` is explicitly a `List`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49784836
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
@@ -393,8 +393,8 @@ 
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
 overwrite)
   }
 
-  // If there are multiple INSERTS just UNION them together into on 
query.
-  val query = queries.reduceLeft(Union)
+  // If there are multiple INSERTS just UNION them together into one 
query.
+  val query = if (queries.length == 1) queries.head else Union(queries)
--- End diff --

now we will eliminate one-child `Union` during analysis, so it's ok to just 
return `Union` here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49784995
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1171,6 +1172,15 @@ object EliminateSubQueries extends Rule[LogicalPlan] 
{
 }
 
 /**
+  * Removes [[Union]] operators from the plan if it just has one child.
+  */
+object EliminateUnions extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case Union(children) if children.size == 1 => children.head
--- End diff --

nit: can be `case Union(child :: Nil) => child`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49790507
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -200,41 +203,70 @@ object HiveTypeCoercion {
*/
   object WidenSetOperationTypes extends Rule[LogicalPlan] {
 
-private[this] def widenOutputTypes(
-planName: String,
-left: LogicalPlan,
-right: LogicalPlan): (LogicalPlan, LogicalPlan) = {
-  require(left.output.length == right.output.length)
+private def widenOutputTypes(children: Seq[LogicalPlan]): 
Seq[LogicalPlan] = {
+  require(children.forall(_.output.length == 
children.head.output.length))
 
-  val castedTypes = left.output.zip(right.output).map {
-case (lhs, rhs) if lhs.dataType != rhs.dataType =>
-  findWiderTypeForTwo(lhs.dataType, rhs.dataType)
-case other => None
-  }
+  // Get a sequence of data types, each of which is the widest type of 
this specific attribute
+  // in all the children
+  val castedTypes: Seq[DataType] =
+getCastedTypes(children, attrIndex = 0, mutable.Queue[DataType]())
 
-  def castOutput(plan: LogicalPlan): LogicalPlan = {
-val casted = plan.output.zip(castedTypes).map {
-  case (e, Some(dt)) if e.dataType != dt =>
-Alias(Cast(e, dt), e.name)()
-  case (e, _) => e
-}
-Project(casted, plan)
+  // Add extra Project for type promotion if necessary
+  if (castedTypes.isEmpty) children else children.map(castOutput(_, 
castedTypes))
+}
+
+// Add Project if the data types do not match
+private def castOutput(
+plan: LogicalPlan,
+castedTypes: Seq[DataType]): LogicalPlan = {
+  val casted = plan.output.zip(castedTypes).map {
+case (e, dt) if e.dataType != dt =>
+  Alias(Cast(e, dt), e.name)()
+case (e, _) => e
   }
+  if (casted.exists(_.isInstanceOf[Alias])) Project(casted, plan) else 
plan
+}
 
-  if (castedTypes.exists(_.isDefined)) {
-(castOutput(left), castOutput(right))
-  } else {
-(left, right)
+// Get the widest type for each attribute in all the children
+@tailrec private def getCastedTypes(
+children: Seq[LogicalPlan],
+attrIndex: Int,
+castedTypes: mutable.Queue[DataType]): Seq[DataType] = {
+  // Return the result after the widen data types have been found for 
all the children
+  if (attrIndex >= children.head.output.length) return 
castedTypes.toSeq
+
+  // For the attrIndex-th attribute, find the widest type
+  val initialType = Option(children.head.output(attrIndex).dataType)
+  children.foldLeft(initialType) { (currentOutputDataTypes, child) =>
--- End diff --

is this just `findWiderCommonType(child.output.map(_.dataType))`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49629259
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
@@ -123,6 +115,39 @@ case class Except(left: LogicalPlan, right: 
LogicalPlan) extends SetOperation(le
   override def output: Seq[Attribute] = left.output
 }
 
+/** Factory for constructing new `Union` nodes. */
+object Union {
+  def apply(left: LogicalPlan, right: LogicalPlan): Union = {
+Union (left :: right :: Nil)
+  }
+}
+
+case class Union(children: Seq[LogicalPlan]) extends LogicalPlan {
+
+  // updating nullability to make all the children consistent
+  override def output: Seq[Attribute] =
+children.map(_.output).transpose.map(attrs =>
+  attrs.head.withNullability(attrs.exists(_.nullable)))
+
+  override lazy val resolved: Boolean = {
+lazy val allChildrenCompatible: Boolean =
+  children.tail.forall( child =>
+// compare the attribute number with the first child
+child.output.length == children.head.output.length &&
+// compare the data types with the first child
+child.output.zip(children.head.output).forall {
+  case (l, r) => l.dataType == r.dataType }
+  )
+
+childrenResolved && allChildrenCompatible
--- End diff --

how about also make sure `children.nonEmpty`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49629751
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
@@ -123,6 +115,39 @@ case class Except(left: LogicalPlan, right: 
LogicalPlan) extends SetOperation(le
   override def output: Seq[Attribute] = left.output
 }
 
+/** Factory for constructing new `Union` nodes. */
+object Union {
+  def apply(left: LogicalPlan, right: LogicalPlan): Union = {
+Union (left :: right :: Nil)
+  }
+}
+
+case class Union(children: Seq[LogicalPlan]) extends LogicalPlan {
+
+  // updating nullability to make all the children consistent
+  override def output: Seq[Attribute] =
+children.map(_.output).transpose.map(attrs =>
+  attrs.head.withNullability(attrs.exists(_.nullable)))
+
+  override lazy val resolved: Boolean = {
+lazy val allChildrenCompatible: Boolean =
+  children.tail.forall( child =>
+// compare the attribute number with the first child
+child.output.length == children.head.output.length &&
+// compare the data types with the first child
+child.output.zip(children.head.output).forall {
+  case (l, r) => l.dataType == r.dataType }
+  )
+
+childrenResolved && allChildrenCompatible
--- End diff --

Sure, will do it. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49630147
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SetOperationPushDownSuite.scala
 ---
@@ -30,42 +30,67 @@ class SetOperationPushDownSuite extends PlanTest {
   Batch("Subqueries", Once,
 EliminateSubQueries) ::
   Batch("Union Pushdown", Once,
+CombineUnions,
 SetOperationPushDown,
 SimplifyFilters) :: Nil
   }
 
   val testRelation = LocalRelation('a.int, 'b.int, 'c.int)
   val testRelation2 = LocalRelation('d.int, 'e.int, 'f.int)
-  val testUnion = Union(testRelation, testRelation2)
+  val testRelation3 = LocalRelation('g.int, 'h.int, 'i.int)
+  val testUnion = Union(testRelation :: testRelation2 :: testRelation3 :: 
Nil)
   val testIntersect = Intersect(testRelation, testRelation2)
   val testExcept = Except(testRelation, testRelation2)
 
-  test("union/intersect/except: filter to each side") {
-val unionQuery = testUnion.where('a === 1)
+  test("union: combine unions into one unions") {
--- End diff --

I think it's not kind of push-down test, how about we create a new test 
suite for `Union`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49631656
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,27 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(ArrayBuffer(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plan: ArrayBuffer[LogicalPlan],
--- End diff --

We can just use `push/pop` or `enqueue/dequeue` instead of `tail`, although 
it's not so functional...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49635574
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala 
---
@@ -98,6 +98,20 @@ class DataFrameSuite extends QueryTest with 
SharedSQLContext {
   testData.collect().toSeq)
   }
 
+  test ("union all") {
+val unionDF = testData.unionAll(testData).unionAll(testData)
+  .unionAll(testData).unionAll(testData)
+
+// Before optimizer, Union should be combined.
+assert(unionDF.queryExecution.analyzed.collect {
+  case j @ Union(Seq(_, _, _, _, _)) => j }.size === 1)
--- End diff --

Sure, will change it. Actually, this is for checking if it has five 
children. We can explicitly check the number of children. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49630735
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,27 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(ArrayBuffer(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plan: ArrayBuffer[LogicalPlan],
--- End diff --

Just make sure you pick something that is not expensive to do tail.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49629931
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,27 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(ArrayBuffer(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plan: ArrayBuffer[LogicalPlan],
--- End diff --

using a `Queue` or `Stack` here maybe more efficient?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49630313
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SetOperationPushDownSuite.scala
 ---
@@ -30,42 +30,67 @@ class SetOperationPushDownSuite extends PlanTest {
   Batch("Subqueries", Once,
 EliminateSubQueries) ::
   Batch("Union Pushdown", Once,
+CombineUnions,
 SetOperationPushDown,
 SimplifyFilters) :: Nil
   }
 
   val testRelation = LocalRelation('a.int, 'b.int, 'c.int)
   val testRelation2 = LocalRelation('d.int, 'e.int, 'f.int)
-  val testUnion = Union(testRelation, testRelation2)
+  val testRelation3 = LocalRelation('g.int, 'h.int, 'i.int)
+  val testUnion = Union(testRelation :: testRelation2 :: testRelation3 :: 
Nil)
   val testIntersect = Intersect(testRelation, testRelation2)
   val testExcept = Except(testRelation, testRelation2)
 
-  test("union/intersect/except: filter to each side") {
-val unionQuery = testUnion.where('a === 1)
+  test("union: combine unions into one unions") {
--- End diff --

oh, maybe just remove this test suite to `SetOperationSuite`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49630582
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,27 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(ArrayBuffer(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plan: ArrayBuffer[LogicalPlan],
--- End diff --

What I proposed was `Queue`, but the suggestion from @rxin is 
`ArrayBuffer`. Both are OK for me. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49630546
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala 
---
@@ -98,6 +98,20 @@ class DataFrameSuite extends QueryTest with 
SharedSQLContext {
   testData.collect().toSeq)
   }
 
+  test ("union all") {
+val unionDF = testData.unionAll(testData).unionAll(testData)
+  .unionAll(testData).unionAll(testData)
+
+// Before optimizer, Union should be combined.
+assert(unionDF.queryExecution.analyzed.collect {
+  case j @ Union(Seq(_, _, _, _, _)) => j }.size === 1)
--- End diff --

could be just `case j: Union => j`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49630650
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SetOperationPushDownSuite.scala
 ---
@@ -30,42 +30,67 @@ class SetOperationPushDownSuite extends PlanTest {
   Batch("Subqueries", Once,
 EliminateSubQueries) ::
   Batch("Union Pushdown", Once,
+CombineUnions,
 SetOperationPushDown,
 SimplifyFilters) :: Nil
   }
 
   val testRelation = LocalRelation('a.int, 'b.int, 'c.int)
   val testRelation2 = LocalRelation('d.int, 'e.int, 'f.int)
-  val testUnion = Union(testRelation, testRelation2)
+  val testRelation3 = LocalRelation('g.int, 'h.int, 'i.int)
+  val testUnion = Union(testRelation :: testRelation2 :: testRelation3 :: 
Nil)
   val testIntersect = Intersect(testRelation, testRelation2)
   val testExcept = Except(testRelation, testRelation2)
 
-  test("union/intersect/except: filter to each side") {
-val unionQuery = testUnion.where('a === 1)
+  test("union: combine unions into one unions") {
--- End diff --

ok, sure, will rename it. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49635154
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,27 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(ArrayBuffer(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plan: ArrayBuffer[LogicalPlan],
--- End diff --

Got it. Will do the change. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49632415
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
@@ -170,17 +173,27 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
   }
 }
 
+
 /**
  * A pattern that collects all adjacent unions and returns their children 
as a Seq.
  */
 object Unions {
   def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-case u: Union => Some(collectUnionChildren(u))
+case u: Union => Some(collectUnionChildren(ArrayBuffer(u), 
Seq.empty[LogicalPlan]))
 case _ => None
   }
 
-  private def collectUnionChildren(plan: LogicalPlan): Seq[LogicalPlan] = 
plan match {
-case Union(l, r) => collectUnionChildren(l) ++ collectUnionChildren(r)
-case other => other :: Nil
+  @tailrec
+  private def collectUnionChildren(
+  plan: ArrayBuffer[LogicalPlan],
--- End diff --

I think that's fine. Actually most of the time writing functional code for 
the sake of writing functional code is a bad idea.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49635884
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
@@ -123,6 +115,39 @@ case class Except(left: LogicalPlan, right: 
LogicalPlan) extends SetOperation(le
   override def output: Seq[Attribute] = left.output
 }
 
+/** Factory for constructing new `Union` nodes. */
+object Union {
+  def apply(left: LogicalPlan, right: LogicalPlan): Union = {
+Union (left :: right :: Nil)
+  }
+}
+
+case class Union(children: Seq[LogicalPlan]) extends LogicalPlan {
+
+  // updating nullability to make all the children consistent
+  override def output: Seq[Attribute] =
+children.map(_.output).transpose.map(attrs =>
+  attrs.head.withNullability(attrs.exists(_.nullable)))
+
+  override lazy val resolved: Boolean = {
+lazy val allChildrenCompatible: Boolean =
+  children.tail.forall( child =>
+// compare the attribute number with the first child
+child.output.length == children.head.output.length &&
+// compare the data types with the first child
+child.output.zip(children.head.output).forall {
+  case (l, r) => l.dataType == r.dataType }
+  )
+
+childrenResolved && allChildrenCompatible
--- End diff --

If we do it at the analysis phase, it can simplify the code. We also can do 
it again at the optimizer phase if we knew some children will return an empty 
set. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49631075
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
@@ -123,6 +115,39 @@ case class Except(left: LogicalPlan, right: 
LogicalPlan) extends SetOperation(le
   override def output: Seq[Attribute] = left.output
 }
 
+/** Factory for constructing new `Union` nodes. */
+object Union {
+  def apply(left: LogicalPlan, right: LogicalPlan): Union = {
+Union (left :: right :: Nil)
+  }
+}
+
+case class Union(children: Seq[LogicalPlan]) extends LogicalPlan {
+
+  // updating nullability to make all the children consistent
+  override def output: Seq[Attribute] =
+children.map(_.output).transpose.map(attrs =>
+  attrs.head.withNullability(attrs.exists(_.nullable)))
+
+  override lazy val resolved: Boolean = {
+lazy val allChildrenCompatible: Boolean =
+  children.tail.forall( child =>
+// compare the attribute number with the first child
+child.output.length == children.head.output.length &&
+// compare the data types with the first child
+child.output.zip(children.head.output).forall {
+  case (l, r) => l.dataType == r.dataType }
+  )
+
+childrenResolved && allChildrenCompatible
--- End diff --

I think we should ensure `children.length > 1`, it doesn't make sense to 
have a one-child `Union`, we should remove the wrapping union at analysis phase.

cc @marmbrus @rxin 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49694420
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
@@ -123,6 +115,39 @@ case class Except(left: LogicalPlan, right: 
LogicalPlan) extends SetOperation(le
   override def output: Seq[Attribute] = left.output
 }
 
+/** Factory for constructing new `Union` nodes. */
+object Union {
+  def apply(left: LogicalPlan, right: LogicalPlan): Union = {
+Union (left :: right :: Nil)
+  }
+}
+
+case class Union(children: Seq[LogicalPlan]) extends LogicalPlan {
+
+  // updating nullability to make all the children consistent
+  override def output: Seq[Attribute] =
+children.map(_.output).transpose.map(attrs =>
+  attrs.head.withNullability(attrs.exists(_.nullable)))
+
+  override lazy val resolved: Boolean = {
+lazy val allChildrenCompatible: Boolean =
--- End diff --

Tried it. It has to be evaluated after `childrenResolved`. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171560242
  
In this update, the major changes include
  - Added a new analyzer rule `EliminateUnions` for remove the union with a 
single child.
  - Rewrote the function `widenOutputTypes`. Now, it stops if finding the 
first column that is not type-coerce-able among children.
  - Modified the function `collectUnionChildren` by replacing `ArrayBuffer` 
by `Stack`. 
  - Addressed the other comments and fixed the style issues.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171560133
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49389/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171560132
  
Build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10577#issuecomment-171560553
  
**[Test build #49391 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49391/consoleFull)**
 for PR 10577 at commit 
[`e8e19a1`](https://github.com/apache/spark/commit/e8e19a16eea93806036178746f704573a23ce8a7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49551487
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
@@ -123,6 +115,39 @@ case class Except(left: LogicalPlan, right: 
LogicalPlan) extends SetOperation(le
   override def output: Seq[Attribute] = left.output
 }
 
+/** Factory for constructing new `Union` nodes. */
+object Union {
+  def apply(left: LogicalPlan, right: LogicalPlan): Union = {
+Union (left :: right :: Nil)
+  }
+}
+
+case class Union(children: Seq[LogicalPlan]) extends LogicalPlan {
+
+  // updating nullability to make all the children consistent
+  override def output: Seq[Attribute] =
+children.map(_.output).transpose.map(attrs =>
+  attrs.head.withNullability(attrs.exists(_.nullable)))
+
+  override lazy val resolved: Boolean = {
+lazy val allChildrenCompatible: Boolean =
--- End diff --

Sure, will try it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49551474
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 ---
@@ -189,6 +189,15 @@ trait CheckAnalysis {
s"but the left table has ${left.output.length} columns and 
the right has " +
s"${right.output.length}")
 
+  case s: Union if s.children.exists(_.output.length != 
s.children.head.output.length) =>
+s.children.filter(_.output.length != 
s.children.head.output.length).exists { child =>
--- End diff --

Thank you, will fix it!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12616] [SQL] Making Logical Operator `U...

2016-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/10577#discussion_r49551543
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -200,41 +200,60 @@ object HiveTypeCoercion {
*/
   object WidenSetOperationTypes extends Rule[LogicalPlan] {
 
-private[this] def widenOutputTypes(
-planName: String,
-left: LogicalPlan,
-right: LogicalPlan): (LogicalPlan, LogicalPlan) = {
-  require(left.output.length == right.output.length)
-
-  val castedTypes = left.output.zip(right.output).map {
-case (lhs, rhs) if lhs.dataType != rhs.dataType =>
-  findWiderTypeForTwo(lhs.dataType, rhs.dataType)
-case other => None
+private def widenOutputTypes(children: Seq[LogicalPlan]): 
Seq[LogicalPlan] = {
+  require(children.forall(_.output.length == 
children.head.output.length))
+
+  // Get a sequence of data types, each of which is the widest type of 
this specific attribute
+  // in all the children
+  val castedTypes: Seq[Option[DataType]] = {
--- End diff --

Sure, will make the change. Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   >