[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13418


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13418#discussion_r65609442
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
@@ -123,6 +123,31 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
     )
   }
 
+  test("SPARK-15677: Scalar sub-query in Select list against a DataFrame generated query") {
+    Seq((1, 1), (2, 2)).toDF("c1", "c2").createOrReplaceTempView("t1")
--- End diff --

Please use `withTempTable` (OK, the name is wrong for historical reasons; it
should be `withTempView`), which will drop the view after the test for you.
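
As a rough illustration of why such a helper is convenient, the cleanup it performs is an instance of the loan pattern. The sketch below is not Spark's test utility: `ViewRegistry`, `TempViewSketch`, and this `withTempView` are hypothetical stand-ins that only mimic the "drop the view when the block exits" behaviour, under the assumption that a view is just a named entry in a registry.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a session's temp-view catalog; this is not Spark's API.
object ViewRegistry {
  private val views = mutable.Map.empty[String, Seq[(Int, Int)]]
  def createOrReplace(name: String, rows: Seq[(Int, Int)]): Unit = views.update(name, rows)
  def drop(name: String): Unit = { views.remove(name); () }
  def exists(name: String): Boolean = views.contains(name)
}

object TempViewSketch {
  // Loan pattern: run the body, then always drop the named views,
  // even if the body throws.
  def withTempView(names: String*)(body: => Unit): Unit =
    try body finally names.foreach(n => ViewRegistry.drop(n))

  // Registers a view inside the block and reports whether it was visible there.
  def demo(): Boolean = {
    var visibleInside = false
    withTempView("t1") {
      ViewRegistry.createOrReplace("t1", Seq((1, 1), (2, 2)))
      visibleInside = ViewRegistry.exists("t1")
    }
    visibleInside
  }
}
```

With a helper of this shape, the test body does not need its own cleanup code, and a failing assertion inside the block cannot leak the view into later tests.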





[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13418#discussion_r65609083
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
@@ -123,6 +123,31 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
     )
   }
 
+  test("SPARK-15677: Scalar sub-query in Select list against a DataFrame generated query") {
--- End diff --

Maybe we should mention that this bug only occurs with a local relation?





[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13418#discussion_r65594703
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
@@ -121,6 +123,16 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
         " where key = (select max(key) from subqueryData) - 1)"),
       Array(Row("two"))
     )
+
+    checkAnswer(
--- End diff --

I think it's better to create a new test case for it.





[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13418#discussion_r65484867
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1468,10 +1468,15 @@ object DecimalAggregates extends Rule[LogicalPlan] {
   */
 object ConvertToLocalRelation extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
-    case Project(projectList, LocalRelation(output, data)) =>
+    case p @ Project(projectList, LocalRelation(output, data))
+        if !p.expressions.exists(hasUnevaluableExpr) =>
--- End diff --

`p.expressions` is just the `projectList`, so the guard can test
`projectList` directly.





[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13418#discussion_r65455758
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1468,7 +1468,8 @@ object DecimalAggregates extends Rule[LogicalPlan] {
   */
 object ConvertToLocalRelation extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
-    case Project(projectList, LocalRelation(output, data)) =>
+    case p @ Project(projectList, LocalRelation(output, data))
+        if !p.expressions.exists(ScalarSubquery.hasScalarSubquery) =>
--- End diff --

+1 to catching `Unevaluable` and special-casing `AttributeReference`.





[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-01 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/13418#discussion_r65441184
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1468,7 +1468,8 @@ object DecimalAggregates extends Rule[LogicalPlan] {
   */
 object ConvertToLocalRelation extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
-    case Project(projectList, LocalRelation(output, data)) =>
+    case p @ Project(projectList, LocalRelation(output, data))
+        if !p.expressions.exists(ScalarSubquery.hasScalarSubquery) =>
--- End diff --

I think AttributeReference is the only exception: it is replaced with a
BoundReference when a Projection is created, so we could add a special case
for it.
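
To make the trade-off concrete, here is a minimal sketch of the two candidate guards. It assumes a toy expression model: the `Expression`, `Unevaluable`, `AttributeReference`, `ScalarSubquery`, `Literal`, and `Add` types below only borrow Catalyst's names and are not Spark's classes, and the body of `hasUnevaluableExpr` is a guess at the shape of the check, not the code in the pull request.

```scala
// Hypothetical, simplified expression tree; the names mirror Catalyst's,
// but none of this is Spark code.
sealed trait Expression {
  def children: Seq[Expression]
  // Pre-order search for the first node that satisfies the predicate.
  def find(p: Expression => Boolean): Option[Expression] =
    if (p(this)) Some(this)
    else children.iterator.map(_.find(p)).collectFirst { case Some(e) => e }
}

// Marker for nodes that cannot be evaluated directly.
sealed trait Unevaluable extends Expression

case class Literal(value: Int) extends Expression {
  def children: Seq[Expression] = Nil
}
case class Add(left: Expression, right: Expression) extends Expression {
  def children: Seq[Expression] = Seq(left, right)
}
case class AttributeReference(name: String) extends Unevaluable {
  def children: Seq[Expression] = Nil
}
case class ScalarSubquery(sql: String) extends Unevaluable {
  def children: Seq[Expression] = Nil
}

object LocalRelationGuard {
  // Naive guard: any Unevaluable node blocks the rewrite,
  // which also rejects a plain column reference.
  def hasAnyUnevaluable(e: Expression): Boolean =
    e.find(_.isInstanceOf[Unevaluable]).isDefined

  // Refined guard: AttributeReference is exempt because it gets bound
  // before evaluation; every other Unevaluable node still blocks the rewrite.
  def hasUnevaluableExpr(e: Expression): Boolean =
    e.find {
      case _: AttributeReference => false
      case _: Unevaluable => true
      case _ => false
    }.isDefined
}
```

In this model a projection of just `AttributeReference("c1")` is rejected by the naive guard but accepted by the refined one, while any expression containing a `ScalarSubquery` is rejected by both.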





[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-01 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request:

https://github.com/apache/spark/pull/13418#discussion_r65438111
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala ---
@@ -84,6 +84,13 @@ object ScalarSubquery {
       case _ => false
     }.isDefined
   }
+
+  def hasScalarSubquery(e: Expression): Boolean = {
+    e.find {
--- End diff --

@rxin Thank you for the review. I aligned my code with the existing
implementation, but I can replace the method call with your suggestion.





[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-01 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request:

https://github.com/apache/spark/pull/13418#discussion_r65437967
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1468,7 +1468,8 @@ object DecimalAggregates extends Rule[LogicalPlan] {
   */
 object ConvertToLocalRelation extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
-    case Project(projectList, LocalRelation(output, data)) =>
+    case p @ Project(projectList, LocalRelation(output, data))
+        if !p.expressions.exists(ScalarSubquery.hasScalarSubquery) =>
--- End diff --

@davies Sorry for the delay in replying. I am new to the Spark code. I've
looked at Unevaluable expressions. My finding is that checking for
Unevaluable expressions would be too general, since many expressions mix in
this trait; AttributeReference is one of them. If we explicitly check for
Unevaluable expressions, a simple query of the form "select c1 from t1"
would regress. Let me know if I misunderstood your requirement. Thanks.

