[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-12 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at:

https://github.com/apache/spark/pull/14546


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r74353994
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1257,9 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved &&
+  sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty) =>
+// If there exists ordinal sort orders, it's not resolved 
completely yet. See SPARK-16955.
--- End diff --

Normally, we put the comment above the statement.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r74353789
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -731,6 +731,11 @@ class Analyzer(
 }
 Sort(newOrders, global, child)
 
+  // Ignore the position numbers by removing
--- End diff --

Eliminate the useless position numbers?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r74353698
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -731,6 +731,11 @@ class Analyzer(
 }
 Sort(newOrders, global, child)
 
+  // Ignore the position numbers by removing
--- End diff --

`remove` is a transitive verb. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r74215625
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1252,9 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved &&
+  (conf.orderByOrdinal && sortOrder.forall(x => 
IntegerIndex.unapply(x.child).isEmpty)) =>
--- End diff --

Aha, I see what I missed. You're right. I will fix like that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-10 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r74214115
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1252,9 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved &&
+  (conf.orderByOrdinal && sortOrder.forall(x => 
IntegerIndex.unapply(x.child).isEmpty)) =>
--- End diff --

I have the feeling that this guard is wrong. This disables this entire 
clause if `conf.orderByOrdinal` is `false`. Shouldn't it be: 
`!conf.orderByOrdinal || sortOrder.forall(x => 
IntegerIndex.unapply(x.child).isEmpty)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r73999422
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1252,9 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate)
+if aggregate.resolved && sortOrder.forall(x => 
IntegerIndex.unapply(x.child).isEmpty) =>
--- End diff --

For the `false` case, you meant to check `ResolveAggregateFunctions` 
functionality, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r73998920
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1252,9 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate)
+if aggregate.resolved && sortOrder.forall(x => 
IntegerIndex.unapply(x.child).isEmpty) =>
--- End diff --

Yep. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-08 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r73998693
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1252,9 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate)
+if aggregate.resolved && sortOrder.forall(x => 
IntegerIndex.unapply(x.child).isEmpty) =>
--- End diff --

We have a conf `conf.orderByOrdinal` to control whether the integer values 
are parsed as position. Thus, the current fix ignores this conf. Could you fix 
it? Also added a test case to ensure both options are covered. That is, `true` 
and `false`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-08 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r73997535
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1252,9 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate)
+if aggregate.resolved && sortOrder.forall(x => 
IntegerIndex.unapply(x.child).isEmpty) =>
--- End diff --

Have you tried to move this extra check into `resolved` of `Aggregate` 
operator? Does it work?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r73994581
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1252,8 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate)
--- End diff --

Thank you, @gatorsmile .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-08 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14546#discussion_r73990474
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1252,7 +1252,8 @@ class Analyzer(
   case ae: AnalysisException => filter
 }
 
-  case sort @ Sort(sortOrder, global, aggregate: Aggregate) if 
aggregate.resolved =>
+  case sort @ Sort(sortOrder, global, aggregate: Aggregate)
--- End diff --

Please add a comment here. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...

2016-08-08 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/14546

[SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP BY causes an 
analysis error

## What changes were proposed in this pull request?

Spark supports `ordinal` in GROUP BY and ORDER BY. However, if we use both 
at the same time, it causes exceptions. This PR fixes 
`ResolveAggregateFunctions` rule to handle ordinals, too.

```scala
scala> sql("select a, count(*) from (select 1 as a) tmp group by 1 order by 
1")
org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to 
Group by position: `1` exceeds the size of the select list `0`
```

## How was this patch tested?

Pass the Jenkins with new test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-16955-ORDINAL

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14546.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14546


commit 12f24dc6ab5e32d78f770841176a0a58ecdb3e3f
Author: Dongjoon Hyun 
Date:   2016-08-08T21:40:53Z

[SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP BY causes an 
analysis error




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org