[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/14546 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r74353994 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1257,9 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved && + sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty) => +// If there exists ordinal sort orders, it's not resolved completely yet. See SPARK-16955. --- End diff -- Normally, we put the comment above the statement. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r74353789 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -731,6 +731,11 @@ class Analyzer( } Sort(newOrders, global, child) + // Ignore the position numbers by removing --- End diff -- Eliminate the useless position numbers? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r74353698 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -731,6 +731,11 @@ class Analyzer( } Sort(newOrders, global, child) + // Ignore the position numbers by removing --- End diff -- `remove` is a transitive verb. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r74215625 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1252,9 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved && + (conf.orderByOrdinal && sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty)) => --- End diff -- Aha, I see what I missed. You're right. I will fix like that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r74214115 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1252,9 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved && + (conf.orderByOrdinal && sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty)) => --- End diff -- I have the feeling that this guard is wrong. This disables this entire clause if `conf.orderByOrdinal` is `false`. Shouldn't it be: `!conf.orderByOrdinal || sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty)` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r73999422 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1252,9 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) +if aggregate.resolved && sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty) => --- End diff -- For the `false` case, you meant to check `ResolveAggregateFunctions` functionality, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r73998920 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1252,9 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) +if aggregate.resolved && sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty) => --- End diff -- Yep. :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r73998693 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1252,9 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) +if aggregate.resolved && sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty) => --- End diff -- We have a conf `conf.orderByOrdinal` to control whether the integer values are parsed as position. Thus, the current fix ignores this conf. Could you fix it? Also added a test case to ensure both options are covered. That is, `true` and `false` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r73997535 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1252,9 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) +if aggregate.resolved && sortOrder.forall(x => IntegerIndex.unapply(x.child).isEmpty) => --- End diff -- Have you tried to move this extra check into `resolved` of `Aggregate` operator? Does it work? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r73994581 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1252,8 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) --- End diff -- Thank you, @gatorsmile . --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14546#discussion_r73990474 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1252,7 +1252,8 @@ class Analyzer( case ae: AnalysisException => filter } - case sort @ Sort(sortOrder, global, aggregate: Aggregate) if aggregate.resolved => + case sort @ Sort(sortOrder, global, aggregate: Aggregate) --- End diff -- Please add a comment here. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/14546 [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP BY causes an analysis error ## What changes were proposed in this pull request? Spark supports `ordinal` in GROUP BY and ORDER BY. However, if we use both at the same time, it causes exceptions. This PR fixes `ResolveAggregateFunctions` rule to handle ordinals, too. ```scala scala> sql("select a, count(*) from (select 1 as a) tmp group by 1 order by 1") org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to Group by position: `1` exceeds the size of the select list `0` ``` ## How was this patch tested? Pass the Jenkins with new test cases. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-16955-ORDINAL Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14546.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14546 commit 12f24dc6ab5e32d78f770841176a0a58ecdb3e3f Author: Dongjoon Hyun Date: 2016-08-08T21:40:53Z [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP BY causes an analysis error --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org