[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11555 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-195198885 I'm going to merge this PR as it blocks my next work. cc @liancheng I'll address your comments in follow-up PRs if you have any. And thanks @gatorsmile for your review! I'll send a PR to fix the fundamental issue and we can keep discussing there.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-195122428

Got offline input from @ioana-delaney:

> Using subqueries is not common, and it is only used if runtime doesn't support a certain sequence of operations.
> Internally, when projecting columns with the same name coming from different tables, we can use aliases to distinguish among them. That should be the default behavior irrespective of any further optimizations that can be applied to the generated SQL.

Basically, I think we can safely merge this PR and fix the naming ambiguity issues in a separate PR. Thanks!
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-195089117 The generated aliases will change the column names. To keep the original column names, we need another top-level Project that converts the names back.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-195086618 When adding an extra Subquery, we can always detect whether duplicate names exist. If one is found, how about adding another Project with unique Alias names for the columns with duplicate names? BTW, I am still waiting for input from RDBMS experts. Will keep you posted. Thanks!
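The proposal above (detect duplicate column names, then alias them uniquely) can be sketched outside Spark as follows. This is only an illustration under stated assumptions: `DedupAliases`, `uniquify`, `outerSelectList`, and the `name_index` alias scheme are hypothetical and are not taken from the PR.

```scala
// Hypothetical helper, not part of Spark or of this PR.
object DedupAliases {
  // Pair every column name with an alias. Only names that occur more than
  // once receive a generated alias; unique names keep their own name.
  // Assumption: "<name>_<position>" does not collide with an existing column.
  def uniquify(names: Seq[String]): Seq[(String, String)] = {
    val duplicated = names.groupBy(identity).collect {
      case (name, group) if group.size > 1 => name
    }.toSet
    names.zipWithIndex.map { case (name, i) =>
      if (duplicated(name)) (name, s"${name}_$i") else (name, name)
    }
  }

  // Select list for a top-level projection that maps the generated
  // aliases back to the original column names.
  def outerSelectList(pairs: Seq[(String, String)]): String =
    pairs.map { case (original, alias) =>
      if (original == alias) original else s"$alias AS $original"
    }.mkString(", ")
}
```

With the input `Seq("key", "value", "key")`, both `key` columns get distinct aliases inside the subquery, and the outer select list renames them back to `key`.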
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194683067 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194683068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52813/ Test PASSed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194682902 **[Test build #52813 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52813/consoleFull)** for PR 11555 at commit [`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194653055 **[Test build #52813 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52813/consoleFull)** for PR 11555 at commit [`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194652810 retest this please
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194607599 Yeah, this is a fundamental issue. I am afraid we are unable to add any extra subqueries for SQL generation. I will check whether SQL generation in traditional RDBMSs also uses subqueries, and will post the answer in this PR. BTW, I am fine with merging this first.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194598357 Hmm, it's not quite related to SPARK-13720; it's a fundamental bug in the SQL builder infrastructure. How about we merge this PR first and fix it later?
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194596689 Ah, makes sense, thanks for the explanation! I think we need a better fix for SPARK-13720, let me send a separate PR.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194322107

For example, consider the following query:

```scala
sqlContext
  .range(10)
  .select('id as 'key, 'id as 'value)
  .write
  .saveAsTable("test1")

sqlContext
  .range(10)
  .select('id as 'key, 'id as 'value)
  .write
  .saveAsTable("test2")

sql("SELECT sum(a.value) over (ORDER BY a.key), sum(b.value) over (ORDER BY b.key) FROM test1 a JOIN test2 b ON a.key = b.key").explain(true)
```

The plan will look like this:

```
+- Project [value#29L,key#28L,value#31L,key#30L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#35L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#36L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#35L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#36L]
   +- Window [value#29L,key#28L,value#31L,key#30L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#35L], [(sum(value#31L),mode=Complete,isDistinct=false) windowspecdefinition(key#30L ASC, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS windowexpression(sum(value), windowspecdefinition(sortorder(key)))#36L], [key#30L ASC]
      +- Window [value#29L,key#28L,value#31L,key#30L], [(sum(value#29L),mode=Complete,isDistinct=false) windowspecdefinition(key#28L ASC, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS windowexpression(sum(value), windowspecdefinition(sortorder(key)))#35L], [key#28L ASC]
         +- Project [value#29L,key#28L,value#31L,key#30L]
            +- Join Inner, Some((key#28L = key#30L))
               :- SubqueryAlias a
               :  +- SubqueryAlias test1
               :     +- Relation[key#28L,value#29L] ParquetRelation
               +- SubqueryAlias b
                  +- SubqueryAlias test2
                     +- Relation[key#30L,value#31L] ParquetRelation
```
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194204996 @gatorsmile, can you give a more detailed example? Where does the `t` come from?
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194188133

For example, given the following sub-plan:

```
Project a.key, b.key
  Join
```

Assuming we still have multiple operators above this sub-plan and these operators use both `a.key` and `b.key`, we will hit an issue if we add an extra subquery. In SQL generation, both of them will become `t.key`.
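A toy model may make the concern concrete. It is a sketch only: `Attr` and `QualifierCollapse` are hypothetical stand-ins rather than Catalyst classes, and `t` plays the role of the generated subquery alias mentioned in the comment.

```scala
// Toy stand-in for an attribute: a qualifier plus a column name.
final case class Attr(qualifier: String, name: String) {
  def sql: String = s"$qualifier.$name"
}

object QualifierCollapse {
  // Re-qualify every attribute with the alias of the wrapping subquery.
  def requalify(attrs: Seq[Attr], alias: String): Seq[Attr] =
    attrs.map(_.copy(qualifier = alias))

  // References that render to the same SQL text more than once are ambiguous.
  def ambiguous(attrs: Seq[Attr]): Set[String] =
    attrs.groupBy(_.sql).collect {
      case (sql, group) if group.size > 1 => sql
    }.toSet
}
```

Before wrapping, `a.key` and `b.key` are distinct. After `requalify(..., "t")` both render as `t.key`, which is the reference an analyzer would have to reject as ambiguous.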
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194184747

> Now, if we just replace it by the identical subquery name, they will lose the original qualifiers.

I don't think that's true. Every added subquery will have a unique name, so we won't have the same qualifiers from the left and right children of a `Join`.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194183615 BTW, we are having another related discussion in the JIRA: https://issues.apache.org/jira/browse/SPARK-13393. Not sure if you are interested in this. Please feel free to jump in, if you have better ideas. Thanks!
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194181148 @dilipbiswal and I just had an offline discussion about this. Sorry to mention this at the last minute. Adding extra subqueries could be a big issue if the column names are the same but the original qualifiers are different. For example, we can join two tables that have the same column names. Normally, we use different qualifier names to differentiate them. Now, if we just replace them with the identical subquery name, they will lose the original qualifiers. Then, the generated SQL statement will be rejected by the Analyzer due to name ambiguity. We are facing this issue in multiple SQL generation cases. Please correct us if our understanding is wrong. Thanks! @cloud-fan @liancheng
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194174998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52728/ Test PASSed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194174991 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194174404 **[Test build #52728 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52728/consoleFull)** for PR 11555 at commit [`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194139422 **[Test build #52728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52728/consoleFull)** for PR 11555 at commit [`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194122022 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52720/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194122020 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194121910 **[Test build #52720 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52720/consoleFull)** for PR 11555 at commit [`aa0a32b`](https://github.com/apache/spark/commit/aa0a32b30149620978fbdd26485f01982baa6531). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194114531 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194114532 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52718/ Test PASSed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194113999 **[Test build #52718 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52718/consoleFull)** for PR 11555 at commit [`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194107200 **[Test build #52720 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52720/consoleFull)** for PR 11555 at commit [`aa0a32b`](https://github.com/apache/spark/commit/aa0a32b30149620978fbdd26485f01982baa6531).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194094117 **[Test build #52718 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52718/consoleFull)** for PR 11555 at commit [`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55445617

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
            | _: Sample
          ) => plan

-      case plan: Project =>
-        wrapChildWithSubquery(plan)
+      case plan: Project => wrapChildWithSubquery(plan)
+
+      case w @ Window(_, _, _, _,
+          _: SubqueryAlias
+        | _: Filter
+        | _: Join
+        | _: MetastoreRelation
+        | OneRowRelation
+        | _: LocalLimit
+        | _: GlobalLimit
+        | _: Sample
+      ) => w
+
+      case w: Window => wrapChildWithSubquery(w)
     }

-    def wrapChildWithSubquery(project: Project): Project = project match {
-      case Project(projectList, child) =>
-        val alias = SQLBuilder.newSubqueryName
-        val childAttributes = child.outputSet
-        val aliasedProjectList = projectList.map(_.transform {
-          case a: Attribute if childAttributes.contains(a) =>
-            a.withQualifiers(alias :: Nil)
-        }.asInstanceOf[NamedExpression])
+    private def wrapChildWithSubquery(plan: UnaryNode): LogicalPlan = {
+      val newChild = SubqueryAlias(SQLBuilder.newSubqueryName, plan.child)
+      plan.withNewChildren(Seq(newChild))
+    }
+  }

-        Project(aliasedProjectList, SubqueryAlias(alias, child))
+  object UpdateQualifiers extends Rule[LogicalPlan] {
+    override def apply(tree: LogicalPlan): LogicalPlan = tree transformUp {
+      case plan =>
+        val inputAttributes = plan.children.flatMap(_.output)
+        plan transformExpressions {
+          case a: AttributeReference if !plan.producedAttributes.contains(a) =>
--- End diff --

Yeah, but we do not need to add qualifiers for the attributes in `producedAttributes`. Thus, we keep them untouched.
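To illustrate the idea discussed in this comment (re-propagate qualifiers bottom-up and leave attributes that a node produces itself untouched), here is a minimal sketch on a toy plan tree. It does not use Catalyst: `Ref`, `Node`, `Table`, `Subquery`, `Select`, and `Requalify` are hypothetical names, and the integer `id` only imitates the role of an expression ID.

```scala
// Toy stand-ins for plan nodes; `id` imitates an expression ID.
final case class Ref(id: Int, name: String, qualifier: Option[String])

sealed trait Node { def output: Seq[Ref] }

final case class Table(name: String, cols: Seq[Ref]) extends Node {
  def output: Seq[Ref] = cols.map(_.copy(qualifier = Some(name)))
}

final case class Subquery(alias: String, child: Node) extends Node {
  // A subquery alias re-exposes its child's columns under its own name.
  def output: Seq[Ref] = child.output.map(_.copy(qualifier = Some(alias)))
}

final case class Select(refs: Seq[Ref], child: Node) extends Node {
  def output: Seq[Ref] = refs
}

object Requalify {
  // Bottom-up rewrite: fix the child first, then point each reference at
  // the qualifier the rewritten child exposes for the same id. References
  // the child does not output (produced by this node) stay untouched.
  def apply(node: Node): Node = node match {
    case t: Table => t
    case Subquery(alias, child) => Subquery(alias, apply(child))
    case Select(refs, child) =>
      val newChild = apply(child)
      val qualifierById = newChild.output.map(r => r.id -> r.qualifier).toMap
      val updated = refs.map { r =>
        r.copy(qualifier = qualifierById.getOrElse(r.id, r.qualifier))
      }
      Select(updated, newChild)
  }
}
```

In this sketch a reference that still carries the stale qualifier `test1` is rewritten to the wrapping subquery's alias, while a reference the child does not output keeps whatever qualifier it had.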
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55444644

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -297,22 +299,34 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
     )
   }

+  private def windowToSQL(w: Window): String = {
+    build(
+      "SELECT",
+      (w.child.output ++ w.windowExpressions).map(_.sql).mkString(", "),
+      if (w.child == OneRowRelation) "" else "FROM",
+      toSQL(w.child)
+    )
+  }
+
   object Canonicalizer extends RuleExecutor[LogicalPlan] {
     override protected def batches: Seq[Batch] = Seq(
-      Batch("Canonicalizer", FixedPoint(100),
+      Batch("Collapse Project", FixedPoint(100),
         // The `WidenSetOperationTypes` analysis rule may introduce extra `Project`s over
         // `Aggregate`s to perform type casting. This rule merges these `Project`s into
         // `Aggregate`s.
-        CollapseProject,
-
+        CollapseProject),
+      Batch("Recover Scoping Info", Once,
         // Used to handle other auxiliary `Project`s added by analyzer (e.g.
         // `ResolveAggregateFunctions` rule)
-        RecoverScopingInfo
+        AddSubquery,
+        // Previous rule will add extra sub-queries, this rule is used to re-propagate and update
+        // the qualifiers bottom up.
+        UpdateQualifiers
--- End diff --

Thanks! @cloud-fan Maybe it would be good to add an example here?
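The `windowToSQL` helper in the diff above assembles a `SELECT` from string segments. The following standalone sketch mimics that shape with plain strings. It rests on assumptions: `WindowSqlSketch` and `windowSelect` are hypothetical, and the `build` shown here (trim each segment, drop empty ones, join with spaces) is a guess at the behavior of the real helper, which the quoted diff does not show.

```scala
object WindowSqlSketch {
  // Assumed behavior: trim segments, drop empty ones, join with one space.
  def build(segments: String*): String =
    segments.map(_.trim).filter(_.nonEmpty).mkString(" ")

  // childSql = None stands in for the OneRowRelation case (no FROM clause).
  def windowSelect(
      childOutput: Seq[String],
      windowExprs: Seq[String],
      childSql: Option[String]): String =
    build(
      "SELECT",
      (childOutput ++ windowExprs).mkString(", "),
      if (childSql.isEmpty) "" else "FROM",
      childSql.getOrElse("")
    )
}
```

Passing the child's output columns first and the window expressions after them mirrors the order `w.child.output ++ w.windowExpressions` used in the diff.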
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r5543
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -499,12 +520,25 @@ case class CumeDist() extends RowNumberLike with SizeBasedWindowFunction {
     case class NTile(buckets: Expression) extends RowNumberLike with SizeBasedWindowFunction {
    --- End diff --
Ah I see. Thanks!
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55444114
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
                | _: Sample
              ) => plan
    -      case plan: Project =>
    -        wrapChildWithSubquery(plan)
    +      case plan: Project => wrapChildWithSubquery(plan)
    +
    +      case w @ Window(_, _, _, _,
    +          _: SubqueryAlias
    +        | _: Filter
    +        | _: Join
    +        | _: MetastoreRelation
    +        | OneRowRelation
    +        | _: LocalLimit
    +        | _: GlobalLimit
    +        | _: Sample
    +      ) => w
    +
    +      case w: Window => wrapChildWithSubquery(w)
         }
    -    def wrapChildWithSubquery(project: Project): Project = project match {
    -      case Project(projectList, child) =>
    -        val alias = SQLBuilder.newSubqueryName
    -        val childAttributes = child.outputSet
    -        val aliasedProjectList = projectList.map(_.transform {
    -          case a: Attribute if childAttributes.contains(a) =>
    -            a.withQualifiers(alias :: Nil)
    -        }.asInstanceOf[NamedExpression])
    +    private def wrapChildWithSubquery(plan: UnaryNode): LogicalPlan = {
    +      val newChild = SubqueryAlias(SQLBuilder.newSubqueryName, plan.child)
    +      plan.withNewChildren(Seq(newChild))
    +    }
    +  }
    -        Project(aliasedProjectList, SubqueryAlias(alias, child))
    +  object UpdateQualifiers extends Rule[LogicalPlan] {
    +    override def apply(tree: LogicalPlan): LogicalPlan = tree transformUp {
    +      case plan =>
    +        val inputAttributes = plan.children.flatMap(_.output)
    +        plan transformExpressions {
    +          case a: AttributeReference if !plan.producedAttributes.contains(a) =>
    --- End diff --
Sounds like `outputSet` should also contain that kind of `Attribute`?
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-194006304
Thanks @cloud-fan for working on it! Overall, it looks good. It would be great to have more test cases, such as the following:
* Multiple window functions used in a single expression, e.g. `sum(...) OVER (...) / count(...) OVER (...)`.
* An expression that mixes ordinary (non-window) expressions with window functions, e.g. `1 + 2 + Count(...) OVER (...)`.
* A regular aggregate function used together with a window function, e.g. `sum(...) - sum(...) OVER (...)`.
* `ORDER BY` clauses with `ASC` or `DESC` specified.
Also, maybe we are missing some window functions (like `LEAD` and `LAG`)? Supported window functions can be found in https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html.
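For illustration, the query shapes suggested above could be spelled out as concrete SQL strings like the ones below. This is only a sketch: the table `t` and its columns `key` and `value` are hypothetical, the object name is made up, and none of these queries has been checked against the PR's test suite.

```scala
// Hypothetical inputs for a round-trip test; `t`, `key` and `value` are assumed names.
object WindowTestQueries {
  val queries: Seq[String] = Seq(
    // Multiple window functions in a single expression.
    "SELECT SUM(value) OVER (PARTITION BY key) / COUNT(value) OVER (PARTITION BY key) FROM t",
    // Ordinary arithmetic mixed with a window function.
    "SELECT 1 + 2 + COUNT(value) OVER (PARTITION BY key) FROM t",
    // A regular aggregate combined with a window function in a grouped query.
    "SELECT key, SUM(value) - SUM(key) OVER (PARTITION BY key) FROM t GROUP BY key",
    // Explicit sort directions inside the window specification.
    "SELECT key, ROW_NUMBER() OVER (PARTITION BY key ORDER BY value ASC) FROM t",
    "SELECT key, ROW_NUMBER() OVER (PARTITION BY key ORDER BY value DESC) FROM t",
    // LAG, one of the functions the comment suspects is missing.
    "SELECT key, LAG(value, 1) OVER (PARTITION BY key ORDER BY value ASC) FROM t"
  )
}
```

Each string could then be fed to whatever round-trip check the suite already uses for generated SQL.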
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55443323
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -297,22 +299,34 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
             )
           }
    +  private def windowToSQL(w: Window): String = {
    +    build(
    +      "SELECT",
    +      (w.child.output ++ w.windowExpressions).map(_.sql).mkString(", "),
    +      if (w.child == OneRowRelation) "" else "FROM",
    +      toSQL(w.child)
    +    )
    +  }
    +
       object Canonicalizer extends RuleExecutor[LogicalPlan] {
         override protected def batches: Seq[Batch] = Seq(
    -      Batch("Canonicalizer", FixedPoint(100),
    +      Batch("Collapse Project", FixedPoint(100),
             // The `WidenSetOperationTypes` analysis rule may introduce extra `Project`s over
             // `Aggregate`s to perform type casting. This rule merges these `Project`s into
             // `Aggregate`s.
    -        CollapseProject,
    -
    +        CollapseProject),
    +      Batch("Recover Scoping Info", Once,
             // Used to handle other auxiliary `Project`s added by analyzer (e.g.
             // `ResolveAggregateFunctions` rule)
    -        RecoverScopingInfo
    +        AddSubquery,
    +        // Previous rule will add extra sub-queries, this rule is used to re-propagate and update
    +        // the qualifiers bottom up.
    +        UpdateQualifiers
    --- End diff --
https://github.com/cloud-fan/spark/blob/window/sql/hive/src/test/scala/org/apache/spark/sql/hive/LogicalPlanToSQLSuite.scala#L454-L458
The test above verifies this rule. JIRA SPARK-13720 describes why we need to add this rule.
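As a rough illustration of why qualifiers need to be re-propagated after extra sub-queries are added, here is a minimal, self-contained sketch. It is a toy model under stated assumptions, not the PR's code: `Attr`, `Table`, `Project`, `SubqueryAlias`, `addSubquery` and `updateQualifiers` are hypothetical stand-ins for the Catalyst classes and rules, attributes are matched by a plain integer id, and the alias `gen_subquery_0` is just an example name.

```scala
// Toy model only: hypothetical stand-ins for Catalyst's plan and attribute classes.
object QualifierSketch {
  final case class Attr(id: Int, name: String, qualifier: Option[String] = None) {
    def sql: String = qualifier.fold(name)(q => s"$q.$name")
  }

  sealed trait Plan { def output: Seq[Attr] }

  final case class Table(name: String, columns: Seq[Attr]) extends Plan {
    def output: Seq[Attr] = columns.map(_.copy(qualifier = Some(name)))
  }
  final case class SubqueryAlias(alias: String, child: Plan) extends Plan {
    // Everything coming out of a sub-query is visible under the sub-query's alias.
    def output: Seq[Attr] = child.output.map(_.copy(qualifier = Some(alias)))
  }
  final case class Project(projectList: Seq[Attr], child: Plan) extends Plan {
    def output: Seq[Attr] = projectList
  }

  // In the spirit of AddSubquery: wrap the child in a freshly named sub-query.
  def addSubquery(p: Project, alias: String): Project =
    p.copy(child = SubqueryAlias(alias, p.child))

  // In the spirit of UpdateQualifiers: bottom up, re-point each reference at the
  // qualifier its child now exposes, matching attributes by id.
  def updateQualifiers(plan: Plan): Plan = plan match {
    case Project(list, child) =>
      val newChild = updateQualifiers(child)
      val byId = newChild.output.map(a => a.id -> a.qualifier).toMap
      Project(list.map(a => a.copy(qualifier = byId.getOrElse(a.id, a.qualifier))), newChild)
    case SubqueryAlias(alias, child) => SubqueryAlias(alias, updateQualifiers(child))
    case t: Table => t
  }

  val table: Table = Table("t", Seq(Attr(1, "a"), Attr(2, "b")))
  val inner: Project = Project(table.output, table)
  val outer: Project = Project(inner.output, inner)

  // After wrapping, the outer project list still refers to `t`, which is no longer in scope.
  val wrapped: Project = addSubquery(outer, "gen_subquery_0")
  val staleSql: Seq[String] = wrapped.projectList.map(_.sql)

  // Re-propagating qualifiers makes the references point at the new alias.
  val fixedSql: Seq[String] = updateQualifiers(wrapped) match {
    case Project(list, _) => list.map(_.sql)
    case other            => other.output.map(_.sql)
  }
}
```

In this model `staleSql` holds `t.a` and `t.b`, while `fixedSql` holds `gen_subquery_0.a` and `gen_subquery_0.b`, which is the kind of scoping repair the two rules in the `Recover Scoping Info` batch appear to be aiming for.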
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55442915
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
                | _: Sample
              ) => plan
    -      case plan: Project =>
    -        wrapChildWithSubquery(plan)
    +      case plan: Project => wrapChildWithSubquery(plan)
    +
    +      case w @ Window(_, _, _, _,
    +          _: SubqueryAlias
    +        | _: Filter
    +        | _: Join
    +        | _: MetastoreRelation
    +        | OneRowRelation
    +        | _: LocalLimit
    +        | _: GlobalLimit
    +        | _: Sample
    +      ) => w
    +
    +      case w: Window => wrapChildWithSubquery(w)
         }
    -    def wrapChildWithSubquery(project: Project): Project = project match {
    -      case Project(projectList, child) =>
    -        val alias = SQLBuilder.newSubqueryName
    -        val childAttributes = child.outputSet
    -        val aliasedProjectList = projectList.map(_.transform {
    -          case a: Attribute if childAttributes.contains(a) =>
    -            a.withQualifiers(alias :: Nil)
    -        }.asInstanceOf[NamedExpression])
    +    private def wrapChildWithSubquery(plan: UnaryNode): LogicalPlan = {
    +      val newChild = SubqueryAlias(SQLBuilder.newSubqueryName, plan.child)
    +      plan.withNewChildren(Seq(newChild))
    +    }
    +  }
    -        Project(aliasedProjectList, SubqueryAlias(alias, child))
    +  object UpdateQualifiers extends Rule[LogicalPlan] {
    +    override def apply(tree: LogicalPlan): LogicalPlan = tree transformUp {
    +      case plan =>
    +        val inputAttributes = plan.children.flatMap(_.output)
    +        plan transformExpressions {
    +          case a: AttributeReference if !plan.producedAttributes.contains(a) =>
    --- End diff --
`producedAttributes` is the list of attributes that are added by this operator. For example, `Generate` will produce some attributes that do not exist in the child node.
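The distinction described above can be sketched with a small toy model. Everything in it is hypothetical (the `Attr`, `Plan`, `Relation`, `Filter` and `Generate` types here are simplified stand-ins, not the Catalyst classes), and it only assumes what the comment states: `producedAttributes` holds what an operator adds itself, while `outputSet` is everything the operator exposes, including what it forwards from its child.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
object ProducedAttributesSketch {
  final case class Attr(name: String)

  sealed trait Plan {
    def children: Seq[Plan]
    def output: Seq[Attr]
    // Attributes introduced by this operator itself rather than passed up from a child.
    def producedAttributes: Set[Attr] = Set.empty
    def outputSet: Set[Attr] = output.toSet
    def inputSet: Set[Attr] = children.flatMap(_.output).toSet
  }

  // A leaf has no child, so in this model everything it outputs is produced by it.
  final case class Relation(output: Seq[Attr]) extends Plan {
    def children: Seq[Plan] = Nil
    override def producedAttributes: Set[Attr] = outputSet
  }

  // A filter only forwards what its child provides.
  final case class Filter(child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def output: Seq[Attr] = child.output
  }

  // Like the `Generate` case mentioned above: forwards the child's columns and adds new ones.
  final case class Generate(generated: Seq[Attr], child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def output: Seq[Attr] = child.output ++ generated
    override def producedAttributes: Set[Attr] = generated.toSet
  }

  val relation: Plan = Relation(Seq(Attr("line")))
  val filtered: Plan = Filter(relation)
  val exploded: Plan = Generate(Seq(Attr("word")), relation)
}
```

In this model `exploded.outputSet` contains both `line` and `word`, but only `word` is in `exploded.producedAttributes`. That is why a rule that rewrites references coming from children would skip produced attributes: there is no child output to take a qualifier from.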
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55442404
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
                | _: Sample
              ) => plan
    -      case plan: Project =>
    -        wrapChildWithSubquery(plan)
    +      case plan: Project => wrapChildWithSubquery(plan)
    +
    +      case w @ Window(_, _, _, _,
    +          _: SubqueryAlias
    +        | _: Filter
    +        | _: Join
    +        | _: MetastoreRelation
    +        | OneRowRelation
    +        | _: LocalLimit
    +        | _: GlobalLimit
    +        | _: Sample
    +      ) => w
    +
    +      case w: Window => wrapChildWithSubquery(w)
         }
    -    def wrapChildWithSubquery(project: Project): Project = project match {
    -      case Project(projectList, child) =>
    -        val alias = SQLBuilder.newSubqueryName
    -        val childAttributes = child.outputSet
    -        val aliasedProjectList = projectList.map(_.transform {
    -          case a: Attribute if childAttributes.contains(a) =>
    -            a.withQualifiers(alias :: Nil)
    -        }.asInstanceOf[NamedExpression])
    +    private def wrapChildWithSubquery(plan: UnaryNode): LogicalPlan = {
    +      val newChild = SubqueryAlias(SQLBuilder.newSubqueryName, plan.child)
    +      plan.withNewChildren(Seq(newChild))
    +    }
    +  }
    -        Project(aliasedProjectList, SubqueryAlias(alias, child))
    +  object UpdateQualifiers extends Rule[LogicalPlan] {
    +    override def apply(tree: LogicalPlan): LogicalPlan = tree transformUp {
    +      case plan =>
    +        val inputAttributes = plan.children.flatMap(_.output)
    +        plan transformExpressions {
    +          case a: AttributeReference if !plan.producedAttributes.contains(a) =>
    --- End diff --
@gatorsmile Not related to this PR. What is the difference between `producedAttributes` and `outputSet`?
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55442518
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -499,12 +520,25 @@ case class CumeDist() extends RowNumberLike with SizeBasedWindowFunction {
     case class NTile(buckets: Expression) extends RowNumberLike with SizeBasedWindowFunction {
    --- End diff --
It is defined here: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L200-L203
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55442086
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
                | _: Sample
              ) => plan
    -      case plan: Project =>
    -        wrapChildWithSubquery(plan)
    +      case plan: Project => wrapChildWithSubquery(plan)
    +
    +      case w @ Window(_, _, _, _,
    +          _: SubqueryAlias
    +        | _: Filter
    +        | _: Join
    +        | _: MetastoreRelation
    +        | OneRowRelation
    +        | _: LocalLimit
    +        | _: GlobalLimit
    +        | _: Sample
    +      ) => w
    --- End diff --
Add a comment to explain why we need this rule?
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55441966
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -297,22 +299,34 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
             )
           }
    +  private def windowToSQL(w: Window): String = {
    +    build(
    +      "SELECT",
    +      (w.child.output ++ w.windowExpressions).map(_.sql).mkString(", "),
    +      if (w.child == OneRowRelation) "" else "FROM",
    +      toSQL(w.child)
    +    )
    +  }
    +
       object Canonicalizer extends RuleExecutor[LogicalPlan] {
         override protected def batches: Seq[Batch] = Seq(
    -      Batch("Canonicalizer", FixedPoint(100),
    +      Batch("Collapse Project", FixedPoint(100),
             // The `WidenSetOperationTypes` analysis rule may introduce extra `Project`s over
             // `Aggregate`s to perform type casting. This rule merges these `Project`s into
             // `Aggregate`s.
    -        CollapseProject,
    -
    +        CollapseProject),
    +      Batch("Recover Scoping Info", Once,
             // Used to handle other auxiliary `Project`s added by analyzer (e.g.
             // `ResolveAggregateFunctions` rule)
    -        RecoverScopingInfo
    +        AddSubquery,
    +        // Previous rule will add extra sub-queries, this rule is used to re-propagate and update
    +        // the qualifiers bottom up.
    +        UpdateQualifiers
    --- End diff --
Which test covers this new rule?
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55441699
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala ---
    @@ -499,12 +520,25 @@ case class CumeDist() extends RowNumberLike with SizeBasedWindowFunction {
     case class NTile(buckets: Expression) extends RowNumberLike with SizeBasedWindowFunction {
    --- End diff --
Maybe I missed it: where is the `sql` method for `NTile`?
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193985675 LGTM
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193815254 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52665/ Test PASSed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193814907 **[Test build #52665 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52665/consoleFull)** for PR 11555 at commit [`054f50a`](https://github.com/apache/spark/commit/054f50a8661d0d2a20b2924da3815fd13f29568a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193815250 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193756848 **[Test build #52665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52665/consoleFull)** for PR 11555 at commit [`054f50a`](https://github.com/apache/spark/commit/054f50a8661d0d2a20b2924da3815fd13f29568a).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193750275 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52664/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193750264 **[Test build #52664 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52664/consoleFull)** for PR 11555 at commit [`fcd60de`](https://github.com/apache/spark/commit/fcd60dec61dbd3ff9fff7b4d141c2938a476e802). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193750272 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193749839 **[Test build #52664 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52664/consoleFull)** for PR 11555 at commit [`fcd60de`](https://github.com/apache/spark/commit/fcd60dec61dbd3ff9fff7b4d141c2938a476e802).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193725131 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52649/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193725127 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193724658 **[Test build #52649 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52649/consoleFull)** for PR 11555 at commit [`1ebb3c5`](https://github.com/apache/spark/commit/1ebb3c50ee67b3b25c864e551041caca1f8c5751). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193660913 **[Test build #52649 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52649/consoleFull)** for PR 11555 at commit [`1ebb3c5`](https://github.com/apache/spark/commit/1ebb3c50ee67b3b25c864e551041caca1f8c5751).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193616416 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52626/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193616413 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193616299 **[Test build #52626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52626/consoleFull)** for PR 11555 at commit [`c82229a`](https://github.com/apache/spark/commit/c82229a42efec9131652435b9543df81d1feab6c). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193573893 **[Test build #52626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52626/consoleFull)** for PR 11555 at commit [`c82229a`](https://github.com/apache/spark/commit/c82229a42efec9131652435b9543df81d1feab6c).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193562698 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193562700 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52616/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193562358 **[Test build #52616 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52616/consoleFull)** for PR 11555 at commit [`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193534923 **[Test build #52616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52616/consoleFull)** for PR 11555 at commit [`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193533951 retest this please
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193374324 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52569/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193374322 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193373981 **[Test build #52569 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52569/consoleFull)** for PR 11555 at commit [`f968a33`](https://github.com/apache/spark/commit/f968a33870cf4b954d454d4ec1935ac97888de42). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193371071 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193371075 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52571/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193370585 **[Test build #52571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52571/consoleFull)** for PR 11555 at commit [`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193339446 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52566/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193339441 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193339120 **[Test build #52566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52566/consoleFull)** for PR 11555 at commit [`276a870`](https://github.com/apache/spark/commit/276a870dee9d150c35220c391d8d41acd463c314). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-19056 **[Test build #52571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52571/consoleFull)** for PR 11555 at commit [`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193329095 **[Test build #52569 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52569/consoleFull)** for PR 11555 at commit [`f968a33`](https://github.com/apache/spark/commit/f968a33870cf4b954d454d4ec1935ac97888de42).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55227980

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -204,11 +207,70 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
   private def build(segments: String*): String =
     segments.map(_.trim).filter(_.nonEmpty).mkString(" ")

+  /**
+   * Given a seq of qualifiers(names and their corresponding [[AttributeSet]]), transform the given
+   * expression tree, if an [[Attribute]] belongs to one of the [[AttributeSet]]s, update its
+   * qualifier with the corresponding name of the [[AttributeSet]].
+   */
+  private def updateQualifier(
+      expr: Expression,
+      qualifiers: Seq[(String, AttributeSet)]): Expression = {
+    if (qualifiers.isEmpty) {
+      expr
+    } else {
+      expr transform {
+        case a: Attribute =>
+          val index = qualifiers.indexWhere {
+            case (_, inputAttributes) => inputAttributes.contains(a)
+          }
+          if (index == -1) {
+            a
+          } else {
+            a.withQualifiers(qualifiers(index)._1 :: Nil)
+          }
+      }
+    }
+  }
+
+  /**
+   * Finds the outer most [[SubqueryAlias]] nodes in the input logical plan and return their alias
+   * names and outputSet.
+   */
+  private def findOutermostQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
--- End diff --

This is really a good idea! thanks, updated.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193291764 **[Test build #52566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52566/consoleFull)** for PR 11555 at commit [`276a870`](https://github.com/apache/spark/commit/276a870dee9d150c35220c391d8d41acd463c314).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55215536

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -204,11 +207,70 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
   private def build(segments: String*): String =
     segments.map(_.trim).filter(_.nonEmpty).mkString(" ")

+  /**
+   * Given a seq of qualifiers(names and their corresponding [[AttributeSet]]), transform the given
+   * expression tree, if an [[Attribute]] belongs to one of the [[AttributeSet]]s, update its
+   * qualifier with the corresponding name of the [[AttributeSet]].
+   */
+  private def updateQualifier(
+      expr: Expression,
+      qualifiers: Seq[(String, AttributeSet)]): Expression = {
+    if (qualifiers.isEmpty) {
+      expr
+    } else {
+      expr transform {
+        case a: Attribute =>
+          val index = qualifiers.indexWhere {
+            case (_, inputAttributes) => inputAttributes.contains(a)
+          }
+          if (index == -1) {
+            a
+          } else {
+            a.withQualifiers(qualifiers(index)._1 :: Nil)
+          }
+      }
+    }
+  }
+
+  /**
+   * Finds the outer most [[SubqueryAlias]] nodes in the input logical plan and return their alias
+   * names and outputSet.
+   */
+  private def findOutermostQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
--- End diff --

Thanks, I like this one :)
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193277373 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193277378 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52560/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193277070 **[Test build #52560 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52560/consoleFull)** for PR 11555 at commit [`40bd17a`](https://github.com/apache/spark/commit/40bd17a3d35b017d9af240da8a40df7e2998f610). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193257580 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52558/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193257578 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193257343 **[Test build #52558 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52558/consoleFull)** for PR 11555 at commit [`9a66fbb`](https://github.com/apache/spark/commit/9a66fbb756d78c393d2493dea5a8194bae1d61b5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55205891

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -204,11 +207,70 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
   private def build(segments: String*): String =
     segments.map(_.trim).filter(_.nonEmpty).mkString(" ")

+  /**
+   * Given a seq of qualifiers(names and their corresponding [[AttributeSet]]), transform the given
+   * expression tree, if an [[Attribute]] belongs to one of the [[AttributeSet]]s, update its
+   * qualifier with the corresponding name of the [[AttributeSet]].
+   */
+  private def updateQualifier(
+      expr: Expression,
+      qualifiers: Seq[(String, AttributeSet)]): Expression = {
+    if (qualifiers.isEmpty) {
+      expr
+    } else {
+      expr transform {
+        case a: Attribute =>
+          val index = qualifiers.indexWhere {
+            case (_, inputAttributes) => inputAttributes.contains(a)
+          }
+          if (index == -1) {
+            a
+          } else {
+            a.withQualifiers(qualifiers(index)._1 :: Nil)
+          }
+      }
+    }
+  }
+
+  /**
+   * Finds the outer most [[SubqueryAlias]] nodes in the input logical plan and return their alias
+   * names and outputSet.
+   */
+  private def findOutermostQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
--- End diff --

I have another alternative. We face the same issue everywhere we add or remove an extra qualifier. How about adding another rule/batch below the existing `Batch("Canonicalizer")`? For example:

```scala
Batch("Replace Qualifier", Once, ReplaceQualifier)
```

The rule is simple: we can always get the qualifier from the inputSet if we traverse the plan bottom-up. I did not run a full test last night. Below is the code draft:

```scala
object ReplaceQualifier extends Rule[LogicalPlan] {
  override def apply(tree: LogicalPlan): LogicalPlan = tree transformUp {
    case plan => plan transformExpressions {
      case e: AttributeReference => e.withQualifiers(getQualifier(plan.inputSet, e))
    }
  }

  private def getQualifier(inputSet: AttributeSet, e: AttributeReference): Seq[String] = {
    inputSet.collectFirst {
      case a if a.semanticEquals(e) => a.qualifiers
    }.getOrElse(Seq.empty[String])
  }
}
```
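The bottom-up idea in the comment above can be tried without a Spark build. What follows is only a minimal sketch under stated assumptions: `Attr`, `Table`, `Alias`, `Project` and `replaceQualifiers` are hypothetical stand-ins invented for illustration, not Catalyst classes, and attribute identity is modelled as a plain integer id instead of `semanticEquals`.

```scala
object ReplaceQualifierSketch {
  // Hypothetical attribute: `id` plays the role of Catalyst's expression id.
  case class Attr(id: Int, name: String, qualifier: Option[String] = None)

  // Hypothetical plan nodes; each one exposes the attributes it outputs.
  sealed trait Plan { def output: Seq[Attr] }
  case class Table(output: Seq[Attr]) extends Plan
  case class Alias(alias: String, child: Plan) extends Plan {
    // An alias re-qualifies everything its child produces.
    def output: Seq[Attr] = child.output.map(_.copy(qualifier = Some(alias)))
  }
  case class Project(exprs: Seq[Attr], child: Plan) extends Plan {
    def output: Seq[Attr] = exprs
  }

  // Bottom-up rewrite: children are fixed first, then each attribute in a
  // Project takes the qualifier of the matching attribute in its input.
  // An attribute with no match in the input ends up with no qualifier.
  def replaceQualifiers(plan: Plan): Plan = plan match {
    case t: Table => t
    case Alias(alias, child) => Alias(alias, replaceQualifiers(child))
    case Project(exprs, child) =>
      val newChild = replaceQualifiers(child)
      val fixed = exprs.map { e =>
        e.copy(qualifier = newChild.output.find(_.id == e.id).flatMap(_.qualifier))
      }
      Project(fixed, newChild)
  }
}
```

Because children are rewritten before their parent, a parent always reads qualifiers that are already up to date, which is the property the draft relies on when it uses `transformUp`.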
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193243578 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52557/ Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193243571 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193243180 **[Test build #52557 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52557/consoleFull)** for PR 11555 at commit [`e037814`](https://github.com/apache/spark/commit/e037814575535a635938b164cf183c7e8a66ea0b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193241782 **[Test build #52560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52560/consoleFull)** for PR 11555 at commit [`40bd17a`](https://github.com/apache/spark/commit/40bd17a3d35b017d9af240da8a40df7e2998f610).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/11555#discussion_r55198266

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -204,11 +208,55 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
   private def build(segments: String*): String =
     segments.map(_.trim).filter(_.nonEmpty).mkString(" ")

+  private def updateQualifier(
+      expr: Expression,
+      qualifiers: Seq[(String, AttributeSet)]): Expression = {
+    if (qualifiers.isEmpty) {
+      expr
+    } else {
+      expr transform {
+        case a: Attribute =>
+          val index = qualifiers.indexWhere {
+            case (_, inputAttributes) => inputAttributes.contains(a)
+          }
+          if (index == -1) {
+            a
+          } else {
+            a.withQualifiers(qualifiers(index)._1 :: Nil)
+          }
+      }
+    }
+  }
+
+  private def findQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
+    val results = mutable.ArrayBuffer.empty[(String, AttributeSet)]
+    val nodes = mutable.Stack(input)
+
+    while (nodes.nonEmpty) {
+      val node = nodes.pop()
+      node match {
+        case SubqueryAlias(alias, child) => results += alias -> child.outputSet
+        case _ => node.children.foreach(nodes.push)
+      }
+    }
+
+    results.toSeq
+  }
--- End diff --

So this method is basically a DFS search for all the outermost `SubqueryAlias` operators. Maybe the following version is clearer:

```scala
def findOutermostQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
  input.collectFirst {
    case SubqueryAlias(alias, child) => Seq(alias -> child.outputSet)
    case plan => plan.children.flatMap(findOutermostQualifiers)
  }.toSeq.flatten
}
```
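The "outermost `SubqueryAlias`" search discussed above can be illustrated on a toy tree. This is a hedged sketch, not the code that was merged: `Relation`, `SubqueryAlias`, `Join` and `Project` below are hypothetical stand-ins for Catalyst operators, and a plain `Set[String]` of column names stands in for `AttributeSet`.

```scala
object OutermostAliasSketch {
  // Hypothetical stand-ins for logical plan nodes; not Spark's real classes.
  sealed trait Plan {
    def children: Seq[Plan]
    def output: Set[String]
  }
  case class Relation(name: String, output: Set[String]) extends Plan {
    val children: Seq[Plan] = Nil
  }
  case class SubqueryAlias(alias: String, child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
    def output: Set[String] = child.output
  }
  case class Join(left: Plan, right: Plan) extends Plan {
    val children: Seq[Plan] = Seq(left, right)
    def output: Set[String] = left.output ++ right.output
  }
  case class Project(columns: Set[String], child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
    def output: Set[String] = columns
  }

  // Depth-first search that stops descending as soon as it meets an alias,
  // so aliases nested inside another alias are never reported.
  def findOutermostQualifiers(plan: Plan): Seq[(String, Set[String])] = plan match {
    case SubqueryAlias(alias, child) => Seq(alias -> child.output)
    case other => other.children.flatMap(findOutermostQualifiers)
  }
}
```

For a join of two aliased relations, where the left alias wraps a second, inner alias, the search reports only the two outer names in left-to-right order; the inner alias is hidden because recursion stops at its parent.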
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193231302 **[Test build #52558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52558/consoleFull)** for PR 11555 at commit [`9a66fbb`](https://github.com/apache/spark/commit/9a66fbb756d78c393d2493dea5a8194bae1d61b5).