[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-10 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11555


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-10 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-195198885
  
I'm going to merge this PR because it blocks my next piece of work. cc @liancheng:
if you have any comments, I'll address them in follow-up PRs.
And thanks @gatorsmile for your review! I'll send a PR to fix the
fundamental issue and we can keep discussing there.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-10 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-195122428
  
I got offline input from @ioana-delaney:
> Using subqueries is not common, and it is only used if runtime doesn't 
support a certain sequence of operations. 
>  Internally, when projecting columns with the same name coming from 
different tables, we can use aliases to distinguish among them. That should be 
the default behavior irrespective of any further optimizations that can be 
applied to the generated SQL.

Basically, I think we can safely merge this PR and fix the naming ambiguity
issues in a separate PR. Thanks!



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-10 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-195089117
  
Generating aliases will change the column names. To keep the original
column names, we need another top-level Project that renames them back.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-10 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-195086618
  
When adding an extra Subquery, we can always detect whether duplicate names
exist. If any are found, how about adding another Project with unique Alias
names for the columns with duplicate names?
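
To make the suggestion above concrete, here is a minimal sketch in plain Scala. It is not code from this PR and does not use Catalyst classes; the `UniqueAliases` object and the `name_index` naming scheme are assumptions made only for illustration. It assigns a fresh alias to every column whose name occurs more than once and leaves the other columns untouched. The returned pairs keep the original name, so a top-level Project could rename the columns back.

```scala
object UniqueAliases {
  // Hypothetical naming scheme: the thread does not say how aliases would be generated.
  // Returns (originalName, alias) pairs; names that occur only once keep their name.
  def assign(names: Seq[String]): Seq[(String, String)] = {
    // Collect every name that appears more than once in the project list.
    val duplicated: Set[String] =
      names.groupBy(identity).collect { case (n, ns) if ns.size > 1 => n }.toSet

    // Suffix duplicated names with their position so that each alias is distinct.
    names.zipWithIndex.map { case (n, i) =>
      if (duplicated(n)) (n, s"${n}_$i") else (n, n)
    }
  }
}
```

Note that this sketch does not guard against a generated alias colliding with an existing column name (for example, a real column called `key_0`); a real implementation would need a globally unique counter or a reserved prefix.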

BTW, I am still waiting for input from RDBMS experts. I will keep you
posted. Thanks!



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194683067
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194683068
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52813/
Test PASSed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194682902
  
**[Test build #52813 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52813/consoleFull)**
 for PR 11555 at commit 
[`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194653055
  
**[Test build #52813 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52813/consoleFull)**
 for PR 11555 at commit 
[`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194652810
  
retest this please



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194607599
  
Yeah, this is a fundamental issue. I am afraid we cannot add any extra
subqueries during SQL generation. I will check whether SQL generation in
traditional RDBMSs also uses subqueries, and will post the answer in this
PR.

BTW, I am fine with merging this first.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194598357
  
Hmm, it's not really related to SPARK-13720; it's a fundamental bug in the SQL
builder infrastructure. How about we merge this PR first and fix it later?



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194596689
  
Ah, makes sense, thanks for the explanation!
I think we need a better fix for SPARK-13720, let me send a separate PR.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194322107
  
For example, the following query
```scala
sqlContext
  .range(10)
  .select('id as 'key, 'id as 'value)
  .write
  .saveAsTable("test1")
sqlContext
  .range(10)
  .select('id as 'key, 'id as 'value)
  .write
  .saveAsTable("test2")
sql(
  "SELECT sum(a.value) over (ORDER BY a.key), sum(b.value) over (ORDER BY b.key) " +
  "FROM test1 a JOIN test2 b ON a.key = b.key"
).explain(true)
```

The plan will look like this:
```
+- Project [value#29L,key#28L,value#31L,key#30L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#35L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#36L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#35L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#36L]
   +- Window [value#29L,key#28L,value#31L,key#30L,windowexpression(sum(value), windowspecdefinition(sortorder(key)))#35L], [(sum(value#31L),mode=Complete,isDistinct=false) windowspecdefinition(key#30L ASC, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS windowexpression(sum(value), windowspecdefinition(sortorder(key)))#36L], [key#30L ASC]
      +- Window [value#29L,key#28L,value#31L,key#30L], [(sum(value#29L),mode=Complete,isDistinct=false) windowspecdefinition(key#28L ASC, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS windowexpression(sum(value), windowspecdefinition(sortorder(key)))#35L], [key#28L ASC]
         +- Project [value#29L,key#28L,value#31L,key#30L]
            +- Join Inner, Some((key#28L = key#30L))
               :- SubqueryAlias a
               :  +- SubqueryAlias test1
               :     +- Relation[key#28L,value#29L] ParquetRelation
               +- SubqueryAlias b
                  +- SubqueryAlias test2
                     +- Relation[key#30L,value#31L] ParquetRelation
```



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194204996
  
@gatorsmile, can you give a more detailed example? Where does the `t` come
from?



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194188133
  
For example, given the following sub-plan:
```
Project a.key, b.key 
   Join 
```
Assuming there are multiple operators above this sub-plan and these
operators use both `a.key` and `b.key`, we will hit an issue if we add an
extra subquery. In the generated SQL, both columns will become `t.key`.
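
The collision described above can be reproduced with a toy model. This is only a sketch in plain Scala: `Col`, `requalify`, and `ambiguous` are hypothetical names, not Catalyst classes or anything in `SQLBuilder`. It assumes that wrapping a plan in a subquery named `t` makes every column seen above it carry the qualifier `t`.

```scala
// Toy model, not Spark's Catalyst classes: a column reference is a qualifier plus a name.
final case class Col(qualifier: String, name: String) {
  def sql: String = s"$qualifier.$name"
}

object QualifierCollision {
  // Hypothetical helper: after wrapping a plan in a subquery called `alias`,
  // every column visible above the subquery carries that single qualifier.
  def requalify(cols: Seq[Col], alias: String): Seq[Col] =
    cols.map(_.copy(qualifier = alias))

  // Rendered references that occur more than once cannot be told apart.
  def ambiguous(cols: Seq[Col]): Set[String] =
    cols.groupBy(_.sql).collect { case (s, cs) if cs.size > 1 => s }.toSet

  // Output of `Project a.key, b.key` over the join.
  val joinOutput: Seq[Col] = Seq(Col("a", "key"), Col("b", "key"))

  // The same output as seen through an added subquery named `t`.
  val wrapped: Seq[Col] = requalify(joinOutput, "t")
}
```

Before wrapping, `a.key` and `b.key` are distinct; after wrapping, `ambiguous(wrapped)` reports `t.key`, which is the reference the operators above the subquery can no longer resolve uniquely.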



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194184747
  
>> Now, if we just replace it by the identical subquery name, they will 
lose the original qualifiers.

I don't think that's true. Every added subquery will have a unique name, so
the left and right children of a `Join` won't end up with the same qualifier.
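
For illustration, unique subquery names can be produced with a simple counter. This is a sketch under assumptions: the diff in this PR calls `SQLBuilder.newSubqueryName`, but its body is not shown in this thread, so the `SubqueryNames` object and the `gen_subquery_` prefix below are hypothetical.

```scala
import java.util.concurrent.atomic.AtomicLong

object SubqueryNames {
  // Process-wide counter; AtomicLong keeps the generated names unique across threads.
  private val nextId = new AtomicLong(0)

  // Hypothetical prefix: every call returns a name that has not been handed out before.
  def fresh(): String = s"gen_subquery_${nextId.getAndIncrement()}"
}
```

Unique names keep two separately wrapped children apart. They do not help when a single added subquery sits above both sides of a join, because both sides then share that one name.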



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194183615
  
BTW, we are having another related discussion in the JIRA: 
https://issues.apache.org/jira/browse/SPARK-13393. 

Not sure if you are interested in this. Please feel free to jump in, if you 
have better ideas. Thanks!



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194181148
  
@dilipbiswal and I just had an offline discussion about this. Sorry to
mention this at the last minute.

Adding extra subqueries could be a big issue if the column names are the
same but the original qualifiers are different. For example, we can join two
tables that have the same column names. Normally, we use different qualifier
names to differentiate them. If we now replace those qualifiers with one
identical subquery name, the columns lose their original qualifiers. The
generated SQL statement will then be rejected by the Analyzer due to name
ambiguity.

We are facing this issue in multiple SQL generation cases. Please correct
us if our understanding is wrong. Thanks! @cloud-fan @liancheng




[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194174998
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52728/
Test PASSed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194174991
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-09 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194174404
  
**[Test build #52728 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52728/consoleFull)**
 for PR 11555 at commit 
[`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194139422
  
**[Test build #52728 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52728/consoleFull)**
 for PR 11555 at commit 
[`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194122022
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52720/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194122020
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194121910
  
**[Test build #52720 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52720/consoleFull)**
 for PR 11555 at commit 
[`aa0a32b`](https://github.com/apache/spark/commit/aa0a32b30149620978fbdd26485f01982baa6531).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194114531
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194114532
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52718/
Test PASSed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194113999
  
**[Test build #52718 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52718/consoleFull)**
 for PR 11555 at commit 
[`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194107200
  
**[Test build #52720 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52720/consoleFull)**
 for PR 11555 at commit 
[`aa0a32b`](https://github.com/apache/spark/commit/aa0a32b30149620978fbdd26485f01982baa6531).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194094117
  
**[Test build #52718 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52718/consoleFull)**
 for PR 11555 at commit 
[`dab7a2f`](https://github.com/apache/spark/commit/dab7a2f1a5cc0438405b0fa1cf532ab883bed7e7).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55445617
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
 | _: Sample
 ) => plan
 
-case plan: Project =>
-  wrapChildWithSubquery(plan)
+case plan: Project => wrapChildWithSubquery(plan)
+
+case w @ Window(_, _, _, _,
+  _: SubqueryAlias
+| _: Filter
+| _: Join
+| _: MetastoreRelation
+| OneRowRelation
+| _: LocalLimit
+| _: GlobalLimit
+| _: Sample
+) => w
+
+case w: Window => wrapChildWithSubquery(w)
   }
 
-  def wrapChildWithSubquery(project: Project): Project = project match {
-case Project(projectList, child) =>
-  val alias = SQLBuilder.newSubqueryName
-  val childAttributes = child.outputSet
-  val aliasedProjectList = projectList.map(_.transform {
-case a: Attribute if childAttributes.contains(a) =>
-  a.withQualifiers(alias :: Nil)
-  }.asInstanceOf[NamedExpression])
+  private def wrapChildWithSubquery(plan: UnaryNode): LogicalPlan = {
+val newChild = SubqueryAlias(SQLBuilder.newSubqueryName, plan.child)
+plan.withNewChildren(Seq(newChild))
+  }
+}
 
-  Project(aliasedProjectList, SubqueryAlias(alias, child))
+object UpdateQualifiers extends Rule[LogicalPlan] {
+  override def apply(tree: LogicalPlan): LogicalPlan = tree transformUp {
+case plan =>
+  val inputAttributes = plan.children.flatMap(_.output)
+  plan transformExpressions {
+case a: AttributeReference if !plan.producedAttributes.contains(a) =>
--- End diff --

Yeah, but we do not need to add qualifiers for the attributes in 
`producedAttributes`.  Thus, we keep them untouched.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55444644
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -297,22 +299,34 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 )
   }
 
+  private def windowToSQL(w: Window): String = {
+build(
+  "SELECT",
+  (w.child.output ++ w.windowExpressions).map(_.sql).mkString(", "),
+  if (w.child == OneRowRelation) "" else "FROM",
+  toSQL(w.child)
+)
+  }
+
   object Canonicalizer extends RuleExecutor[LogicalPlan] {
 override protected def batches: Seq[Batch] = Seq(
-  Batch("Canonicalizer", FixedPoint(100),
+  Batch("Collapse Project", FixedPoint(100),
 // The `WidenSetOperationTypes` analysis rule may introduce extra 
`Project`s over
 // `Aggregate`s to perform type casting.  This rule merges these 
`Project`s into
 // `Aggregate`s.
-CollapseProject,
-
+CollapseProject),
+  Batch("Recover Scoping Info", Once,
 // Used to handle other auxiliary `Project`s added by analyzer 
(e.g.
 // `ResolveAggregateFunctions` rule)
-RecoverScopingInfo
+AddSubquery,
+// Previous rule will add extra sub-queries, this rule is used to 
re-propagate and update
+// the qualifiers bottom up.
+UpdateQualifiers
--- End diff --

Thanks!

@cloud-fan Maybe it would be good to add an example here?



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r5543
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -499,12 +520,25 @@ case class CumeDist() extends RowNumberLike with 
SizeBasedWindowFunction {
 case class NTile(buckets: Expression) extends RowNumberLike with 
SizeBasedWindowFunction {
--- End diff --

Ah I see. Thanks!



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55444114
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 | _: Sample
 ) => plan
 
-case plan: Project =>
-  wrapChildWithSubquery(plan)
+case plan: Project => wrapChildWithSubquery(plan)
+
+case w @ Window(_, _, _, _,
+  _: SubqueryAlias
+| _: Filter
+| _: Join
+| _: MetastoreRelation
+| OneRowRelation
+| _: LocalLimit
+| _: GlobalLimit
+| _: Sample
+) => w
+
+case w: Window => wrapChildWithSubquery(w)
   }
 
-  def wrapChildWithSubquery(project: Project): Project = project match 
{
-case Project(projectList, child) =>
-  val alias = SQLBuilder.newSubqueryName
-  val childAttributes = child.outputSet
-  val aliasedProjectList = projectList.map(_.transform {
-case a: Attribute if childAttributes.contains(a) =>
-  a.withQualifiers(alias :: Nil)
-  }.asInstanceOf[NamedExpression])
+  private def wrapChildWithSubquery(plan: UnaryNode): LogicalPlan = {
+val newChild = SubqueryAlias(SQLBuilder.newSubqueryName, 
plan.child)
+plan.withNewChildren(Seq(newChild))
+  }
+}
 
-  Project(aliasedProjectList, SubqueryAlias(alias, child))
+object UpdateQualifiers extends Rule[LogicalPlan] {
+  override def apply(tree: LogicalPlan): LogicalPlan = tree 
transformUp {
+case plan =>
+  val inputAttributes = plan.children.flatMap(_.output)
+  plan transformExpressions {
+case a: AttributeReference if 
!plan.producedAttributes.contains(a) =>
--- End diff --

Sounds like `outputSet` should also contain that kind of `Attribute`?



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread yhuai

Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-194006304
  
Thanks @cloud-fan for working on it! Overall, it looks good. It would be 
great to have more test cases, such as the following:
* Multiple window functions used in a single expression, e.g. 
`sum(...) OVER (...) / count(...) OVER (...)`.
* An expression that mixes ordinary (non-window) expressions with window 
functions, e.g. `1 + 2 + count(...) OVER (...)`.
* A regular aggregate function used together with a window function, e.g. 
`sum(...) - sum(...) OVER (...)`.
* `ORDER BY` clauses with `ASC` or `DESC` specified.

Also, maybe we are missing some window functions (like `LEAD` and `LAG`)? 
Supported window functions can be found in 
https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html.
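
For illustration only, the suggested cases could be written down as concrete queries along these lines. This is a sketch under assumptions: the table `t` and its columns `key` and `value` are hypothetical, the object name is made up, and the SQL strings have not been run against Spark or taken from the PR's test suite.

```scala
// Hypothetical catalogue of the suggested window-function test queries.
// Table `t` and columns `key` / `value` are assumptions made for this sketch.
object SuggestedWindowQueries {
  val queries: Seq[(String, String)] = Seq(
    "multiple window functions in one expression" ->
      """SELECT key,
        |  SUM(value) OVER (PARTITION BY key) / COUNT(value) OVER (PARTITION BY key) AS ratio
        |FROM t""".stripMargin,
    "ordinary expression mixed with a window function" ->
      "SELECT key, 1 + 2 + COUNT(value) OVER (PARTITION BY key) AS c FROM t",
    "regular aggregate combined with a window function" ->
      """SELECT key,
        |  SUM(value) - SUM(key) OVER (PARTITION BY key) AS diff
        |FROM t GROUP BY key""".stripMargin,
    "ORDER BY with an explicit direction" ->
      "SELECT key, RANK() OVER (PARTITION BY key ORDER BY value DESC) AS r FROM t",
    "LEAD and LAG" ->
      """SELECT key,
        |  LEAD(value, 1) OVER (PARTITION BY key ORDER BY value ASC) AS next_value,
        |  LAG(value, 1) OVER (PARTITION BY key ORDER BY value ASC) AS prev_value
        |FROM t""".stripMargin
  )

  def main(args: Array[String]): Unit =
    queries.foreach { case (description, sql) =>
      println(s"-- $description")
      println(sql)
      println()
    }
}
```

Each entry would presumably be fed to whatever round-trip check the suite uses (generate SQL from the analyzed plan, then compare results with the original query).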
 



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55443323
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -297,22 +299,34 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 )
   }
 
+  private def windowToSQL(w: Window): String = {
+build(
+  "SELECT",
+  (w.child.output ++ w.windowExpressions).map(_.sql).mkString(", "),
+  if (w.child == OneRowRelation) "" else "FROM",
+  toSQL(w.child)
+)
+  }
+
   object Canonicalizer extends RuleExecutor[LogicalPlan] {
 override protected def batches: Seq[Batch] = Seq(
-  Batch("Canonicalizer", FixedPoint(100),
+  Batch("Collapse Project", FixedPoint(100),
 // The `WidenSetOperationTypes` analysis rule may introduce extra 
`Project`s over
 // `Aggregate`s to perform type casting.  This rule merges these 
`Project`s into
 // `Aggregate`s.
-CollapseProject,
-
+CollapseProject),
+  Batch("Recover Scoping Info", Once,
 // Used to handle other auxiliary `Project`s added by analyzer 
(e.g.
 // `ResolveAggregateFunctions` rule)
-RecoverScopingInfo
+AddSubquery,
+// Previous rule will add extra sub-queries, this rule is used to 
re-propagate and update
+// the qualifiers bottom up.
+UpdateQualifiers
--- End diff --


https://github.com/cloud-fan/spark/blob/window/sql/hive/src/test/scala/org/apache/spark/sql/hive/LogicalPlanToSQLSuite.scala#L454-L458

The above test is for verifying this rule. The JIRA SPARK-13720 describes 
the reason why we need to add this rule. 
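
To make the two-step idea concrete, here is a minimal toy sketch, not the PR's code: every class below is a hypothetical stand-in for the Catalyst classes, and the alias name `gen_subquery_0` is only an assumed example of a generated name. It shows a subquery alias being inserted under a `Project`, after which the attribute qualifiers are recomputed bottom up so that the outer `Project` refers to the new alias rather than to the original table.

```scala
// Toy model only: hypothetical stand-ins for Catalyst's plan and attribute classes.
object QualifierSketch {
  // An attribute has a unique id plus an optional qualifier (the alias it is read through).
  final case class Attr(id: Int, name: String, qualifier: Option[String] = None) {
    def sql: String = qualifier.map(q => s"$q.$name").getOrElse(name)
  }

  sealed trait Plan { def output: Seq[Attr] }

  final case class Table(name: String, columns: Seq[Attr]) extends Plan {
    def output: Seq[Attr] = columns.map(_.copy(qualifier = Some(name)))
  }

  final case class SubqueryAlias(alias: String, child: Plan) extends Plan {
    // Everything read through the alias is re-qualified with it.
    def output: Seq[Attr] = child.output.map(_.copy(qualifier = Some(alias)))
  }

  final case class Project(projectList: Seq[Attr], child: Plan) extends Plan {
    def output: Seq[Attr] = projectList
  }

  // Step 1, in the spirit of AddSubquery: wrap a Project that sits directly
  // under another Project in a fresh subquery alias.
  def addSubquery(plan: Plan, freshAlias: String): Plan = plan match {
    case Project(list, child: Project) => Project(list, SubqueryAlias(freshAlias, child))
    case other => other
  }

  // Step 2, in the spirit of UpdateQualifiers: working bottom up, replace each
  // referenced attribute with the child's output attribute that has the same id.
  def updateQualifiers(plan: Plan): Plan = plan match {
    case Project(list, child) =>
      val newChild = updateQualifiers(child)
      val byId = newChild.output.map(a => a.id -> a).toMap
      Project(list.map(a => byId.getOrElse(a.id, a)), newChild)
    case SubqueryAlias(alias, child) => SubqueryAlias(alias, updateQualifiers(child))
    case leaf => leaf
  }

  def main(args: Array[String]): Unit = {
    val key = Attr(1, "key", Some("t"))
    val table = Table("t", Seq(Attr(1, "key")))
    val inner = Project(Seq(key), table)
    val outer = Project(Seq(key), inner) // still refers to t.key

    val fixed = updateQualifiers(addSubquery(outer, "gen_subquery_0"))
    println(fixed.output.map(_.sql).mkString(", ")) // gen_subquery_0.key
  }
}
```

Without the second step, the outer `Project` in this toy would still print `t.key`, which is out of scope once the inner query is hidden behind an alias.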



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55442915
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 | _: Sample
 ) => plan
 
-case plan: Project =>
-  wrapChildWithSubquery(plan)
+case plan: Project => wrapChildWithSubquery(plan)
+
+case w @ Window(_, _, _, _,
+  _: SubqueryAlias
+| _: Filter
+| _: Join
+| _: MetastoreRelation
+| OneRowRelation
+| _: LocalLimit
+| _: GlobalLimit
+| _: Sample
+) => w
+
+case w: Window => wrapChildWithSubquery(w)
   }
 
-  def wrapChildWithSubquery(project: Project): Project = project match 
{
-case Project(projectList, child) =>
-  val alias = SQLBuilder.newSubqueryName
-  val childAttributes = child.outputSet
-  val aliasedProjectList = projectList.map(_.transform {
-case a: Attribute if childAttributes.contains(a) =>
-  a.withQualifiers(alias :: Nil)
-  }.asInstanceOf[NamedExpression])
+  private def wrapChildWithSubquery(plan: UnaryNode): LogicalPlan = {
+val newChild = SubqueryAlias(SQLBuilder.newSubqueryName, 
plan.child)
+plan.withNewChildren(Seq(newChild))
+  }
+}
 
-  Project(aliasedProjectList, SubqueryAlias(alias, child))
+object UpdateQualifiers extends Rule[LogicalPlan] {
+  override def apply(tree: LogicalPlan): LogicalPlan = tree 
transformUp {
+case plan =>
+  val inputAttributes = plan.children.flatMap(_.output)
+  plan transformExpressions {
+case a: AttributeReference if 
!plan.producedAttributes.contains(a) =>
--- End diff --

`producedAttributes` is the list of attributes that are added by this 
operator. For example, `Generate` will produce some attributes that do not 
exist in the child node. 
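
A rough way to picture the distinction, as a toy sketch only (the case classes below are hypothetical and much simpler than Catalyst's `LogicalPlan`, `Generate`, and `AttributeSet`): the output set covers everything an operator exposes, while the produced attributes are just the ones the operator introduces itself.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
object ProducedAttributesSketch {
  final case class Attr(name: String)

  sealed trait Plan {
    def children: Seq[Plan]
    def output: Seq[Attr]
    // Attributes this operator introduces itself; empty unless overridden.
    def producedAttributes: Set[Attr] = Set.empty
    def outputSet: Set[Attr] = output.toSet
  }

  final case class Relation(output: Seq[Attr]) extends Plan {
    val children: Seq[Plan] = Nil
    // In this toy, a leaf has no child, so it produces everything it outputs.
    override def producedAttributes: Set[Attr] = outputSet
  }

  final case class Generate(generated: Seq[Attr], child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
    // Passes the child's columns through and appends the generated ones.
    val output: Seq[Attr] = child.output ++ generated
    override def producedAttributes: Set[Attr] = generated.toSet
  }

  def main(args: Array[String]): Unit = {
    val rel = Relation(Seq(Attr("id"), Attr("tags")))
    val gen = Generate(Seq(Attr("tag")), rel)
    println(gen.outputSet)          // id, tags and tag
    println(gen.producedAttributes) // only tag
  }
}
```

In this picture, a rule that re-qualifies attributes from the children's output has nothing to look up for `tag`, since no child exposes it.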



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55442404
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 | _: Sample
 ) => plan
 
-case plan: Project =>
-  wrapChildWithSubquery(plan)
+case plan: Project => wrapChildWithSubquery(plan)
+
+case w @ Window(_, _, _, _,
+  _: SubqueryAlias
+| _: Filter
+| _: Join
+| _: MetastoreRelation
+| OneRowRelation
+| _: LocalLimit
+| _: GlobalLimit
+| _: Sample
+) => w
+
+case w: Window => wrapChildWithSubquery(w)
   }
 
-  def wrapChildWithSubquery(project: Project): Project = project match 
{
-case Project(projectList, child) =>
-  val alias = SQLBuilder.newSubqueryName
-  val childAttributes = child.outputSet
-  val aliasedProjectList = projectList.map(_.transform {
-case a: Attribute if childAttributes.contains(a) =>
-  a.withQualifiers(alias :: Nil)
-  }.asInstanceOf[NamedExpression])
+  private def wrapChildWithSubquery(plan: UnaryNode): LogicalPlan = {
+val newChild = SubqueryAlias(SQLBuilder.newSubqueryName, 
plan.child)
+plan.withNewChildren(Seq(newChild))
+  }
+}
 
-  Project(aliasedProjectList, SubqueryAlias(alias, child))
+object UpdateQualifiers extends Rule[LogicalPlan] {
+  override def apply(tree: LogicalPlan): LogicalPlan = tree 
transformUp {
+case plan =>
+  val inputAttributes = plan.children.flatMap(_.output)
+  plan transformExpressions {
+case a: AttributeReference if 
!plan.producedAttributes.contains(a) =>
--- End diff --

@gatorsmile Not related to this PR: what is the difference between 
`producedAttributes` and `outputSet`?



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55442518
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -499,12 +520,25 @@ case class CumeDist() extends RowNumberLike with 
SizeBasedWindowFunction {
 case class NTile(buckets: Expression) extends RowNumberLike with 
SizeBasedWindowFunction {
--- End diff --

It is defined here:

https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L200-L203




[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55442086
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -338,20 +353,37 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 | _: Sample
 ) => plan
 
-case plan: Project =>
-  wrapChildWithSubquery(plan)
+case plan: Project => wrapChildWithSubquery(plan)
+
+case w @ Window(_, _, _, _,
+  _: SubqueryAlias
+| _: Filter
+| _: Join
+| _: MetastoreRelation
+| OneRowRelation
+| _: LocalLimit
+| _: GlobalLimit
+| _: Sample
+) => w
--- End diff --

Add a comment to explain why we need this rule?



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55441966
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -297,22 +299,34 @@ class SQLBuilder(logicalPlan: LogicalPlan, 
sqlContext: SQLContext) extends Loggi
 )
   }
 
+  private def windowToSQL(w: Window): String = {
+build(
+  "SELECT",
+  (w.child.output ++ w.windowExpressions).map(_.sql).mkString(", "),
+  if (w.child == OneRowRelation) "" else "FROM",
+  toSQL(w.child)
+)
+  }
+
   object Canonicalizer extends RuleExecutor[LogicalPlan] {
 override protected def batches: Seq[Batch] = Seq(
-  Batch("Canonicalizer", FixedPoint(100),
+  Batch("Collapse Project", FixedPoint(100),
 // The `WidenSetOperationTypes` analysis rule may introduce extra 
`Project`s over
 // `Aggregate`s to perform type casting.  This rule merges these 
`Project`s into
 // `Aggregate`s.
-CollapseProject,
-
+CollapseProject),
+  Batch("Recover Scoping Info", Once,
 // Used to handle other auxiliary `Project`s added by analyzer 
(e.g.
 // `ResolveAggregateFunctions` rule)
-RecoverScopingInfo
+AddSubquery,
+// Previous rule will add extra sub-queries, this rule is used to 
re-propagate and update
+// the qualifiers bottom up.
+UpdateQualifiers
--- End diff --

Which test is for this new rule?



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55441699
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -499,12 +520,25 @@ case class CumeDist() extends RowNumberLike with 
SizeBasedWindowFunction {
 case class NTile(buckets: Expression) extends RowNumberLike with 
SizeBasedWindowFunction {
--- End diff --

Maybe I missed it: where is the `sql` method for `NTile`?



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread gatorsmile

Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193985675
  
LGTM



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193815254
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52665/
Test PASSed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193814907
  
**[Test build #52665 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52665/consoleFull)**
 for PR 11555 at commit 
[`054f50a`](https://github.com/apache/spark/commit/054f50a8661d0d2a20b2924da3815fd13f29568a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193815250
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193756848
  
**[Test build #52665 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52665/consoleFull)**
 for PR 11555 at commit 
[`054f50a`](https://github.com/apache/spark/commit/054f50a8661d0d2a20b2924da3815fd13f29568a).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193750275
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52664/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193750264
  
**[Test build #52664 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52664/consoleFull)**
 for PR 11555 at commit 
[`fcd60de`](https://github.com/apache/spark/commit/fcd60dec61dbd3ff9fff7b4d141c2938a476e802).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193750272
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193749839
  
**[Test build #52664 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52664/consoleFull)**
 for PR 11555 at commit 
[`fcd60de`](https://github.com/apache/spark/commit/fcd60dec61dbd3ff9fff7b4d141c2938a476e802).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193725131
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52649/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193725127
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193724658
  
**[Test build #52649 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52649/consoleFull)**
 for PR 11555 at commit 
[`1ebb3c5`](https://github.com/apache/spark/commit/1ebb3c50ee67b3b25c864e551041caca1f8c5751).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193660913
  
**[Test build #52649 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52649/consoleFull)**
 for PR 11555 at commit 
[`1ebb3c5`](https://github.com/apache/spark/commit/1ebb3c50ee67b3b25c864e551041caca1f8c5751).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193616416
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52626/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193616413
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193616299
  
**[Test build #52626 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52626/consoleFull)**
 for PR 11555 at commit 
[`c82229a`](https://github.com/apache/spark/commit/c82229a42efec9131652435b9543df81d1feab6c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193573893
  
**[Test build #52626 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52626/consoleFull)**
 for PR 11555 at commit 
[`c82229a`](https://github.com/apache/spark/commit/c82229a42efec9131652435b9543df81d1feab6c).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193562698
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193562700
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52616/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193562358
  
**[Test build #52616 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52616/consoleFull)**
 for PR 11555 at commit 
[`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193534923
  
**[Test build #52616 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52616/consoleFull)**
 for PR 11555 at commit 
[`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193533951
  
retest this please



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193374324
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52569/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193374322
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193373981
  
**[Test build #52569 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52569/consoleFull)**
 for PR 11555 at commit 
[`f968a33`](https://github.com/apache/spark/commit/f968a33870cf4b954d454d4ec1935ac97888de42).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193371071
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193371075
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52571/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193370585
  
**[Test build #52571 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52571/consoleFull)**
 for PR 11555 at commit 
[`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193339446
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52566/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193339441
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193339120
  
**[Test build #52566 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52566/consoleFull)**
 for PR 11555 at commit 
[`276a870`](https://github.com/apache/spark/commit/276a870dee9d150c35220c391d8d41acd463c314).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-19056
  
**[Test build #52571 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52571/consoleFull)**
 for PR 11555 at commit 
[`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193329095
  
**[Test build #52569 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52569/consoleFull)**
 for PR 11555 at commit 
[`f968a33`](https://github.com/apache/spark/commit/f968a33870cf4b954d454d4ec1935ac97888de42).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55227980
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -204,11 +207,70 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
   private def build(segments: String*): String =
     segments.map(_.trim).filter(_.nonEmpty).mkString(" ")
 
+  /**
+   * Given a seq of qualifiers(names and their corresponding [[AttributeSet]]), transform the given
+   * expression tree, if an [[Attribute]] belongs to one of the [[AttributeSet]]s, update its
+   * qualifier with the corresponding name of the [[AttributeSet]].
+   */
+  private def updateQualifier(
+      expr: Expression,
+      qualifiers: Seq[(String, AttributeSet)]): Expression = {
+    if (qualifiers.isEmpty) {
+      expr
+    } else {
+      expr transform {
+        case a: Attribute =>
+          val index = qualifiers.indexWhere {
+            case (_, inputAttributes) => inputAttributes.contains(a)
+          }
+          if (index == -1) {
+            a
+          } else {
+            a.withQualifiers(qualifiers(index)._1 :: Nil)
+          }
+      }
+    }
+  }
+
+  /**
+   * Finds the outer most [[SubqueryAlias]] nodes in the input logical plan and return their alias
+   * names and outputSet.
+   */
+  private def findOutermostQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
--- End diff --

This is a really good idea! Thanks, updated.
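
For readers following the quoted `updateQualifier` doc comment, here is a minimal, self-contained sketch of the same lookup on toy types. `Column`, `UpdateQualifierSketch`, and the integer id sets standing in for `AttributeSet` are assumptions made for illustration only; they are not Catalyst classes, and the method in the diff transforms an expression tree rather than a flat list.

```scala
// Hypothetical toy type for illustration; Catalyst's Attribute is much richer.
case class Column(id: Int, name: String, qualifier: Option[String] = None)

object UpdateQualifierSketch {
  // For each column, use the name of the first qualifier whose id set contains it.
  // Columns that belong to no set are returned unchanged, as in the quoted diff.
  def updateQualifier(
      cols: Seq[Column],
      qualifiers: Seq[(String, Set[Int])]): Seq[Column] = {
    cols.map { c =>
      qualifiers.collectFirst { case (name, ids) if ids.contains(c.id) => name } match {
        case Some(name) => c.copy(qualifier = Some(name))
        case None => c
      }
    }
  }
}
```

With qualifiers `t1 -> {1}` and `t2 -> {2}`, a column with id 1 would get qualifier `t1`, and a column with id 3 would keep whatever qualifier it already had.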



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193291764
  
**[Test build #52566 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52566/consoleFull)**
 for PR 11555 at commit 
[`276a870`](https://github.com/apache/spark/commit/276a870dee9d150c35220c391d8d41acd463c314).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread liancheng

Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55215536
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -204,11 +207,70 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
   private def build(segments: String*): String =
     segments.map(_.trim).filter(_.nonEmpty).mkString(" ")
 
+  /**
+   * Given a seq of qualifiers(names and their corresponding [[AttributeSet]]), transform the given
+   * expression tree, if an [[Attribute]] belongs to one of the [[AttributeSet]]s, update its
+   * qualifier with the corresponding name of the [[AttributeSet]].
+   */
+  private def updateQualifier(
+      expr: Expression,
+      qualifiers: Seq[(String, AttributeSet)]): Expression = {
+    if (qualifiers.isEmpty) {
+      expr
+    } else {
+      expr transform {
+        case a: Attribute =>
+          val index = qualifiers.indexWhere {
+            case (_, inputAttributes) => inputAttributes.contains(a)
+          }
+          if (index == -1) {
+            a
+          } else {
+            a.withQualifiers(qualifiers(index)._1 :: Nil)
+          }
+      }
+    }
+  }
+
+  /**
+   * Finds the outer most [[SubqueryAlias]] nodes in the input logical plan and return their alias
+   * names and outputSet.
+   */
+  private def findOutermostQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
--- End diff --

Thanks, I like this one :)



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193277373
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193277378
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52560/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193277070
  
**[Test build #52560 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52560/consoleFull)**
 for PR 11555 at commit 
[`40bd17a`](https://github.com/apache/spark/commit/40bd17a3d35b017d9af240da8a40df7e2998f610).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193257580
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52558/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193257578
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193257343
  
**[Test build #52558 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52558/consoleFull)**
 for PR 11555 at commit 
[`9a66fbb`](https://github.com/apache/spark/commit/9a66fbb756d78c393d2493dea5a8194bae1d61b5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55205891
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -204,11 +207,70 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
   private def build(segments: String*): String =
     segments.map(_.trim).filter(_.nonEmpty).mkString(" ")
 
+  /**
+   * Given a seq of qualifiers(names and their corresponding [[AttributeSet]]), transform the given
+   * expression tree, if an [[Attribute]] belongs to one of the [[AttributeSet]]s, update its
+   * qualifier with the corresponding name of the [[AttributeSet]].
+   */
+  private def updateQualifier(
+      expr: Expression,
+      qualifiers: Seq[(String, AttributeSet)]): Expression = {
+    if (qualifiers.isEmpty) {
+      expr
+    } else {
+      expr transform {
+        case a: Attribute =>
+          val index = qualifiers.indexWhere {
+            case (_, inputAttributes) => inputAttributes.contains(a)
+          }
+          if (index == -1) {
+            a
+          } else {
+            a.withQualifiers(qualifiers(index)._1 :: Nil)
+          }
+      }
+    }
+  }
+
+  /**
+   * Finds the outer most [[SubqueryAlias]] nodes in the input logical plan and return their alias
+   * names and outputSet.
+   */
+  private def findOutermostQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
--- End diff --

I have another alternative. We face the same issue everywhere we add or 
remove an extra qualifier. How about adding another rule/batch below the 
existing `Batch("Canonicalizer")`? For example:
```scala
Batch("Replace Qualifier", Once,
  ReplaceQualifier)
```
The rule is simple: we can always get the qualifier from the inputSet if we 
traverse the plan bottom-up. I did not do a full test last night. Below is a 
draft of the code:

```scala
object ReplaceQualifier extends Rule[LogicalPlan] {
  override def apply(tree: LogicalPlan): LogicalPlan = tree transformUp { case plan =>
    plan transformExpressions {
      case e: AttributeReference => e.withQualifiers(getQualifier(plan.inputSet, e))
    }
  }

  private def getQualifier(inputSet: AttributeSet, e: AttributeReference): Seq[String] = {
    inputSet.collectFirst {
      case a if a.semanticEquals(e) => a.qualifiers
    }.getOrElse(Seq.empty[String])
  }
}
```
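
To illustrate the bottom-up idea outside of Catalyst, here is a small self-contained sketch on a toy plan. `Attr`, `Node`, `Table`, `Subquery`, `Project`, and `ReplaceQualifierSketch` are hypothetical names introduced only for this example, and matching attributes by integer id is an assumed stand-in for `semanticEquals`; it is not tested against Spark.

```scala
// Hypothetical toy model, not Catalyst: an attribute has an id, a name and an optional qualifier.
case class Attr(id: Int, name: String, qualifier: Option[String] = None)

sealed trait Node {
  def children: Seq[Node]
  def output: Seq[Attr]
}
case class Table(output: Seq[Attr]) extends Node {
  val children: Seq[Node] = Nil
}
case class Subquery(alias: String, child: Node) extends Node {
  val children: Seq[Node] = Seq(child)
  // An alias re-qualifies everything its child produces.
  def output: Seq[Attr] = child.output.map(_.copy(qualifier = Some(alias)))
}
case class Project(exprs: Seq[Attr], child: Node) extends Node {
  val children: Seq[Node] = Seq(child)
  def output: Seq[Attr] = exprs
}

object ReplaceQualifierSketch {
  // Bottom-up: rewrite the child first, then copy qualifiers from the rewritten child's output.
  // An attribute that is not found in the input loses its qualifier, mirroring the
  // getOrElse(Seq.empty) fallback in the draft above.
  def apply(node: Node): Node = node match {
    case t: Table => t
    case Subquery(alias, child) => Subquery(alias, apply(child))
    case Project(exprs, child) =>
      val fixedChild = apply(child)
      val input = fixedChild.output
      val fixedExprs = exprs.map { e =>
        e.copy(qualifier = input.find(_.id == e.id).flatMap(_.qualifier))
      }
      Project(fixedExprs, fixedChild)
  }
}
```

In this toy model a projection over `Subquery("t", ...)` that carries a stale qualifier ends up qualified with `t`, because the child has already been rewritten when the parent is visited.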





[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193243578
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52557/
Test FAILed.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193243571
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193243180
  
**[Test build #52557 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52557/consoleFull)**
 for PR 11555 at commit 
[`e037814`](https://github.com/apache/spark/commit/e037814575535a635938b164cf183c7e8a66ea0b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193241782
  
**[Test build #52560 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52560/consoleFull)**
 for PR 11555 at commit 
[`40bd17a`](https://github.com/apache/spark/commit/40bd17a3d35b017d9af240da8a40df7e2998f610).



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread liancheng

Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/11555#discussion_r55198266
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -204,11 +208,55 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
   private def build(segments: String*): String =
     segments.map(_.trim).filter(_.nonEmpty).mkString(" ")
 
+  private def updateQualifier(
+      expr: Expression,
+      qualifiers: Seq[(String, AttributeSet)]): Expression = {
+    if (qualifiers.isEmpty) {
+      expr
+    } else {
+      expr transform {
+        case a: Attribute =>
+          val index = qualifiers.indexWhere {
+            case (_, inputAttributes) => inputAttributes.contains(a)
+          }
+          if (index == -1) {
+            a
+          } else {
+            a.withQualifiers(qualifiers(index)._1 :: Nil)
+          }
+      }
+    }
+  }
+
+  private def findQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
+    val results = mutable.ArrayBuffer.empty[(String, AttributeSet)]
+    val nodes = mutable.Stack(input)
+
+    while (nodes.nonEmpty) {
+      val node = nodes.pop()
+      node match {
+        case SubqueryAlias(alias, child) => results += alias -> child.outputSet
+        case _ => node.children.foreach(nodes.push)
+      }
+    }
+
+    results.toSeq
+  }

So this method is basically a depth-first search for all the outermost `SubqueryAlias` operators. Maybe the following version is clearer:

```scala
def findOutermostQualifiers(input: LogicalPlan): Seq[(String, AttributeSet)] = {
  input.collectFirst {
    case SubqueryAlias(alias, child) => Seq(alias -> child.outputSet)
    case plan => plan.children.flatMap(findOutermostQualifiers)
  }.toSeq.flatten
}
```
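
To make the traversal concrete outside of Catalyst, here is a minimal self-contained sketch of the same "stop at the outermost alias on each path" search. The `Plan`, `Relation`, `Join` and `Project` types are hypothetical stand-ins for `LogicalPlan` nodes, and `Set[String]` stands in for `AttributeSet`; none of this is Spark's actual API. Since the second case in the snippet above matches any plan, `collectFirst` appears to always fire on the root node, so the sketch uses a plain pattern match for the same recursion.

```scala
// Hypothetical toy plan tree; not Catalyst's real classes.
sealed trait Plan {
  def children: Seq[Plan]
  def output: Set[String]
}

case class Relation(name: String, output: Set[String]) extends Plan {
  val children: Seq[Plan] = Nil
}

case class SubqueryAlias(alias: String, child: Plan) extends Plan {
  val children: Seq[Plan] = Seq(child)
  def output: Set[String] = child.output
}

case class Join(left: Plan, right: Plan) extends Plan {
  val children: Seq[Plan] = Seq(left, right)
  def output: Set[String] = left.output ++ right.output
}

case class Project(columns: Set[String], child: Plan) extends Plan {
  val children: Seq[Plan] = Seq(child)
  def output: Set[String] = columns
}

object Qualifiers {
  // Depth-first search that stops descending as soon as it meets a
  // SubqueryAlias, so only the outermost alias on each path is reported.
  def findOutermostQualifiers(plan: Plan): Seq[(String, Set[String])] =
    plan match {
      case SubqueryAlias(alias, child) => Seq(alias -> child.output)
      case other => other.children.flatMap(findOutermostQualifiers)
    }
}
```

One difference worth noting: the recursive form yields aliases in left-to-right child order, whereas a stack-based loop that pushes children in order pops them in reverse. That only matters if two aliases expose overlapping attributes and the first match wins.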



[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...

2016-03-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11555#issuecomment-193231302
  
**[Test build #52558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52558/consoleFull)** for PR 11555 at commit [`9a66fbb`](https://github.com/apache/spark/commit/9a66fbb756d78c393d2493dea5a8194bae1d61b5).


