[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-06 Thread saucam
Github user saucam commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-169523800
  
Thanks for the comments and the merge :)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9858


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-06 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-169413921
  
Thanks!  Merging to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-169258612
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-169258614
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48837/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-169258458
  
**[Test build #48837 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48837/consoleFull)**
 for PR 9858 at commit 
[`03ac259`](https://github.com/apache/spark/commit/03ac259de2915fd8689d1307f479019daab2a68c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-169241986
  
**[Test build #48837 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48837/consoleFull)**
 for PR 9858 at commit 
[`03ac259`](https://github.com/apache/spark/commit/03ac259de2915fd8689d1307f479019daab2a68c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread saucam
Github user saucam commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-169240395
  
Hey @marmbrus have reverted to 46e7419


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9858#discussion_r48874115
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -361,8 +361,15 @@ private[sql] abstract class SparkStrategies extends 
QueryPlanner[SparkPlan] {
   case r @ logical.Range(start, end, step, numSlices, output) =>
 execution.Range(start, step, numSlices, r.numElements, output) :: 
Nil
   case logical.RepartitionByExpression(expressions, child, 
nPartitions) =>
-execution.Exchange(HashPartitioning(
-  expressions, nPartitions.getOrElse(numPartitions)), 
planLater(child)) :: Nil
+val newPartitioning = HashPartitioning(expressions, 
nPartitions.getOrElse(numPartitions))
+val childPlan = self.sqlContext.planner.planLater(child)
+// This is necessary to get the outputPartitioning from 
EnsureRequirements batch
+val executedChildPlan = 
sqlContext.prepareForExecution.execute(childPlan)
--- End diff --

Oh, I see, because you need `EnsureRequirements` to have run.  Thats 
actually pretty confusing.  In that case, I think this version 
(46e741933d9737fb7e85f731da8fb8982e603ab7) is the cleanest implementation, but 
I'd add a comment about why we would both remove and add them.  Sorry for the 
back and forth.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9858#discussion_r48873698
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -361,8 +361,15 @@ private[sql] abstract class SparkStrategies extends 
QueryPlanner[SparkPlan] {
   case r @ logical.Range(start, end, step, numSlices, output) =>
 execution.Range(start, step, numSlices, r.numElements, output) :: 
Nil
   case logical.RepartitionByExpression(expressions, child, 
nPartitions) =>
-execution.Exchange(HashPartitioning(
-  expressions, nPartitions.getOrElse(numPartitions)), 
planLater(child)) :: Nil
+val newPartitioning = HashPartitioning(expressions, 
nPartitions.getOrElse(numPartitions))
+val childPlan = self.sqlContext.planner.planLater(child)
+// This is necessary to get the outputPartitioning from 
EnsureRequirements batch
+val executedChildPlan = 
sqlContext.prepareForExecution.execute(childPlan)
--- End diff --

Why is this not just:
```scala
val childPlan = planLater(child)
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread saucam
Github user saucam commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-168956141
  
Seems some other issue , tests in pyspark mllib failing ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-168952173
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-168952176
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48751/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-168951738
  
**[Test build #48751 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48751/consoleFull)**
 for PR 9858 at commit 
[`dd2bdc8`](https://github.com/apache/spark/commit/dd2bdc8650e9db763ec3afe290919d8a15404e9d).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread saucam
Github user saucam commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-168934153
  
Fixed



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-168932871
  
**[Test build #48751 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48751/consoleFull)**
 for PR 9858 at commit 
[`dd2bdc8`](https://github.com/apache/spark/commit/dd2bdc8650e9db763ec3afe290919d8a15404e9d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2016-01-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9858#discussion_r48777897
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -361,8 +361,15 @@ private[sql] abstract class SparkStrategies extends 
QueryPlanner[SparkPlan] {
   case r @ logical.Range(start, end, step, numSlices, output) =>
 execution.Range(start, step, numSlices, r.numElements, output) :: 
Nil
   case logical.RepartitionByExpression(expressions, child, 
nPartitions) =>
-execution.Exchange(HashPartitioning(
-  expressions, nPartitions.getOrElse(numPartitions)), 
planLater(child)) :: Nil
+val newPartitioning = HashPartitioning(expressions, 
nPartitions.getOrElse(numPartitions))
+val childPlan = self.sqlContext.planner.plan(child).next()
--- End diff --

This should be `planLater(child)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167432877
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167432879
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48354/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167432837
  
**[Test build #48354 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48354/consoleFull)**
 for PR 9858 at commit 
[`5e5994d`](https://github.com/apache/spark/commit/5e5994d0caca2b8d3ef862ac7a76fd6dd5b51225).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167426083
  
**[Test build #48354 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48354/consoleFull)**
 for PR 9858 at commit 
[`5e5994d`](https://github.com/apache/spark/commit/5e5994d0caca2b8d3ef862ac7a76fd6dd5b51225).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167423743
  
**[Test build #48353 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48353/consoleFull)**
 for PR 9858 at commit 
[`1e1422d`](https://github.com/apache/spark/commit/1e1422d3039279cd62b988cc13e2ceba3c43b309).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167423751
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167423754
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48353/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167423222
  
**[Test build #48353 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48353/consoleFull)**
 for PR 9858 at commit 
[`1e1422d`](https://github.com/apache/spark/commit/1e1422d3039279cd62b988cc13e2ceba3c43b309).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-12-27 Thread saucam
Github user saucam commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-167422237
  
Hey @marmbrus sorry for the delay in this update, I have added the same 
thing to the planner. Also rebased to latest master. How does it look now ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-30 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9858#discussion_r46209620
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala ---
@@ -488,6 +488,12 @@ private[sql] case class EnsureRequirements(sqlContext: 
SQLContext) extends Rule[
   }
 
   def apply(plan: SparkPlan): SparkPlan = plan.transformUp {
+case operator @ Exchange(partitioning, child, _) =>
+  child.children match {
+case Exchange(childPartitioning, baseChild, _)::Nil =>
--- End diff --

The key difference about putting it in the planner instead of in the 
optimizer is that you can actually plan the child first and check its actual 
output ordering instead of doing this specific check for `Aggregate`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-26 Thread saucam
Github user saucam commented on a diff in the pull request:

https://github.com/apache/spark/pull/9858#discussion_r46020227
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala ---
@@ -488,6 +488,12 @@ private[sql] case class EnsureRequirements(sqlContext: 
SQLContext) extends Rule[
   }
 
   def apply(plan: SparkPlan): SparkPlan = plan.transformUp {
+case operator @ Exchange(partitioning, child, _) =>
+  child.children match {
+case Exchange(childPartitioning, baseChild, _)::Nil =>
--- End diff --

Yes, I thought the same, but then it will again be not as generic as this, 
since SparkStrategies are applied first and till that time we don;t have the 
exchanges added. So it will be similar to my previous change done in optimizer 
in that it will check that the child plan is an aggregate or not instead of 
testing for an Exchange. Will that be acceptable ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-26 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9858#discussion_r46002572
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala ---
@@ -488,6 +488,12 @@ private[sql] case class EnsureRequirements(sqlContext: 
SQLContext) extends Rule[
   }
 
   def apply(plan: SparkPlan): SparkPlan = plan.transformUp {
+case operator @ Exchange(partitioning, child, _) =>
+  child.children match {
+case Exchange(childPartitioning, baseChild, _)::Nil =>
--- End diff --

Hey, so I know that I said to add the logic here, but now I'm realizing its 
kind of odd to add it into the plan and then remove it.  Why don't we just 
never add it?  Maybe this logic should go in SparkStrategies?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-159814436
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-159814439
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46742/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-159814177
  
**[Test build #46742 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46742/consoleFull)**
 for PR 9858 at commit 
[`31a31a2`](https://github.com/apache/spark/commit/31a31a24a345ee2cd6137675b2808e4b7a709eb1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-159799524
  
**[Test build #46742 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46742/consoleFull)**
 for PR 9858 at commit 
[`31a31a2`](https://github.com/apache/spark/commit/31a31a24a345ee2cd6137675b2808e4b7a709eb1).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-20 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158517844
  
I think it would be more general to do this during exchange planning.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158349969
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158349971
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46404/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158349801
  
**[Test build #46404 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46404/consoleFull)**
 for PR 9858 at commit 
[`3ec2dbb`](https://github.com/apache/spark/commit/3ec2dbb950546477928cb4dcb2812cc370491b8b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:\n  * `  
class ChiSqSelectorModelWriter(instance: ChiSqSelectorModel) extends MLWriter 
`\n  * `class PCA (override val uid: String) extends Estimator[PCAModel] with 
PCAParams`\n  * `  class VectorIndexerModelWriter(instance: VectorIndexerModel) 
extends MLWriter `\n  * `final class Word2Vec(override val uid: String) extends 
Estimator[Word2VecModel] with Word2VecBase`\n  * `  class 
Word2VecModelWriter(instance: Word2VecModel) extends MLWriter `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158318742
  
**[Test build #46404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46404/consoleFull)**
 for PR 9858 at commit 
[`3ec2dbb`](https://github.com/apache/spark/commit/3ec2dbb950546477928cb4dcb2812cc370491b8b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158314394
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46403/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158314388
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158314376
  
**[Test build #46403 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46403/consoleFull)**
 for PR 9858 at commit 
[`a86feca`](https://github.com/apache/spark/commit/a86feca6e2b9aaba9babed8854a39c97b59f34cd).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:\n  * `  
class ChiSqSelectorModelWriter(instance: ChiSqSelectorModel) extends MLWriter 
`\n  * `class PCA (override val uid: String) extends Estimator[PCAModel] with 
PCAParams`\n  * `  class VectorIndexerModelWriter(instance: VectorIndexerModel) 
extends MLWriter `\n  * `final class Word2Vec(override val uid: String) extends 
Estimator[Word2VecModel] with Word2VecBase`\n  * `  class 
Word2VecModelWriter(instance: Word2VecModel) extends MLWriter `\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...

2015-11-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9858#issuecomment-158314030
  
**[Test build #46403 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46403/consoleFull)**
 for PR 9858 at commit 
[`a86feca`](https://github.com/apache/spark/commit/a86feca6e2b9aaba9babed8854a39c97b59f34cd).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org