[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user saucam commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-169523800 Thanks for the comments and the merge :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9858 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-169413921 Thanks! Merging to master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-169258612 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-169258614 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48837/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-169258458 **[Test build #48837 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48837/consoleFull)** for PR 9858 at commit [`03ac259`](https://github.com/apache/spark/commit/03ac259de2915fd8689d1307f479019daab2a68c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-169241986 **[Test build #48837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48837/consoleFull)** for PR 9858 at commit [`03ac259`](https://github.com/apache/spark/commit/03ac259de2915fd8689d1307f479019daab2a68c). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user saucam commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-169240395 Hey @marmbrus have reverted to 46e7419 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9858#discussion_r48874115 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -361,8 +361,15 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] { case r @ logical.Range(start, end, step, numSlices, output) => execution.Range(start, step, numSlices, r.numElements, output) :: Nil case logical.RepartitionByExpression(expressions, child, nPartitions) => -execution.Exchange(HashPartitioning( - expressions, nPartitions.getOrElse(numPartitions)), planLater(child)) :: Nil +val newPartitioning = HashPartitioning(expressions, nPartitions.getOrElse(numPartitions)) +val childPlan = self.sqlContext.planner.planLater(child) +// This is necessary to get the outputPartitioning from EnsureRequirements batch +val executedChildPlan = sqlContext.prepareForExecution.execute(childPlan) --- End diff -- Oh, I see, because you need `EnsureRequirements` to have run. Thats actually pretty confusing. In that case, I think this version (46e741933d9737fb7e85f731da8fb8982e603ab7) is the cleanest implementation, but I'd add a comment about why we would both remove and add them. Sorry for the back and forth. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9858#discussion_r48873698 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -361,8 +361,15 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] { case r @ logical.Range(start, end, step, numSlices, output) => execution.Range(start, step, numSlices, r.numElements, output) :: Nil case logical.RepartitionByExpression(expressions, child, nPartitions) => -execution.Exchange(HashPartitioning( - expressions, nPartitions.getOrElse(numPartitions)), planLater(child)) :: Nil +val newPartitioning = HashPartitioning(expressions, nPartitions.getOrElse(numPartitions)) +val childPlan = self.sqlContext.planner.planLater(child) +// This is necessary to get the outputPartitioning from EnsureRequirements batch +val executedChildPlan = sqlContext.prepareForExecution.execute(childPlan) --- End diff -- Why is this not just: ```scala val childPlan = planLater(child) ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user saucam commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-168956141 Seems some other issue , tests in pyspark mllib failing ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-168952173 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-168952176 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48751/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-168951738 **[Test build #48751 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48751/consoleFull)** for PR 9858 at commit [`dd2bdc8`](https://github.com/apache/spark/commit/dd2bdc8650e9db763ec3afe290919d8a15404e9d). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user saucam commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-168934153 Fixed --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-168932871 **[Test build #48751 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48751/consoleFull)** for PR 9858 at commit [`dd2bdc8`](https://github.com/apache/spark/commit/dd2bdc8650e9db763ec3afe290919d8a15404e9d). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9858#discussion_r48777897 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -361,8 +361,15 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] { case r @ logical.Range(start, end, step, numSlices, output) => execution.Range(start, step, numSlices, r.numElements, output) :: Nil case logical.RepartitionByExpression(expressions, child, nPartitions) => -execution.Exchange(HashPartitioning( - expressions, nPartitions.getOrElse(numPartitions)), planLater(child)) :: Nil +val newPartitioning = HashPartitioning(expressions, nPartitions.getOrElse(numPartitions)) +val childPlan = self.sqlContext.planner.plan(child).next() --- End diff -- This should be `planLater(child)` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167432877 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167432879 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48354/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167432837 **[Test build #48354 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48354/consoleFull)** for PR 9858 at commit [`5e5994d`](https://github.com/apache/spark/commit/5e5994d0caca2b8d3ef862ac7a76fd6dd5b51225). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167426083 **[Test build #48354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48354/consoleFull)** for PR 9858 at commit [`5e5994d`](https://github.com/apache/spark/commit/5e5994d0caca2b8d3ef862ac7a76fd6dd5b51225). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167423743 **[Test build #48353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48353/consoleFull)** for PR 9858 at commit [`1e1422d`](https://github.com/apache/spark/commit/1e1422d3039279cd62b988cc13e2ceba3c43b309). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167423751 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167423754 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48353/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167423222 **[Test build #48353 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48353/consoleFull)** for PR 9858 at commit [`1e1422d`](https://github.com/apache/spark/commit/1e1422d3039279cd62b988cc13e2ceba3c43b309). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user saucam commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-167422237 Hey @marmbrus sorry for the delay in this update, I have added the same thing to the planner. Also rebased to latest master. How does it look now ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9858#discussion_r46209620 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala --- @@ -488,6 +488,12 @@ private[sql] case class EnsureRequirements(sqlContext: SQLContext) extends Rule[ } def apply(plan: SparkPlan): SparkPlan = plan.transformUp { +case operator @ Exchange(partitioning, child, _) => + child.children match { +case Exchange(childPartitioning, baseChild, _)::Nil => --- End diff -- The key difference about putting it in the planner instead of in the optimizer is that you can actually plan the child first and check its actual output ordering instead of doing this specific check for `Aggregate`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user saucam commented on a diff in the pull request: https://github.com/apache/spark/pull/9858#discussion_r46020227 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala --- @@ -488,6 +488,12 @@ private[sql] case class EnsureRequirements(sqlContext: SQLContext) extends Rule[ } def apply(plan: SparkPlan): SparkPlan = plan.transformUp { +case operator @ Exchange(partitioning, child, _) => + child.children match { +case Exchange(childPartitioning, baseChild, _)::Nil => --- End diff -- Yes, I thought the same, but then it will again be not as generic as this, since SparkStrategies are applied first and till that time we don;t have the exchanges added. So it will be similar to my previous change done in optimizer in that it will check that the child plan is an aggregate or not instead of testing for an Exchange. Will that be acceptable ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9858#discussion_r46002572 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala --- @@ -488,6 +488,12 @@ private[sql] case class EnsureRequirements(sqlContext: SQLContext) extends Rule[ } def apply(plan: SparkPlan): SparkPlan = plan.transformUp { +case operator @ Exchange(partitioning, child, _) => + child.children match { +case Exchange(childPartitioning, baseChild, _)::Nil => --- End diff -- Hey, so I know that I said to add the logic here, but now I'm realizing its kind of odd to add it into the plan and then remove it. Why don't we just never add it? Maybe this logic should go in SparkStrategies? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-159814436 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-159814439 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46742/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-159814177 **[Test build #46742 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46742/consoleFull)** for PR 9858 at commit [`31a31a2`](https://github.com/apache/spark/commit/31a31a24a345ee2cd6137675b2808e4b7a709eb1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-159799524 **[Test build #46742 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46742/consoleFull)** for PR 9858 at commit [`31a31a2`](https://github.com/apache/spark/commit/31a31a24a345ee2cd6137675b2808e4b7a709eb1). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158517844 I think it would be more general to do this during exchange planning. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158349969 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158349971 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46404/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158349801 **[Test build #46404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46404/consoleFull)** for PR 9858 at commit [`3ec2dbb`](https://github.com/apache/spark/commit/3ec2dbb950546477928cb4dcb2812cc370491b8b). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_:\n * ` class ChiSqSelectorModelWriter(instance: ChiSqSelectorModel) extends MLWriter `\n * `class PCA (override val uid: String) extends Estimator[PCAModel] with PCAParams`\n * ` class VectorIndexerModelWriter(instance: VectorIndexerModel) extends MLWriter `\n * `final class Word2Vec(override val uid: String) extends Estimator[Word2VecModel] with Word2VecBase`\n * ` class Word2VecModelWriter(instance: Word2VecModel) extends MLWriter `\n --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158318742 **[Test build #46404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46404/consoleFull)** for PR 9858 at commit [`3ec2dbb`](https://github.com/apache/spark/commit/3ec2dbb950546477928cb4dcb2812cc370491b8b). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158314394 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46403/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158314388 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158314376 **[Test build #46403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46403/consoleFull)** for PR 9858 at commit [`a86feca`](https://github.com/apache/spark/commit/a86feca6e2b9aaba9babed8854a39c97b59f34cd). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_:\n * ` class ChiSqSelectorModelWriter(instance: ChiSqSelectorModel) extends MLWriter `\n * `class PCA (override val uid: String) extends Estimator[PCAModel] with PCAParams`\n * ` class VectorIndexerModelWriter(instance: VectorIndexerModel) extends MLWriter `\n * `final class Word2Vec(override val uid: String) extends Estimator[Word2VecModel] with Word2VecBase`\n * ` class Word2VecModelWriter(instance: Word2VecModel) extends MLWriter `\n --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11878][SQL]: Eliminate distribute by in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9858#issuecomment-158314030 **[Test build #46403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46403/consoleFull)** for PR 9858 at commit [`a86feca`](https://github.com/apache/spark/commit/a86feca6e2b9aaba9babed8854a39c97b59f34cd). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org