[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-138672312 Thanks, merged to master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/8486
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-138412857 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42108/ Test PASSed.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-138412855 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-138412807 [Test build #42108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42108/console) for PR 8486 at commit [`b44f74b`](https://github.com/apache/spark/commit/b44f74bebe872bb6716ee920153b7c5e28e1df52). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-138398769 [Test build #42108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42108/consoleFull) for PR 8486 at commit [`b44f74b`](https://github.com/apache/spark/commit/b44f74bebe872bb6716ee920153b7c5e28e1df52).
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-138397668 Merged build started.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-138397610 Merged build triggered.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136662615 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41863/ Test PASSed.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136662613 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136662502 [Test build #41863 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41863/console) for PR 8486 at commit [`b544706`](https://github.com/apache/spark/commit/b5447067c3394981f417435465c7c6cf961808ac). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136623325 [Test build #41863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41863/consoleFull) for PR 8486 at commit [`b544706`](https://github.com/apache/spark/commit/b5447067c3394981f417435465c7c6cf961808ac).
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136620710 Merged build triggered.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136620755 Merged build started.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/8486#discussion_r38393705

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ---
@@ -26,27 +24,6 @@ import org.apache.spark.sql.catalyst.plans._
 import org.apache.spark.sql.catalyst.plans.logical._

 /**
- * A pattern that matches any number of filter operations on top of another relational operator.
- * Adjacent filter operators are collected and their conditions are broken up and returned as a
- * sequence of conjunctive predicates.
- *
- * @return A tuple containing a sequence of conjunctive predicates that should be used to filter the
- *         output and a relational operator.
- */
-object FilteredOperation extends PredicateHelper {
--- End diff --

The `FilteredOperation` is not used anywhere.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136518354 Where possible, I think it's good to reduce coupling between these two components. There are also cases where we have used `PhysicalOperation` before the optimizer. So, while I don't think it's a hard constraint, I'd prefer a solution that doesn't make this assumption.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136420899 My thought is this: if there are no non-deterministic expressions, we will always push `Filter` down through `Project`, and we will always combine adjacent `Filter`s and `Project`s. So whether or not non-deterministic expressions exist, we just need to collect the lowest `Project` and `Filter`. Will we break this assumption about the `Optimizer` in the future?
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-136036533 This patch seems to do significantly more than it claims in the description. Why not just avoid collapsing projections that have non-deterministic expressions in them? In general, it seems good to let all phases of planning be as flexible as possible.
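The suggestion above (do not collapse projections that contain non-deterministic expressions) can be sketched on a toy plan model. This is only an illustration under stated assumptions, not the code from the pull request: `Expr`, `Relation`, `Project`, `Filter` and `collect` are hypothetical stand-ins defined here, and alias substitution is deliberately left out.

```scala
object CollapseSketch {
  // Hypothetical toy expression: only its text and a determinism flag.
  final case class Expr(sql: String, deterministic: Boolean = true)

  // Hypothetical toy logical plan with just the operators discussed in the thread.
  sealed trait Plan
  final case class Relation(name: String) extends Plan
  final case class Project(fields: Seq[Expr], child: Plan) extends Plan
  final case class Filter(condition: Expr, child: Plan) extends Plan

  // Walks down through Project and Filter nodes, but stops at the first
  // operator that contains a non-deterministic expression.
  // Returns (top-most project list if any, collected filters, plan where collection stopped).
  def collect(plan: Plan): (Option[Seq[Expr]], Seq[Expr], Plan) = plan match {
    case Project(fields, child) if fields.forall(_.deterministic) =>
      val (_, filters, bottom) = collect(child)
      (Some(fields), filters, bottom)
    case Filter(condition, child) if condition.deterministic =>
      val (fields, filters, bottom) = collect(child)
      (fields, filters :+ condition, bottom)
    case other =>
      (None, Nil, other)
  }
}
```

With this guard, a `Filter` sitting on top of a `Project` that computes a non-deterministic column is collected, while the `Project` itself stays in place as the bottom plan instead of being merged away.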
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-135632555 cc @rxin @yhuai
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-135496815 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41690/ Test PASSed.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-135496445 [Test build #41690 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41690/console) for PR 8486 at commit [`16ae7e2`](https://github.com/apache/spark/commit/16ae7e2394caf6a3925cb3c69692b4b14c7811cb). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-135496813 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-135460635 [Test build #41690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41690/consoleFull) for PR 8486 at commit [`16ae7e2`](https://github.com/apache/spark/commit/16ae7e2394caf6a3925cb3c69692b4b14c7811cb).
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-135458247 Merged build triggered.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8486#issuecomment-135458320 Merged build started.
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/8486#discussion_r38103673

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ---
@@ -26,83 +24,28 @@ import org.apache.spark.sql.catalyst.plans._
 import org.apache.spark.sql.catalyst.plans.logical._

 /**
- * A pattern that matches any number of filter operations on top of another relational operator.
- * Adjacent filter operators are collected and their conditions are broken up and returned as a
- * sequence of conjunctive predicates.
- *
- * @return A tuple containing a sequence of conjunctive predicates that should be used to filter the
- *         output and a relational operator.
+ * A pattern that matches at most one Filter and one Project on top of another relational operator.
+ * Filter condition is broken up to conjunctive parts.
  */
-object FilteredOperation extends PredicateHelper {
--- End diff --

The `FilteredOperation` is not used anywhere.
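For readers unfamiliar with the phrase "broken up to conjunctive parts" in the new doc comment, here is a minimal sketch of the idea on a toy expression tree. The types `Pred`, `And`, `Or` and the helper `splitConjuncts` are hypothetical names defined for this illustration only; they are not Catalyst classes.

```scala
object ConjunctsSketch {
  // Hypothetical toy boolean expression tree.
  sealed trait Expr
  final case class Pred(sql: String) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr
  final case class Or(left: Expr, right: Expr) extends Expr

  // Flattens nested ANDs into a list of predicates that must all hold.
  // Anything that is not an AND (including an OR) is kept as a single part.
  def splitConjuncts(e: Expr): Seq[Expr] = e match {
    case And(left, right) => splitConjuncts(left) ++ splitConjuncts(right)
    case other            => Seq(other)
  }
}
```

Splitting a condition this way lets a planner treat each part independently, for example handing some parts to a data source and evaluating the rest afterwards.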
[GitHub] spark pull request: [SPARK-10316][SQL] respect nondeterministic ex...
GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/8486

    [SPARK-10316][SQL] respect nondeterministic expressions in PhysicalOperation

    We did a lot of special handling for non-deterministic expressions in `Optimizer`. However, `PhysicalOperation` just collects all Projects and Filters and messes them up. We should respect the operator order caused by non-deterministic expressions in `PhysicalOperation`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8486.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #8486

commit 16ae7e2394caf6a3925cb3c69692b4b14c7811cb
Author: Wenchen Fan
Date: 2015-08-27T14:37:46Z

    respect nondeterministic expressions in PhysicalOperation
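Why operator order matters for non-deterministic expressions can be shown with a small self-contained sketch. It is not Spark code: `Counter` is a hypothetical stand-in for a non-deterministic expression such as `rand()` (made predictable so the effect is reproducible), and the two functions model "project, then filter on the projected value" versus a collapsed form in which the filter and the output each re-evaluate the expression.

```scala
object NondeterministicOrderDemo {
  // Hypothetical stand-in for a non-deterministic expression:
  // every call to next() yields a different value.
  final class Counter {
    private var n = 0
    def next(): Int = { n += 1; n }
  }

  // Order preserved: the expression is evaluated once per row,
  // and the filter sees exactly the value that is output.
  def projectThenFilter(rows: Seq[String]): Seq[(String, Int)] = {
    val c = new Counter
    rows.map(r => (r, c.next())).filter { case (_, v) => v % 2 == 0 }
  }

  // Collapsed: the filter evaluates the expression, then the projection
  // evaluates it again, so the output value may not satisfy the filter.
  def collapsed(rows: Seq[String]): Seq[(String, Int)] = {
    val c = new Counter
    rows.filter(_ => c.next() % 2 == 0).map(r => (r, c.next()))
  }
}
```

In the first function every output value is even, because the filter tested the same value that is returned. In the collapsed form the returned values come from a second evaluation, so nothing guarantees that they still pass the filter.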