[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196569575 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53116/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196569574 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196569211 **[Test build #53116 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53116/consoleFull)** for PR 11682 at commit [`608b901`](https://github.com/apache/spark/commit/608b9014813174e2227389c7919074ed743fde47). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196566777 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53119/ Test PASSed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196566772 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196565770 **[Test build #53119 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53119/consoleFull)** for PR 11682 at commit [`f1eee03`](https://github.com/apache/spark/commit/f1eee03aa8d745044f18fd97c2dca543c421497f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196560475 **[Test build #53121 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53121/consoleFull)** for PR 11682 at commit [`adf64da`](https://github.com/apache/spark/commit/adf64da3c3f86e6addb6cb813deba36bb4c7debf). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196560839 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196560843 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53121/ Test FAILed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196537010 **[Test build #53121 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53121/consoleFull)** for PR 11682 at commit [`adf64da`](https://github.com/apache/spark/commit/adf64da3c3f86e6addb6cb813deba36bb4c7debf).
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196535654 **[Test build #53119 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53119/consoleFull)** for PR 11682 at commit [`f1eee03`](https://github.com/apache/spark/commit/f1eee03aa8d745044f18fd97c2dca543c421497f).
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196531576 **[Test build #53116 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53116/consoleFull)** for PR 11682 at commit [`608b901`](https://github.com/apache/spark/commit/608b9014813174e2227389c7919074ed743fde47).
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196442838 Yeah, that is a good idea. Let me try it. Thanks!
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196433895 After some offline discussions with @marmbrus, we realized that these two rules may still conflict with each other if the Filter can't be pushed through the child (for example, an outer join). A better solution could be to split ColumnPruning into two rules: 1) the first one adds new Projects; 2) the second one removes unnecessary Projects, including the Project under a Filter (that Project should only prune some columns). The second rule runs just after the first rule.
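A rough way to picture the two-rule split suggested above is the toy sketch below. It is only an illustration under loud assumptions: a plan is modeled as a tuple of operator names (top-most first), every Project is assumed to be a pure column-pruning Project, and all function names are hypothetical stand-ins rather than Spark's rules. It shows that when the cleanup step runs right after the step that adds Projects, a pass leaves the plan unchanged, whereas without the cleanup the plan keeps changing.

```python
# Toy model, not Spark code: a plan is a tuple of operator names, top first.

def add_projects(plan):
    """Step 1 (hypothetical): add a pruning Project under a Filter that sits on a Leaf."""
    out = []
    for i, op in enumerate(plan):
        out.append(op)
        if op == "Filter" and plan[i + 1 : i + 2] == ("Leaf",):
            out.append("Project")
    return tuple(out)


def remove_unnecessary_projects(plan):
    """Step 2 (hypothetical): drop a Project directly under a Filter or another Project."""
    out = []
    for op in plan:
        if op == "Project" and out and out[-1] in ("Filter", "Project"):
            continue  # assumed to be pure pruning, so it is redundant in this toy
        out.append(op)
    return tuple(out)


def push_filter_through_project(plan):
    """Pushdown-style rule: rewrite Filter-over-Project as Project-over-Filter."""
    out = list(plan)
    for i in range(len(out) - 1):
        if out[i] == "Filter" and out[i + 1] == "Project":
            out[i], out[i + 1] = out[i + 1], out[i]
    return tuple(out)


def run_pass(plan, rules):
    """Apply each rule once, in order."""
    for rule in rules:
        plan = rule(plan)
    return plan


start = ("Project", "Filter", "Leaf")
with_cleanup = run_pass(
    start, [add_projects, remove_unnecessary_projects, push_filter_through_project]
)
without_cleanup = run_pass(start, [add_projects, push_filter_through_project])
print(with_cleanup == start, without_cleanup == start)
```

In this toy the cleanup step simply undoes the insertion above a Leaf; in a real optimizer the added Project could still be useful when it can be pushed further down, which this sketch does not model.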
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196431220 @davies Will try to find a way to keep both. Will submit a separate PR for handling non-deterministic Filter. @marmbrus Yeah, your concern is valid. Will submit a separate PR to do what you said above.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196425642 I'm worried that we are adding rules that aren't stable. Perhaps we should make the rule executor throw an error when the testing flag is set if we ever hit the maximum number of iterations.
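The idea of failing loudly in tests when a batch never settles could look roughly like the sketch below. This is a toy fixed-point loop, not Spark's rule executor: the `execute` function, the `testing` parameter, the `MaxIterationsReached` error, and the `add_project` rule are all hypothetical names introduced for illustration.

```python
class MaxIterationsReached(RuntimeError):
    """Hypothetical error for a batch that never reached a fixed point."""


def execute(plan, rules, max_iterations=100, testing=False):
    """Run `rules` over `plan` until a full pass changes nothing or the limit is hit."""
    for _ in range(max_iterations):
        new_plan = plan
        for rule in rules:
            new_plan = rule(new_plan)
        if new_plan == plan:
            return plan  # fixed point reached
        plan = new_plan
    if testing:
        # In test runs, treat hitting the limit as a bug in the rule set.
        raise MaxIterationsReached(
            f"no fixed point after {max_iterations} iterations"
        )
    return plan  # outside tests, return the last plan produced


def add_project(plan):
    """Deliberately unstable toy rule: it changes the plan on every pass."""
    return ("Project",) + plan


try:
    execute(("Leaf",), [add_project], max_iterations=3, testing=True)
    raised = False
except MaxIterationsReached:
    raised = True
print(raised)
```

With the flag off, the same call returns whatever plan the last pass produced, so an unstable rule set would go unnoticed unless tests opt in to the stricter behavior.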
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196423960 @gatorsmile Thanks for finding this. In general, both PushPredicateThroughProject and ColumnPruning are useful; we should keep both and fix the conflict (make them stable). I think they only conflict with each other when they are on top of a LeafNode (can't be pushed further), so we don't need to insert a Project between Filter and LeafNode, because PhysicalOperation can already handle Project(Filter(LeafNode())) very well. For the non-deterministic Filter or Project, we may do things wrong in many places; could you create a separate JIRA for that?
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196388382 @yhuai This was added in the recent change in `ColumnPruning`. 1.6 does not have such an issue. Thanks!
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196380248 @gatorsmile Is this behavior new (introduced by a recent change), or do previous versions of the optimizer also have the same behavior?
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196131490 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53046/ Test PASSed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196131489 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196131389 **[Test build #53046 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53046/consoleFull)** for PR 11682 at commit [`e5e00ae`](https://github.com/apache/spark/commit/e5e00ae1d0f9885b05e3a81f8b084e1059151fba). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-196110718 **[Test build #53046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53046/consoleFull)** for PR 11682 at commit [`e5e00ae`](https://github.com/apache/spark/commit/e5e00ae1d0f9885b05e3a81f8b084e1059151fba).
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-195862035 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-195862036 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53016/ Test FAILed.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-195862015 **[Test build #53016 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53016/consoleFull)** for PR 11682 at commit [`e128a0a`](https://github.com/apache/spark/commit/e128a0a3d23b7a6a37cdc77034a651d79e32c451). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11682#issuecomment-195850299 **[Test build #53016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53016/consoleFull)** for PR 11682 at commit [`e128a0a`](https://github.com/apache/spark/commit/e128a0a3d23b7a6a37cdc77034a651d79e32c451).
[GitHub] spark pull request: [SPARK-13840] [SQL] Disable Project Pushdown T...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/11682

[SPARK-13840] [SQL] Disable Project Pushdown Through Filter

What changes were proposed in this pull request?

Before this PR, two Optimizer rules, `ColumnPruning` and `PushPredicateThroughProject`, reverse each other's effects, so the Optimizer always reaches the maximum number of iterations when optimizing some queries. Thus, we need to disable `Project` pushdown through `Filter` in the rule `ColumnPruning`.

The issue becomes worse with another rule, `NullFiltering`, which can add extra Filters for `IsNotNull`. We have to be careful about introducing an extra `Filter` if the benefit is not large enough. cc @sameeragarwal @marmbrus

In addition, `ColumnPruning` should not push `Project` through a non-deterministic `Filter`. This could cause wrong results. cc @davies @cloud-fan @yhuai

How was this patch tested?

Modified the existing test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark viewDuplicateNames

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11682.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11682

commit e128a0a3d23b7a6a37cdc77034a651d79e32c451
Author: gatorsmile
Date: 2016-03-13T02:00:04Z

Disable Project Push Down Through Filter
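To make the non-convergence described in this pull request more concrete, here is a small toy sketch. It is not Spark code: a plan is modeled as a tuple of operator names (top-most first), and the two rule functions and the `optimize` loop are hypothetical stand-ins loosely modeled on the conflict. Each pass, the pruning-style rule re-creates the Filter-over-Project shape that the pushdown-style rule then rewrites, so no pass ever leaves the plan unchanged and the loop runs until the iteration limit.

```python
# Toy model only: a plan is a tuple of operator names, top-most operator first.
# Both rule functions are hypothetical stand-ins, not Spark's implementations.

def insert_project_under_filter(plan):
    """Pruning-style rule: add a Project between a Filter and the Leaf below it."""
    out = []
    for i, op in enumerate(plan):
        out.append(op)
        if op == "Filter" and plan[i + 1 : i + 2] == ("Leaf",):
            out.append("Project")
    return tuple(out)


def push_filter_through_project(plan):
    """Pushdown-style rule: rewrite Filter-over-Project as Project-over-Filter."""
    out = list(plan)
    for i in range(len(out) - 1):
        if out[i] == "Filter" and out[i + 1] == "Project":
            out[i], out[i + 1] = out[i + 1], out[i]
    return tuple(out)


def optimize(plan, rules, max_iterations):
    """Apply every rule once per pass; stop early if a pass changes nothing."""
    for _ in range(max_iterations):
        new_plan = plan
        for rule in rules:
            new_plan = rule(new_plan)
        if new_plan == plan:
            return plan, True
        plan = new_plan
    return plan, False  # limit reached without a fixed point


rules = [insert_project_under_filter, push_filter_through_project]
final_plan, converged = optimize(("Filter", "Leaf"), rules, max_iterations=5)
print(converged, final_plan)
```

In this simplified model nothing removes the Projects left above the Filter, so the plan grows on each pass; the point is only that the pair of rules keeps triggering each other.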