[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2825 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59860308 Thanks, merged to master.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59594302

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21870/consoleFull) for PR 2825 at commit [`abbc53c`](https://github.com/apache/spark/commit/abbc53cc9b1e02d19c2f2200947bcb86bf33511c).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59594304 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21870/
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59592260

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21870/consoleFull) for PR 2825 at commit [`abbc53c`](https://github.com/apache/spark/commit/abbc53cc9b1e02d19c2f2200947bcb86bf33511c).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/2825#discussion_r19050137

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -28,7 +28,9 @@ import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.types._
-object Optimizer extends RuleExecutor[LogicalPlan] {
+abstract class Optimizer extends RuleExecutor[LogicalPlan]
+
+object SparkOptimizer extends Optimizer {
--- End diff --

Thank you for your suggestion. I agree that Catalyst should not be tightly coupled with Spark.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/2825#discussion_r19050063

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExpressionOptimizationSuite.scala ---
@@ -30,7 +30,7 @@ class ExpressionOptimizationSuite extends ExpressionEvaluationSuite {
       expected: Any,
       inputRow: Row = EmptyRow): Unit = {
     val plan = Project(Alias(expression, s"Optimized($expression)")() :: Nil, NoRelation)
-    val optimizedPlan = Optimizer(plan)
+    val optimizedPlan = SparkOptimizer(plan)
     super.checkEvaluation(optimizedPlan.expressions.head, expected, inputRow)
   }
 }
--- End diff --

Of course not. I'll add a new line.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59506552 Two minor comments. This LGTM, thanks :)
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2825#discussion_r19014954

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -28,7 +28,9 @@ import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.types._
-object Optimizer extends RuleExecutor[LogicalPlan] {
+abstract class Optimizer extends RuleExecutor[LogicalPlan]
+
+object SparkOptimizer extends Optimizer {
--- End diff --

I think `DefaultOptimizer` or `BasicOptimizer` may be better names. Conceptually, Catalyst is not tightly coupled with Spark (I know right now Catalyst depends on Spark code somehow, but we don't want to exacerbate this trend).
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2825#discussion_r19014836

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExpressionOptimizationSuite.scala ---
@@ -30,7 +30,7 @@ class ExpressionOptimizationSuite extends ExpressionEvaluationSuite {
       expected: Any,
       inputRow: Row = EmptyRow): Unit = {
     val plan = Project(Alias(expression, s"Optimized($expression)")() :: Nil, NoRelation)
-    val optimizedPlan = Optimizer(plan)
+    val optimizedPlan = SparkOptimizer(plan)
     super.checkEvaluation(optimizedPlan.expressions.head, expected, inputRow)
   }
 }
--- End diff --

Would you mind doing me a favor and adding a new line here? :)
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59461447 Hi @chenghao-intel, thank you for your comment. Yes, that's right. I don't want to mix logical plan and physical plan optimization, and I'll extend `SparkStrategies` if I need to. I want to add some `Expression`s for my projects and optimize the logical plan with them, but right now we can't replace `Optimizer` because it is an `object`, so I want to add an extension point for `Optimizer`.
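The extension point described in this comment can be sketched with a small, self-contained toy model. Everything below (`Expr`, `Rule`, the simplified non-generic `RuleExecutor`, `ExpandTwice`, `MyOptimizer`) is a hypothetical stand-in, not Catalyst's real API; only the shape "abstract class `Optimizer` plus a default `object`" mirrors the diff in this PR. `DefaultOptimizer` is one of the names suggested in review and is used here purely for illustration.

```scala
// Toy stand-ins for Catalyst types; NOT the real Spark classes.
sealed trait Expr
case class Lit(value: Int) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
case class Twice(child: Expr) extends Expr // hypothetical project-specific expression

// A rule rewrites an expression tree.
trait Rule { def apply(e: Expr): Expr }

// Simplified stand-in for RuleExecutor: applies all rules in order,
// repeating until the tree stops changing.
abstract class RuleExecutor {
  protected def rules: Seq[Rule]
  def apply(e: Expr): Expr = {
    var current = e
    var changed = true
    while (changed) {
      val next = rules.foldLeft(current)((acc, rule) => rule(acc))
      changed = next != current
      current = next
    }
    current
  }
}

// The extension point: an abstract class can be subclassed, a singleton object cannot.
abstract class Optimizer extends RuleExecutor

// Folds Add(Lit, Lit) into a single literal, bottom-up.
object ConstantFolding extends Rule {
  def apply(e: Expr): Expr = e match {
    case Add(l, r) => (apply(l), apply(r)) match {
      case (Lit(a), Lit(b)) => Lit(a + b)
      case (fl, fr)         => Add(fl, fr)
    }
    case Twice(c) => Twice(apply(c))
    case lit: Lit => lit
  }
}

// Default rule set, playing the role of the renamed object in the diff.
object DefaultOptimizer extends Optimizer {
  protected val rules: Seq[Rule] = Seq(ConstantFolding)
}

// Hypothetical custom rule: rewrite Twice(x) as Add(x, x).
object ExpandTwice extends Rule {
  def apply(e: Expr): Expr = e match {
    case Twice(c)  => val x = apply(c); Add(x, x)
    case Add(l, r) => Add(apply(l), apply(r))
    case lit: Lit  => lit
  }
}

// A project-specific optimizer that adds its own rule to the defaults.
object MyOptimizer extends Optimizer {
  protected val rules: Seq[Rule] = Seq(ExpandTwice, ConstantFolding)
}
```

With these definitions, `MyOptimizer(Twice(Add(Lit(1), Lit(2))))` evaluates to `Lit(6)`, while `DefaultOptimizer` knows nothing about the custom rule and returns `Twice(Lit(3))`.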
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59457217 Sorry, I mean that we probably don't want to mix logical plan and physical plan optimization. Logical plan optimization should be generic; if we want to do some customized optimization for a specific engine (e.g. MapReduce / Spark), we'd better do that in the physical plan, e.g. in `SparkStrategies.scala`.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59456929 I think that if people want to customize something other than the default, a better way is to create a new `Context` class that derives from `SQLContext`, just like `HiveContext` does. On the other hand, since `Catalyst` focuses on logical plan optimization, why not just keep adding / updating the rules (object)?
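As a rough sketch of the context-subclass idea raised here: a derived context can only swap in its own optimizer if the optimizer is a type that can be subclassed. The names below (`Plan`, `BaseContext`, `CustomContext`, the `optimizer` member, the string-based "rules") are hypothetical toy stand-ins; whether `SQLContext` exposes an overridable member like this is an assumption that this thread does not confirm.

```scala
// Toy stand-ins; none of these are the real Spark classes.
case class Plan(ops: List[String])

abstract class Optimizer {
  def apply(plan: Plan): Plan
}

// Default optimizer: drops no-op steps from the plan.
object DefaultOptimizer extends Optimizer {
  def apply(plan: Plan): Plan = Plan(plan.ops.filterNot(_ == "noop"))
}

// Hypothetical base context with an overridable optimizer member.
class BaseContext {
  protected lazy val optimizer: Optimizer = DefaultOptimizer
  def optimize(plan: Plan): Plan = optimizer(plan)
}

// Hypothetical derived context, in the spirit of a context class
// that derives from a base context and customizes its behavior.
class CustomContext extends BaseContext {
  override protected lazy val optimizer: Optimizer = new Optimizer {
    def apply(plan: Plan): Plan = {
      // Run the default rewrite first, then a project-specific one.
      val base = DefaultOptimizer(plan)
      Plan(base.ops.map(op => if (op == "scan") "indexedScan" else op))
    }
  }
}
```

For the plan `Plan(List("scan", "noop", "project"))`, `new BaseContext().optimize(...)` yields `Plan(List("scan", "project"))` and `new CustomContext().optimize(...)` yields `Plan(List("indexedScan", "project"))`. The override in `CustomContext` type-checks only because `Optimizer` is a class type here rather than a single `object`.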
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59453264

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21827/consoleFull) for PR 2825 at commit [`4d2e1bc`](https://github.com/apache/spark/commit/4d2e1bc8126c815cd393e1b527fa19b727b8d4c4).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59453270 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21827/
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59450880

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21827/consoleFull) for PR 2825 at commit [`4d2e1bc`](https://github.com/apache/spark/commit/4d2e1bc8126c815cd393e1b527fa19b727b8d4c4).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/2825#discussion_r18995386

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -28,7 +28,9 @@ import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.types._
-object Optimizer extends RuleExecutor[LogicalPlan] {
+abstract class Optimizer extends RuleExecutor[LogicalPlan]
+
+object Optimizer extends Optimizer {
--- End diff --

Hi @srowen, thank you for your comment. I'll rename the `object`. And yes, this PR is a very small API change.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59388681 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21808/
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59388676

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21808/consoleFull) for PR 2825 at commit [`9547a23`](https://github.com/apache/spark/commit/9547a23fd5ca0058d1044d19f6a96bdbb1e3b810).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2825#discussion_r18964661

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -28,7 +28,9 @@ import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.types._
-object Optimizer extends RuleExecutor[LogicalPlan] {
+abstract class Optimizer extends RuleExecutor[LogicalPlan]
+
+object Optimizer extends Optimizer {
--- End diff --

It is confusing to use the same name for both. Will this mean an API change?
[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2825#issuecomment-59379459

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21808/consoleFull) for PR 2825 at commit [`9547a23`](https://github.com/apache/spark/commit/9547a23fd5ca0058d1044d19f6a96bdbb1e3b810).
* This patch merges cleanly.