[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-212013116 Thank you for merging, @cloud-fan ! :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-212013573 Also, thank you so much for your direct guidance, @rxin.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211926588 thanks! merging to master!
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12353
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211843480 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211843485 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56212/ Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211842450
**[Test build #56212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56212/consoleFull)** for PR 12353 at commit [`9a2340c`](https://github.com/apache/spark/commit/9a2340c0b6802a3e64f5f4a1e1b39195fc2b8257).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211821253 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56210/ Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211821250 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211820975
**[Test build #56210 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56210/consoleFull)** for PR 12353 at commit [`502bc61`](https://github.com/apache/spark/commit/502bc61eb60f90d024d6af29331ac55ec3eb129d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211804819 **[Test build #56212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56212/consoleFull)** for PR 12353 at commit [`9a2340c`](https://github.com/apache/spark/commit/9a2340c0b6802a3e64f5f4a1e1b39195fc2b8257).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211803070 Thank you for the review, @cloud-fan. It's fixed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60192015
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -864,6 +866,16 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
 }
 /**
+ * Optimizes expressions by replacing according to CodeGen configuration.
+ */
+case class OptimizeCodegen(conf: CatalystConf) extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformAllExpressions {
+    case e @ CaseWhen(branches, elseCase) if branches.size < conf.maxCaseBranchesForCodegen =>
--- End diff --
Thank you, @cloud-fan !
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211798835 LGTM
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60191639
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -864,6 +866,16 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
 }
 /**
+ * Optimizes expressions by replacing according to CodeGen configuration.
+ */
+case class OptimizeCodegen(conf: CatalystConf) extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformAllExpressions {
+    case e @ CaseWhen(branches, elseCase) if branches.size < conf.maxCaseBranchesForCodegen =>
--- End diff --
nit: the `elseCase` is not used
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211783888
The following are now updated.
- `maxCaseBranches` is renamed to `maxCaseBranchesForCodegen`.
- The object `CaseWhenCodegen` is removed.
- `CaseWhen` has a `toCodegen` function.
- 3 test cases are added: nested `CaseWhen`, multiple `CaseWhen` in one operator, and multiple `CaseWhen` in multiple operators.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211782737 **[Test build #56210 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56210/consoleFull)** for PR 12353 at commit [`502bc61`](https://github.com/apache/spark/commit/502bc61eb60f90d024d6af29331ac55ec3eb129d).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211773529 I don't think we can do that unless we "fix" Literal.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211771977 Can we mark `CodegenFallback.doCodeGen` as final, so that we can guarantee that expressions implementing `CodegenFallback` will always fall back?
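The guarantee asked for in this comment can be illustrated with a small standalone sketch. It is a toy model under stated assumptions: `Expression`, `CodegenFallback`, `genCode`, and `Slow` below are hypothetical stand-ins defined here, not Spark's classes or signatures, and the reply in the thread says the real change was not possible without "fixing" `Literal`.

```scala
// Toy model only: these traits are defined here for illustration and are
// NOT Spark's Expression / CodegenFallback API.
object FinalFallbackSketch {
  trait Expression {
    // Returns a tag describing which evaluation path is used.
    def genCode(): String
  }

  trait CodegenFallback extends Expression {
    // `final` stops every class that mixes in this trait from overriding
    // the fallback path.
    final def genCode(): String = "interpreted"
  }

  // Hypothetical expression that relies on the fallback.
  final case class Slow(name: String) extends CodegenFallback

  // The following would be rejected by the compiler because genCode is final:
  // final case class Sneaky() extends CodegenFallback {
  //   override def genCode(): String = "generated"
  // }

  def main(args: Array[String]): Unit =
    println(Slow("case-when").genCode()) // prints "interpreted"
}
```

With `final` in place, the compiler rather than convention enforces that anything mixing in the trait takes the fallback path.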
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60180011
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeCodegenSuite.scala ---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.SimpleCatalystConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.Literal._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical._
+import org.apache.spark.sql.catalyst.rules._
+
+
+class OptimizeCodegenSuite extends PlanTest {
+
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches = Batch("OptimizeCodegen", Once, OptimizeCodegen(SimpleCatalystConf(true))) :: Nil
+  }
+
+  protected def assertEquivalent(e1: Expression, e2: Expression): Unit = {
+    val correctAnswer = Project(Alias(e2, "out")() :: Nil, OneRowRelation).analyze
+    val actual = Optimize.execute(Project(Alias(e1, "out")() :: Nil, OneRowRelation).analyze)
+    comparePlans(actual, correctAnswer)
+  }
+
+  test("Codegen only when the number of branches is small.") {
--- End diff --
Oh. Sure. I'll add those test cases, too.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60179863
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -142,16 +139,54 @@ case class CaseWhen(branches: Seq[(Expression, Expression)], elseValue: Option[E
     }
   }
-  def shouldCodegen: Boolean = {
-    branches.length < CaseWhen.MAX_NUM_CASES_FOR_CODEGEN
+  override def toString: String = {
+    val cases = branches.map { case (c, v) => s" WHEN $c THEN $v" }.mkString
+    val elseCase = elseValue.map(" ELSE " + _).getOrElse("")
+    "CASE" + cases + elseCase + " END"
   }
+  override def sql: String = {
+    val cases = branches.map { case (c, v) => s" WHEN ${c.sql} THEN ${v.sql}" }.mkString
+    val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("")
+    "CASE" + cases + elseCase + " END"
+  }
+}
+
+
+/**
+ * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+ * When a = true, returns b; when c = true, returns d; else returns e.
+ *
+ * @param branches seq of (branch condition, branch value)
+ * @param elseValue optional value for the else branch
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - When a = true, returns b; when c = true, return d; else return e.")
+// scalastyle:on line.size.limit
+case class CaseWhen(
+    val branches: Seq[(Expression, Expression)],
+    val elseValue: Option[Expression] = None)
+  extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable {
--- End diff --
That would be right. `CaseWhenCodegen` is always generated from `CaseWhen`.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60179727
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystConf.scala ---
@@ -29,6 +29,7 @@ trait CatalystConf {
   def groupByOrdinal: Boolean
   def optimizerMaxIterations: Int
+  def maxCaseBranches: Int
--- End diff --
Thank you for the quick review. Sure. The same name, `maxCaseBranchesForCodegen`, will also be used in SQLConf.scala.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60177562
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -242,6 +261,12 @@ object CaseWhen {
   }
 }
+/** Factory methods for CaseWhenCodegen. */
+object CaseWhenCodegen {
--- End diff --
we can remove this given the above comment
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211751566 cc @cloud-fan: I think this change actually makes your other thing easier.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60177491
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeCodegenSuite.scala ---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.SimpleCatalystConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.Literal._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical._
+import org.apache.spark.sql.catalyst.rules._
+
+
+class OptimizeCodegenSuite extends PlanTest {
+
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches = Batch("OptimizeCodegen", Once, OptimizeCodegen(SimpleCatalystConf(true))) :: Nil
+  }
+
+  protected def assertEquivalent(e1: Expression, e2: Expression): Unit = {
+    val correctAnswer = Project(Alias(e2, "out")() :: Nil, OneRowRelation).analyze
+    val actual = Optimize.execute(Project(Alias(e1, "out")() :: Nil, OneRowRelation).analyze)
+    comparePlans(actual, correctAnswer)
+  }
+
+  test("Codegen only when the number of branches is small.") {
--- End diff --
Can you construct a few more test cases: one with a nested `CaseWhen`, one with multiple `CaseWhen` in one operator, and one with multiple `CaseWhen` in different operators?
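The three scenarios requested here (nested, several in one operator, several across operators) can be sketched without Spark on the classpath. Everything below is a toy model under stated assumptions: `Expr`, `Plan`, `Project`, `Leaf`, `rewrite`, and `optimize` are hypothetical stand-ins defined in the block, not Catalyst's classes or the tests that were actually added to the PR.

```scala
// Toy model only: a stand-in for Catalyst's expression and plan trees, not Spark's API.
object NestedCaseWhenSketch {
  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class CaseWhen(branches: Seq[(Expr, Expr)], elseValue: Option[Expr] = None) extends Expr
  final case class CaseWhenCodegen(branches: Seq[(Expr, Expr)], elseValue: Option[Expr] = None) extends Expr

  // Hypothetical operators: a projection holds expressions and has one child.
  sealed trait Plan
  case object Leaf extends Plan
  final case class Project(exprs: Seq[Expr], child: Plan) extends Plan

  // Rewrites children first, so a CaseWhen nested inside another is converted too.
  // Inputs are assumed to contain only Lit and CaseWhen nodes.
  def rewrite(e: Expr, limit: Int): Expr = e match {
    case CaseWhen(bs, el) =>
      val newBs = bs.map { case (c, v) => (rewrite(c, limit), rewrite(v, limit)) }
      val newEl = el.map(rewrite(_, limit))
      if (bs.size < limit) CaseWhenCodegen(newBs, newEl) else CaseWhen(newBs, newEl)
    case other => other
  }

  // Applies the rewrite to every expression of every operator in the plan.
  def optimize(p: Plan, limit: Int): Plan = p match {
    case Leaf => Leaf
    case Project(es, child) => Project(es.map(rewrite(_, limit)), optimize(child, limit))
  }

  private def children(bs: Seq[(Expr, Expr)], el: Option[Expr]): Seq[Expr] =
    bs.flatMap { case (c, v) => Seq(c, v) } ++ el

  // Counts CaseWhenCodegen nodes anywhere in an expression tree.
  def codegenCount(e: Expr): Int = e match {
    case CaseWhenCodegen(bs, el) => 1 + children(bs, el).map(codegenCount).sum
    case CaseWhen(bs, el) => children(bs, el).map(codegenCount).sum
    case _: Lit => 0
  }

  // Counts CaseWhenCodegen nodes across all operators of a plan.
  def planCodegenCount(p: Plan): Int = p match {
    case Leaf => 0
    case Project(es, child) => es.map(codegenCount).sum + planCodegenCount(child)
  }

  val small: Expr = CaseWhen(Seq((Lit(1), Lit(2))))      // one branch
  val nested: Expr = CaseWhen(Seq((Lit(1), small)))      // CaseWhen inside CaseWhen
  val sameOperator: Plan = Project(Seq(small, small), Leaf)
  val acrossOperators: Plan = Project(Seq(small), Project(Seq(small), Leaf))

  def main(args: Array[String]): Unit = {
    println(codegenCount(rewrite(nested, limit = 2)))               // 2
    println(planCodegenCount(optimize(sameOperator, limit = 2)))    // 2
    println(planCodegenCount(optimize(acrossOperators, limit = 2))) // 2
  }
}
```

In each scenario both one-branch `CaseWhen` nodes are below the limit of 2, so both should be converted; a rewrite that stopped at the first match or at the top operator would report 1 instead.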
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60177331
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -142,16 +139,54 @@ case class CaseWhen(branches: Seq[(Expression, Expression)], elseValue: Option[E
     }
   }
-  def shouldCodegen: Boolean = {
-    branches.length < CaseWhen.MAX_NUM_CASES_FOR_CODEGEN
+  override def toString: String = {
+    val cases = branches.map { case (c, v) => s" WHEN $c THEN $v" }.mkString
+    val elseCase = elseValue.map(" ELSE " + _).getOrElse("")
+    "CASE" + cases + elseCase + " END"
   }
+  override def sql: String = {
+    val cases = branches.map { case (c, v) => s" WHEN ${c.sql} THEN ${v.sql}" }.mkString
+    val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("")
+    "CASE" + cases + elseCase + " END"
+  }
+}
+
+
+/**
+ * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+ * When a = true, returns b; when c = true, returns d; else returns e.
+ *
+ * @param branches seq of (branch condition, branch value)
+ * @param elseValue optional value for the else branch
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - When a = true, returns b; when c = true, return d; else return e.")
+// scalastyle:on line.size.limit
+case class CaseWhen(
+    val branches: Seq[(Expression, Expression)],
+    val elseValue: Option[Expression] = None)
+  extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable {
--- End diff --
maybe just have a toCodegen function that creates CaseWhenCodegen? We can then remove `object CaseWhenCodegen`
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60177186
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystConf.scala ---
@@ -29,6 +29,7 @@ trait CatalystConf {
   def groupByOrdinal: Boolean
   def optimizerMaxIterations: Int
+  def maxCaseBranches: Int
--- End diff --
maxCaseBranchesForCodegen?
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211740746 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56191/ Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211740744 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211740602
**[Test build #56191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56191/consoleFull)** for PR 12353 at commit [`a9294bd`](https://github.com/apache/spark/commit/a9294bdd01c125dcc7a7b232a7b14b476678e731).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211726421
Hi, @rxin. The PR is now updated in the following ways.
1. `CaseWhen` is split into three classes:
    - `CaseWhenBase`: abstract base class
    - `CaseWhen`: codegen fallback
    - `CaseWhenCodegen`: codegen
2. The `OptimizeCodegen` optimizer rule is added:
    - It replaces `CaseWhen` with `CaseWhenCodegen` if `branches.size < conf.maxCaseBranches`.
3. `CodegenConf` is removed and `CatalystConf` has `maxCaseBranches`. (`SQLConf` is not changed.)
    - Since `Optimizer` uses that configuration, I think `CatalystConf` is the more appropriate place.

What do you think about item 3?
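The split and the rule described in this comment can be sketched as a standalone toy model. The class names mirror the PR, but the bodies are simplified assumptions: `Expr`, `Lit`, `Conf`, and `branchesOf` are hypothetical helpers defined here, the config field uses the name the thread later settles on, and none of this is Spark's Catalyst code.

```scala
// Toy model only: these are NOT Spark's Catalyst classes. Names mirror the PR;
// the bodies are simplified assumptions for illustration.
object CaseWhenSplitSketch {
  sealed trait Expr
  final case class Lit(value: Int) extends Expr

  // Shared shape of both variants: (condition, value) branches plus an optional else.
  sealed trait CaseWhenBase extends Expr {
    def branches: Seq[(Expr, Expr)]
    def elseValue: Option[Expr]
  }

  // Default variant; stands in for the interpreted (codegen fallback) path.
  final case class CaseWhen(
      branches: Seq[(Expr, Expr)],
      elseValue: Option[Expr] = None) extends CaseWhenBase

  // Variant the optimizer selects when code generation is considered worthwhile.
  final case class CaseWhenCodegen(
      branches: Seq[(Expr, Expr)],
      elseValue: Option[Expr] = None) extends CaseWhenBase

  // Hypothetical stand-in for the configuration trait discussed in the thread.
  final case class Conf(maxCaseBranchesForCodegen: Int)

  // Swaps CaseWhen for CaseWhenCodegen only below the configured branch limit.
  final case class OptimizeCodegen(conf: Conf) {
    def apply(e: Expr): Expr = e match {
      case CaseWhen(branches, elseValue) if branches.size < conf.maxCaseBranchesForCodegen =>
        CaseWhenCodegen(branches, elseValue)
      case other => other
    }
  }

  // Builds n identical placeholder branches for the demo.
  def branchesOf(n: Int): Seq[(Expr, Expr)] = Seq.fill(n)((Lit(1), Lit(2)))

  def main(args: Array[String]): Unit = {
    val rule = OptimizeCodegen(Conf(maxCaseBranchesForCodegen = 3))
    println(rule(CaseWhen(branchesOf(2)))) // becomes a CaseWhenCodegen
    println(rule(CaseWhen(branchesOf(3)))) // stays a CaseWhen (3 is not below the limit)
  }
}
```

Keeping the decision in a rule rather than inside the expression means the threshold comes from configuration the optimizer already holds, which is the motivation given for item 3.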
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211725663 **[Test build #56191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56191/consoleFull)** for PR 12353 at commit [`a9294bd`](https://github.com/apache/spark/commit/a9294bdd01c125dcc7a7b232a7b14b476678e731).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211600222 It might be best to create an `OptimizeCodegen` rule as the very last batch. We can add other things to that rule in the future.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211600358 Thanks. I will create `OptimizeCodegen` then.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211599434 For the optimizer, may I implement this in `SimplifyConditionals`? Or should I create another rule, such as `CaseWhenCodegen`?
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211592882 I see. Thank you for the solution!
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-211582991 OK, I thought about this more. To make this really work, we should just create two expressions: one for the codegen version and the other for the interpreted (default) version. Then, in the optimizer, we switch to the codegen version if the number of branches is less than x.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r60096003

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala ---
    @@ -305,7 +307,7 @@ case class WholeStageCodegen(child: SparkPlan) extends UnaryNode with CodegenSup
        * @return the tuple of the codegen context and the actual generated source.
        */
       def doCodeGen(): (CodegenContext, String) = {
    -    val ctx = new CodegenContext
    +    val ctx = new CodegenContext(SQLContext.getActive().get.conf)
    --- End diff --

Oh, I haven't tested it in non-local mode. If that is a problem, is there any way to access the `SQLContext` configuration on executors?
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r59974489

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala ---
    @@ -305,7 +307,7 @@ case class WholeStageCodegen(child: SparkPlan) extends UnaryNode with CodegenSup
        * @return the tuple of the codegen context and the actual generated source.
        */
       def doCodeGen(): (CodegenContext, String) = {
    -    val ctx = new CodegenContext
    +    val ctx = new CodegenContext(SQLContext.getActive().get.conf)
    --- End diff --

Is this a problem in non-local mode? `SQLContext.getActive` is not available on the executors.
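The concern raised here can be illustrated with a toy model of a driver-only lookup. This is a sketch under loud assumptions: `DriverConf`, `ActiveContext`, and `Lookup` are hypothetical names invented for illustration and do not reproduce Spark's `SQLContext` API. The point is only that calling `.get` on an empty `Option` fails wherever nothing has been registered, whereas a plain value resolved up front can be passed along safely.

```scala
// Toy model of a driver-only registry; NOT Spark's SQLContext API.
case class DriverConf(maxCaseBranches: Int)

object ActiveContext {
  // Assumption of this sketch: only the driver process ever calls set(),
  // so any other process would observe None.
  private var active: Option[DriverConf] = None
  def set(conf: DriverConf): Unit = { active = Some(conf) }
  def clear(): Unit = { active = None }
  def getActive: Option[DriverConf] = active
}

object Lookup {
  // Fragile: throws NoSuchElementException when no context is registered.
  def thresholdUnsafe(): Int = ActiveContext.getActive.get.maxCaseBranches

  // Safer: the caller resolves the configuration where it exists and passes it in.
  def thresholdWith(conf: DriverConf): Int = conf.maxCaseBranches
}
```

Resolving the setting once, in a place where the context is known to exist, keeps the dependency explicit instead of relying on process-wide state.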
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-210875400 Actually, it's not needed. If I have missed anything, could you give me some advice?
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-210762825 Oh, sorry. I didn't say it explicitly. I was thinking of changing `private[spark] trait CodegenConf` to `trait CodegenConf`, because `CatalystConf` was changed like that.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-210726342 @dongjoon-hyun what's your idea on how to update this?
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-210619095 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55943/
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-210619093 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-210618748
**[Test build #55943 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55943/consoleFull)** for PR 12353 at commit [`6842e68`](https://github.com/apache/spark/commit/6842e68350ad10d5e6a8b349a695abd56bdfc980).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-210584514 **[Test build #55943 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55943/consoleFull)** for PR 12353 at commit [`6842e68`](https://github.com/apache/spark/commit/6842e68350ad10d5e6a8b349a695abd56bdfc980).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-210580549 Rebased to resolve conflicts.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209867510 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55808/
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209867507 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209867332
**[Test build #55808 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55808/consoleFull)** for PR 12353 at commit [`fe8510a`](https://github.com/apache/spark/commit/fe8510a9fb2c0d89b29b15c56a8090bb99065933).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209837462 **[Test build #55808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55808/consoleFull)** for PR 12353 at commit [`fe8510a`](https://github.com/apache/spark/commit/fe8510a9fb2c0d89b29b15c56a8090bb99065933).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209836535 Rebased to resolve conflicts.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209669387 Hi, @rxin. This PR is now ready for review. Could you review it, please?
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209667307 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55740/
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209667304 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209667080
**[Test build #55740 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55740/consoleFull)** for PR 12353 at commit [`2d7a509`](https://github.com/apache/spark/commit/2d7a5094b83ed26d447650d1bdba38011b12f34b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209639623 **[Test build #55740 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55740/consoleFull)** for PR 12353 at commit [`2d7a509`](https://github.com/apache/spark/commit/2d7a5094b83ed26d447650d1bdba38011b12f34b).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r59586139

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -620,7 +622,7 @@ abstract class CodeGenerator[InType <: AnyRef, OutType <: AnyRef] extends Loggin
        * expressions that don't support codegen
        */
       def newCodeGenContext(): CodegenContext = {
    -    new CodegenContext
    +    new CodegenContext(new SimpleCodegenConf)
    --- End diff --

Hmm. `CodegenContext` had better be a parameter. I will fix this.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209355042 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55705/
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209355037 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209354409
**[Test build #55705 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55705/consoleFull)** for PR 12353 at commit [`61b2dc3`](https://github.com/apache/spark/commit/61b2dc3e4950566dc1a6f0e12d20e1fb5ecc7ae2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class SimpleCodegenConf(maxCaseBranches: Int = 20) extends CodegenConf`
  * `class CodegenContext(codegenConf: CodegenConf)`
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209326070 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209326074 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55702/
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209325643
**[Test build #55702 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55702/consoleFull)** for PR 12353 at commit [`6295de5`](https://github.com/apache/spark/commit/6295de51a69c416849ed0e9aa685ee72b2d97a12).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209311322 **[Test build #55705 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55705/consoleFull)** for PR 12353 at commit [`61b2dc3`](https://github.com/apache/spark/commit/61b2dc3e4950566dc1a6f0e12d20e1fb5ecc7ae2).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r59509786

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -52,6 +52,8 @@ case class ExprCode(var code: String, var isNull: String, var value: String)
      */
     class CodegenContext {
     
    +  var conf: CatalystConf = null
    --- End diff --

Oh, sure. I'll change those things.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209291645 **[Test build #55702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55702/consoleFull)** for PR 12353 at commit [`6295de5`](https://github.com/apache/spark/commit/6295de51a69c416849ed0e9aa685ee72b2d97a12).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/12353#discussion_r59506777

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -52,6 +52,8 @@ case class ExprCode(var code: String, var isNull: String, var value: String)
      */
     class CodegenContext {
     
    +  var conf: CatalystConf = null
    --- End diff --

This is really hacky -- I'd put this in the constructor and make it a `val` rather than a `var`. Maybe we can also create a `CodegenConf` instead of reusing `CatalystConf`?
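The suggestion in this review comment, a constructor parameter held as a `val` instead of a nullable `var`, can be sketched as follows. The names `CodegenConf`, `SimpleCodegenConf`, and `CodegenContext` echo those that appear elsewhere in this thread, but the bodies are assumptions made for illustration; `MutableContext` is a hypothetical name for the "before" shape, and none of this is the code that was actually merged.

```scala
// Hypothetical sketch; names echo the thread, bodies are invented for illustration.
trait CodegenConf {
  def maxCaseBranches: Int
}

// Simple immutable configuration; the default value mirrors the signature quoted in the thread.
case class SimpleCodegenConf(maxCaseBranches: Int = 20) extends CodegenConf

// Before: a mutable field that starts as null and must be assigned after construction.
class MutableContext {
  var conf: CodegenConf = null
}

// After: the configuration is a constructor parameter stored as an immutable val,
// so a context can never be observed without its configuration.
class CodegenContext(val conf: CodegenConf)
```

With the second shape, forgetting to supply a configuration is a compile-time error rather than a `NullPointerException` at the first read of `conf`.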
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209290096 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209290099 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55701/ Test FAILed.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12353#issuecomment-209290084

**[Test build #55701 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55701/consoleFull)** for PR 12353 at commit [`9a83c7f`](https://github.com/apache/spark/commit/9a83c7f93c89fad7c486ad3ddd99b9abbef1d149).

* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12353#issuecomment-209289419 **[Test build #55701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55701/consoleFull)** for PR 12353 at commit [`9a83c7f`](https://github.com/apache/spark/commit/9a83c7f93c89fad7c486ad3ddd99b9abbef1d149).
[GitHub] spark pull request: [SPARK-14577][SQL] Add spark.sql.codegen.maxCa...
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/12353

[SPARK-14577][SQL] Add spark.sql.codegen.maxCaseBranches config option

## What changes were proposed in this pull request?

We currently disable codegen for `CaseWhen` if the number of branches is greater than 20 (in CaseWhen.MAX_NUM_CASES_FOR_CODEGEN). It would be better if this value is a non-public config defined in SQLConf.

## How was this patch tested?

Pass the Jenkins tests (including a new testcase `Support spark.sql.codegen.maxCaseBranches option`)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-14577

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12353.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #12353

commit 9a83c7f93c89fad7c486ad3ddd99b9abbef1d149
Author: Dongjoon Hyun
Date: 2016-04-13T00:20:22Z

[SPARK-14577][SQL] Add spark.sql.codegen.maxCaseBranches config option
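The behavior described in the pull request can be pictured with the small sketch below. It is not the patch itself: the object name, the helper functions, and the use of a plain `Map[String, String]` in place of SQLConf are assumptions made for illustration. Only the configuration key and the limit of 20 come from the description above.

```scala
// Hypothetical sketch: decide whether CaseWhen codegen applies, using a
// configurable branch limit instead of a hard-coded constant.
object CaseWhenCodegenSketch {
  // Key taken from the PR title.
  val MaxCaseBranchesKey = "spark.sql.codegen.maxCaseBranches"
  // The previously hard-coded limit mentioned in the PR description.
  val DefaultMaxCaseBranches = 20

  // A plain Map stands in for SQLConf here (an assumption for this sketch).
  def maxCaseBranches(settings: Map[String, String]): Int =
    settings.get(MaxCaseBranchesKey).map(_.toInt).getOrElse(DefaultMaxCaseBranches)

  // Codegen is disabled only when the branch count is greater than the limit.
  def shouldCodegen(numBranches: Int, settings: Map[String, String]): Boolean =
    numBranches <= maxCaseBranches(settings)

  def main(args: Array[String]): Unit = {
    println(shouldCodegen(21, Map.empty))                     // prints false
    println(shouldCodegen(21, Map(MaxCaseBranchesKey -> "30"))) // prints true
  }
}
```

With no setting present the sketch falls back to 20, so existing behavior is unchanged unless the option is set explicitly.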