[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11484 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194979845 LGTM, merging into master, thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194668304 @davies Comments are addressed. Please let me know if you have further comments. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194659524 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194659527 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52810/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194659293 **[Test build #52810 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52810/consoleFull)** for PR 11484 at commit [`6f0ae35`](https://github.com/apache/spark/commit/6f0ae353a648f6b060b318726d550181bc40b4e2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r55628447 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala --- @@ -93,7 +93,7 @@ case class Expand( child.asInstanceOf[CodegenSupport].produce(ctx, this) } - override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = { --- End diff -- done. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194639075 **[Test build #52810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52810/consoleFull)** for PR 11484 at commit [`6f0ae35`](https://github.com/apache/spark/commit/6f0ae353a648f6b060b318726d550181bc40b4e2). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r55628054 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Sort.scala --- @@ -153,18 +155,22 @@ case class Sort( """.stripMargin.trim } - override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = { -val colExprs = child.output.zipWithIndex.map { case (attr, i) => - BoundReference(i, attr.dataType, attr.nullable) -} + override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: String): String = { +if (row != null) { + s"$sorterVariable.insertRow((UnsafeRow)$row.copy());" --- End diff -- Yes. I checked it. Will remove this copy cal. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r55544426 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala --- @@ -166,6 +174,15 @@ trait CodegenSupport extends SparkPlan { input: Seq[ExprCode], row: String = null): String = { ctx.freshNamePrefix = variablePrefix +val realUsedInput = --- End diff -- val evaluate = if (row != null && consumeUnsafeRow) { // Current plan can consume UnsafeRows directly "" } else { evaluateRequiredVariables(child.output, inputVars, usedInput) } --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r55544043 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala --- @@ -109,7 +114,10 @@ trait CodegenSupport extends SparkPlan { * Consume the columns generated from current SparkPlan, call it's parent. */ final def consume(ctx: CodegenContext, input: Seq[ExprCode], row: String = null): String = { -if (input != null) { +// We check if input expressions has same length as output when: +// 1. parent can't consume UnsafeRow and input is not null. +// 2. parent consumes UnsafeRow and row is null. +if ((input != null && !parent.consumeUnsafeRow) || (parent.consumeUnsafeRow && row == null)) { --- End diff -- Once input is not null, it have to have the required length. We should passing `null` instead of empty seq. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r55543800 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala --- @@ -65,7 +65,12 @@ trait CodegenSupport extends SparkPlan { /** * Which SparkPlan is calling produce() of this one. It's itself for the first SparkPlan. */ - private var parent: CodegenSupport = null + protected var parent: CodegenSupport = null + + /** +* Whether this SparkPlan accepts UnsafeRow as input in doConsume. +*/ + def consumeUnsafeRow: Boolean = false --- End diff -- preferUnsafeRow ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r55542998 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Sort.scala --- @@ -153,18 +155,22 @@ case class Sort( """.stripMargin.trim } - override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = { -val colExprs = child.output.zipWithIndex.map { case (attr, i) => - BoundReference(i, attr.dataType, attr.nullable) -} + override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: String): String = { +if (row != null) { + s"$sorterVariable.insertRow((UnsafeRow)$row.copy());" --- End diff -- Do we need the copy here? I think the sorter will copy it by itself. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194187748 @davies I've addressed your comments. I also made corresponding changes for #11274. Please see if this change is good now. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194185154 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194185159 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52732/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194184613 **[Test build #52732 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52732/consoleFull)** for PR 11484 at commit [`6400eb2`](https://github.com/apache/spark/commit/6400eb22aefa986fb5d96d3d6d0242778e2f332a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194183811 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52733/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194183809 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194183565 **[Test build #52733 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52733/consoleFull)** for PR 11484 at commit [`dea644a`](https://github.com/apache/spark/commit/dea644a74251c62f1ccb0fd095083d434acd1a8c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194146675 **[Test build #52733 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52733/consoleFull)** for PR 11484 at commit [`dea644a`](https://github.com/apache/spark/commit/dea644a74251c62f1ccb0fd095083d434acd1a8c). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-194145448 **[Test build #52732 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52732/consoleFull)** for PR 11484 at commit [`6400eb2`](https://github.com/apache/spark/commit/6400eb22aefa986fb5d96d3d6d0242778e2f332a). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user kiszk commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-192173793 @viirya thank you for your explanation. I understood that this PR supports sort and operations regarding whole stage code generation --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-192155333 @kiszk this is not just for Sort operator. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user kiszk commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-192154924 Is it better to add "in sort" in a title of this PR? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r54997367 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala --- @@ -109,7 +114,10 @@ trait CodegenSupport extends SparkPlan { * Consume the columns generated from current SparkPlan, call it's parent. */ final def consume(ctx: CodegenContext, input: Seq[ExprCode], row: String = null): String = { -if (input != null) { +// We check if input expressions has same length as output when: +// 1. parent can't consume UnsafeRow and input is not null. +// 2. parent consumes UnsafeRow and row is null. +if ((input != null && !parent.consumeUnsafeRow) || (parent.consumeUnsafeRow && row == null)) { --- End diff -- When the child knows its parent can consume UnsafeRow, it can choose to pass an UnsafeRow and empty input. If so, we don't need to check if `input.length == output.length` here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-192148712 @davies Yea. That will be good. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-192146205 @viirya Can we wait for #11274 , then we could avoid some complicity. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r54996591 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala --- @@ -125,14 +133,27 @@ trait CodegenSupport extends SparkPlan { row: String = null): String = { ctx.freshNamePrefix = variablePrefix if (row != null) { - ctx.currentVars = null - ctx.INPUT_ROW = row - val evals = child.output.zipWithIndex.map { case (attr, i) => -BoundReference(i, attr.dataType, attr.nullable).gen(ctx) + val evals: Seq[ExprCode] = if (!consumeUnsafeRow) { +// If this SparkPlan can't consume UnsafeRow and there is an UnsafeRow, +// we extract the columns from the row and call doConsume. +ctx.currentVars = null +ctx.INPUT_ROW = row +child.output.zipWithIndex.map { case (attr, i) => + BoundReference(i, attr.dataType, attr.nullable).gen(ctx) +} + } else { +// If this SparkPlan consumes UnsafeRow and there is an UnsafeRow, +// we don't need to unpack variables from the row. +Seq.empty + } + val evalCode = if (evals.isEmpty) { +"" + } else { +s"${evals.map(_.code).mkString("\n")}" } s""" - | ${evals.map(_.code).mkString("\n")} - | ${doConsume(ctx, evals)} + | $evalCode + | ${doConsume(ctx, evals, row)} """.stripMargin } else { doConsume(ctx, input) --- End diff -- pass `null` here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r54996609 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala --- @@ -125,14 +133,27 @@ trait CodegenSupport extends SparkPlan { row: String = null): String = { ctx.freshNamePrefix = variablePrefix if (row != null) { - ctx.currentVars = null - ctx.INPUT_ROW = row - val evals = child.output.zipWithIndex.map { case (attr, i) => -BoundReference(i, attr.dataType, attr.nullable).gen(ctx) + val evals: Seq[ExprCode] = if (!consumeUnsafeRow) { +// If this SparkPlan can't consume UnsafeRow and there is an UnsafeRow, +// we extract the columns from the row and call doConsume. +ctx.currentVars = null +ctx.INPUT_ROW = row +child.output.zipWithIndex.map { case (attr, i) => + BoundReference(i, attr.dataType, attr.nullable).gen(ctx) +} + } else { +// If this SparkPlan consumes UnsafeRow and there is an UnsafeRow, +// we don't need to unpack variables from the row. +Seq.empty + } + val evalCode = if (evals.isEmpty) { +"" + } else { +s"${evals.map(_.code).mkString("\n")}" } s""" - | ${evals.map(_.code).mkString("\n")} - | ${doConsume(ctx, evals)} + | $evalCode + | ${doConsume(ctx, evals, row)} """.stripMargin } else { doConsume(ctx, input) --- End diff -- pass null here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r54996552 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala --- @@ -93,7 +93,7 @@ case class Expand( child.asInstanceOf[CodegenSupport].produce(ctx, this) } - override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = { --- End diff -- I think we can remove the default value for row here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r54996447 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala --- @@ -109,7 +114,10 @@ trait CodegenSupport extends SparkPlan { * Consume the columns generated from current SparkPlan, call it's parent. */ final def consume(ctx: CodegenContext, input: Seq[ExprCode], row: String = null): String = { -if (input != null) { +// We check if input expressions has same length as output when: +// 1. parent can't consume UnsafeRow and input is not null. +// 2. parent consumes UnsafeRow and row is null. +if ((input != null && !parent.consumeUnsafeRow) || (parent.consumeUnsafeRow && row == null)) { --- End diff -- Why do we need to change this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11484#discussion_r54996174 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala --- @@ -67,7 +67,12 @@ trait CodegenSupport extends SparkPlan { /** * Which SparkPlan is calling produce() of this one. It's itself for the first SparkPlan. */ - private var parent: CodegenSupport = null + protected var parent: CodegenSupport = null + + /** +* Whether this SparkPlan accepts UnsafeRow as input in consumeChild. --- End diff -- consumeChild -> doConsume --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-192128562 cc @davies @rxin @nongli --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-191642595 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-191642597 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52377/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-191642344 **[Test build #52377 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52377/consoleFull)** for PR 11484 at commit [`6941eb1`](https://github.com/apache/spark/commit/6941eb1370b7bbba35dcb45dd5ae59b43bfedf40). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-191597934 **[Test build #52377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52377/consoleFull)** for PR 11484 at commit [`6941eb1`](https://github.com/apache/spark/commit/6941eb1370b7bbba35dcb45dd5ae59b43bfedf40). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-191597456 retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-191596933 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-191596934 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52367/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11484#issuecomment-191596814 **[Test build #52367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52367/consoleFull)** for PR 11484 at commit [`6941eb1`](https://github.com/apache/spark/commit/6941eb1370b7bbba35dcb45dd5ae59b43bfedf40). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org