[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11484


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-10 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194979845
  
LGTM, merging into master, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194668304
  
@davies Comments are addressed. Please let me know if you have further 
comments. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194659524
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194659527
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52810/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194659293
  
**[Test build #52810 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52810/consoleFull)**
 for PR 11484 at commit 
[`6f0ae35`](https://github.com/apache/spark/commit/6f0ae353a648f6b060b318726d550181bc40b4e2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r55628447
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala ---
@@ -93,7 +93,7 @@ case class Expand(
 child.asInstanceOf[CodegenSupport].produce(ctx, this)
   }
 
-  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): 
String = {
--- End diff --

done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194639075
  
**[Test build #52810 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52810/consoleFull)**
 for PR 11484 at commit 
[`6f0ae35`](https://github.com/apache/spark/commit/6f0ae353a648f6b060b318726d550181bc40b4e2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r55628054
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Sort.scala 
---
@@ -153,18 +155,22 @@ case class Sort(
  """.stripMargin.trim
   }
 
-  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): 
String = {
-val colExprs = child.output.zipWithIndex.map { case (attr, i) =>
-  BoundReference(i, attr.dataType, attr.nullable)
-}
+  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: 
String): String = {
+if (row != null) {
+  s"$sorterVariable.insertRow((UnsafeRow)$row.copy());"
--- End diff --

Yes. I checked it. Will remove this copy cal.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r55544426
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 
---
@@ -166,6 +174,15 @@ trait CodegenSupport extends SparkPlan {
   input: Seq[ExprCode],
   row: String = null): String = {
 ctx.freshNamePrefix = variablePrefix
+val realUsedInput =
--- End diff --

val evaluate = if (row != null && consumeUnsafeRow) {
  // Current plan can consume UnsafeRows directly
  ""
} else {
  evaluateRequiredVariables(child.output, inputVars, usedInput)
}


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r55544043
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 
---
@@ -109,7 +114,10 @@ trait CodegenSupport extends SparkPlan {
 * Consume the columns generated from current SparkPlan, call it's 
parent.
 */
   final def consume(ctx: CodegenContext, input: Seq[ExprCode], row: String 
= null): String = {
-if (input != null) {
+// We check if input expressions has same length as output when:
+// 1. parent can't consume UnsafeRow and input is not null.
+// 2. parent consumes UnsafeRow and row is null.
+if ((input != null && !parent.consumeUnsafeRow) || 
(parent.consumeUnsafeRow && row == null)) {
--- End diff --

Once input is not null, it have to have the required length. We should 
passing `null` instead of empty seq.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r55543800
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 
---
@@ -65,7 +65,12 @@ trait CodegenSupport extends SparkPlan {
   /**
 * Which SparkPlan is calling produce() of this one. It's itself for 
the first SparkPlan.
 */
-  private var parent: CodegenSupport = null
+  protected var parent: CodegenSupport = null
+
+  /**
+* Whether this SparkPlan accepts UnsafeRow as input in doConsume.
+*/
+  def consumeUnsafeRow: Boolean = false
--- End diff --

preferUnsafeRow ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r55542998
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Sort.scala 
---
@@ -153,18 +155,22 @@ case class Sort(
  """.stripMargin.trim
   }
 
-  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): 
String = {
-val colExprs = child.output.zipWithIndex.map { case (attr, i) =>
-  BoundReference(i, attr.dataType, attr.nullable)
-}
+  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: 
String): String = {
+if (row != null) {
+  s"$sorterVariable.insertRow((UnsafeRow)$row.copy());"
--- End diff --

Do we need the copy here? I think the sorter will copy it by itself.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194187748
  
@davies I've addressed your comments. I also made corresponding changes for 
#11274. Please see if this change is good now. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194185154
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194185159
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52732/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194184613
  
**[Test build #52732 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52732/consoleFull)**
 for PR 11484 at commit 
[`6400eb2`](https://github.com/apache/spark/commit/6400eb22aefa986fb5d96d3d6d0242778e2f332a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194183811
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52733/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194183809
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194183565
  
**[Test build #52733 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52733/consoleFull)**
 for PR 11484 at commit 
[`dea644a`](https://github.com/apache/spark/commit/dea644a74251c62f1ccb0fd095083d434acd1a8c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194146675
  
**[Test build #52733 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52733/consoleFull)**
 for PR 11484 at commit 
[`dea644a`](https://github.com/apache/spark/commit/dea644a74251c62f1ccb0fd095083d434acd1a8c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-194145448
  
**[Test build #52732 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52732/consoleFull)**
 for PR 11484 at commit 
[`6400eb2`](https://github.com/apache/spark/commit/6400eb22aefa986fb5d96d3d6d0242778e2f332a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-04 Thread kiszk
Github user kiszk commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-192173793
  
@viirya thank you for your explanation. I understood that this PR supports 
sort and operations regarding whole stage code generation


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-192155333
  
@kiszk this is not just for Sort operator.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread kiszk
Github user kiszk commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-192154924
  
Is it better to add "in sort" in a title of this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r54997367
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 
---
@@ -109,7 +114,10 @@ trait CodegenSupport extends SparkPlan {
 * Consume the columns generated from current SparkPlan, call it's 
parent.
 */
   final def consume(ctx: CodegenContext, input: Seq[ExprCode], row: String 
= null): String = {
-if (input != null) {
+// We check if input expressions has same length as output when:
+// 1. parent can't consume UnsafeRow and input is not null.
+// 2. parent consumes UnsafeRow and row is null.
+if ((input != null && !parent.consumeUnsafeRow) || 
(parent.consumeUnsafeRow && row == null)) {
--- End diff --

When the child knows its parent can consume UnsafeRow, it can choose to 
pass an UnsafeRow and empty input. If so, we don't need to check if 
`input.length == output.length` here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-192148712
  
@davies Yea. That will be good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-192146205
  
@viirya Can we wait for #11274 , then we could avoid some complicity. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r54996591
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 
---
@@ -125,14 +133,27 @@ trait CodegenSupport extends SparkPlan {
   row: String = null): String = {
 ctx.freshNamePrefix = variablePrefix
 if (row != null) {
-  ctx.currentVars = null
-  ctx.INPUT_ROW = row
-  val evals = child.output.zipWithIndex.map { case (attr, i) =>
-BoundReference(i, attr.dataType, attr.nullable).gen(ctx)
+  val evals: Seq[ExprCode] = if (!consumeUnsafeRow) {
+// If this SparkPlan can't consume UnsafeRow and there is an 
UnsafeRow,
+// we extract the columns from the row and call doConsume.
+ctx.currentVars = null
+ctx.INPUT_ROW = row
+child.output.zipWithIndex.map { case (attr, i) =>
+  BoundReference(i, attr.dataType, attr.nullable).gen(ctx)
+}
+  } else {
+// If this SparkPlan consumes UnsafeRow and there is an UnsafeRow,
+// we don't need to unpack variables from the row.
+Seq.empty
+  }
+  val evalCode = if (evals.isEmpty) {
+""
+  } else {
+s"${evals.map(_.code).mkString("\n")}"
   }
   s"""
- | ${evals.map(_.code).mkString("\n")}
- | ${doConsume(ctx, evals)}
+ | $evalCode
+ | ${doConsume(ctx, evals, row)}
""".stripMargin
 } else {
   doConsume(ctx, input)
--- End diff --

pass `null` here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r54996609
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 
---
@@ -125,14 +133,27 @@ trait CodegenSupport extends SparkPlan {
   row: String = null): String = {
 ctx.freshNamePrefix = variablePrefix
 if (row != null) {
-  ctx.currentVars = null
-  ctx.INPUT_ROW = row
-  val evals = child.output.zipWithIndex.map { case (attr, i) =>
-BoundReference(i, attr.dataType, attr.nullable).gen(ctx)
+  val evals: Seq[ExprCode] = if (!consumeUnsafeRow) {
+// If this SparkPlan can't consume UnsafeRow and there is an 
UnsafeRow,
+// we extract the columns from the row and call doConsume.
+ctx.currentVars = null
+ctx.INPUT_ROW = row
+child.output.zipWithIndex.map { case (attr, i) =>
+  BoundReference(i, attr.dataType, attr.nullable).gen(ctx)
+}
+  } else {
+// If this SparkPlan consumes UnsafeRow and there is an UnsafeRow,
+// we don't need to unpack variables from the row.
+Seq.empty
+  }
+  val evalCode = if (evals.isEmpty) {
+""
+  } else {
+s"${evals.map(_.code).mkString("\n")}"
   }
   s"""
- | ${evals.map(_.code).mkString("\n")}
- | ${doConsume(ctx, evals)}
+ | $evalCode
+ | ${doConsume(ctx, evals, row)}
""".stripMargin
 } else {
   doConsume(ctx, input)
--- End diff --

pass null here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r54996552
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala ---
@@ -93,7 +93,7 @@ case class Expand(
 child.asInstanceOf[CodegenSupport].produce(ctx, this)
   }
 
-  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): 
String = {
--- End diff --

I think we can remove the default value for row here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r54996447
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 
---
@@ -109,7 +114,10 @@ trait CodegenSupport extends SparkPlan {
 * Consume the columns generated from current SparkPlan, call it's 
parent.
 */
   final def consume(ctx: CodegenContext, input: Seq[ExprCode], row: String 
= null): String = {
-if (input != null) {
+// We check if input expressions has same length as output when:
+// 1. parent can't consume UnsafeRow and input is not null.
+// 2. parent consumes UnsafeRow and row is null.
+if ((input != null && !parent.consumeUnsafeRow) || 
(parent.consumeUnsafeRow && row == null)) {
--- End diff --

Why do we need to change this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11484#discussion_r54996174
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 
---
@@ -67,7 +67,12 @@ trait CodegenSupport extends SparkPlan {
   /**
 * Which SparkPlan is calling produce() of this one. It's itself for 
the first SparkPlan.
 */
-  private var parent: CodegenSupport = null
+  protected var parent: CodegenSupport = null
+
+  /**
+* Whether this SparkPlan accepts UnsafeRow as input in consumeChild.
--- End diff --

consumeChild -> doConsume


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-192128562
  
cc @davies @rxin @nongli 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-191642595
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-191642597
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52377/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-191642344
  
**[Test build #52377 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52377/consoleFull)**
 for PR 11484 at commit 
[`6941eb1`](https://github.com/apache/spark/commit/6941eb1370b7bbba35dcb45dd5ae59b43bfedf40).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-191597934
  
**[Test build #52377 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52377/consoleFull)**
 for PR 11484 at commit 
[`6941eb1`](https://github.com/apache/spark/commit/6941eb1370b7bbba35dcb45dd5ae59b43bfedf40).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-02 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-191597456
  
retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-191596933
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-191596934
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52367/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13636][SQL] Directly consume UnsafeRow ...

2016-03-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11484#issuecomment-191596814
  
**[Test build #52367 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52367/consoleFull)**
 for PR 11484 at commit 
[`6941eb1`](https://github.com/apache/spark/commit/6941eb1370b7bbba35dcb45dd5ae59b43bfedf40).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org