[GitHub] spark pull request #22789: [SPARK-25767][SQL] Fix lazily evaluated stream of...

2018-10-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22789


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22789: [SPARK-25767][SQL] Fix lazily evaluated stream of...

2018-10-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/22789#discussion_r228760802
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
 ---
@@ -319,4 +319,15 @@ class WholeStageCodegenSuite extends QueryTest with 
SharedSQLContext {
   assert(df.limit(1).collect() === Array(Row("bat", 8.0)))
 }
   }
+
+  test("SPARK-25767: Lazy evaluated stream of expressions handled 
correctly") {
--- End diff --

oh wait, I executed those commands in sparkShell and that falls back to 
interpreted mode when compilation fails. That is what happened here. Sorry 
about the fuss.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22789: [SPARK-25767][SQL] Fix lazily evaluated stream of...

2018-10-28 Thread peter-toth
Github user peter-toth commented on a diff in the pull request:

https://github.com/apache/spark/pull/22789#discussion_r228753985
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
@@ -146,7 +146,10 @@ trait CodegenSupport extends SparkPlan {
   if (outputVars != null) {
 assert(outputVars.length == output.length)
 // outputVars will be used to generate the code for UnsafeRow, so 
we should copy them
-outputVars.map(_.copy())
+outputVars.map(_.copy()) match {
--- End diff --

What I found is that in `prepareRowVar` the `evaluateVariables` call clears 
the code of `outputVars`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22789: [SPARK-25767][SQL] Fix lazily evaluated stream of...

2018-10-28 Thread peter-toth
Github user peter-toth commented on a diff in the pull request:

https://github.com/apache/spark/pull/22789#discussion_r228753682
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
 ---
@@ -319,4 +319,15 @@ class WholeStageCodegenSuite extends QueryTest with 
SharedSQLContext {
   assert(df.limit(1).collect() === Array(Row("bat", 8.0)))
 }
   }
+
+  test("SPARK-25767: Lazy evaluated stream of expressions handled 
correctly") {
--- End diff --

Strange, I've just rerun it on master and got:
```
08:34:45.754 ERROR 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: failed to 
compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', 
Line 81, Column 17: Expression "bhj_isNull_6" is not an rvalue
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
81, Column 17: Expression "bhj_isNull_6" is not an rvalue
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22789: [SPARK-25767][SQL] Fix lazily evaluated stream of...

2018-10-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/22789#discussion_r228749168
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
@@ -146,7 +146,10 @@ trait CodegenSupport extends SparkPlan {
   if (outputVars != null) {
 assert(outputVars.length == output.length)
 // outputVars will be used to generate the code for UnsafeRow, so 
we should copy them
-outputVars.map(_.copy())
+outputVars.map(_.copy()) match {
--- End diff --

What in `consume` is relying on a side effect of traversing the 
`outputVars`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22789: [SPARK-25767][SQL] Fix lazily evaluated stream of...

2018-10-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/22789#discussion_r228748979
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
 ---
@@ -319,4 +319,15 @@ class WholeStageCodegenSuite extends QueryTest with 
SharedSQLContext {
   assert(df.limit(1).collect() === Array(Row("bat", 8.0)))
 }
   }
+
+  test("SPARK-25767: Lazy evaluated stream of expressions handled 
correctly") {
--- End diff --

Even without this patch this test would pass (I just tried it on master). A 
stream always evaluates its first element, so you probably need to add another 
key here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22789: [SPARK-25767][SQL] Fix lazily evaluated stream of...

2018-10-22 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22789#discussion_r227008114
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
@@ -146,7 +146,7 @@ trait CodegenSupport extends SparkPlan {
   if (outputVars != null) {
 assert(outputVars.length == output.length)
 // outputVars will be used to generate the code for UnsafeRow, so 
we should copy them
-outputVars.map(_.copy())
+outputVars.map(_.copy()).toBuffer
--- End diff --

can we do something like:
```
val res = outputVars.map(_.copy())
res  match {
 case x: Stream => x.force
 case o => o
}
```

in order to avoid doing the `toBuffer` operation when not needed (please 
use more meaningful names than in my example in case it is feasible :) )?

cc @hvanhovell who I remember fixed a similar issue with `Stream`s.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org