[GitHub] spark pull request #22315: [SPARK-25308][SQL] ArrayContains function may ret...
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22315

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22315#discussion_r214647587

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -1464,17 +1464,33 @@ case class ArrayContains(left: Expression, right: Expression)
         nullSafeCodeGen(ctx, ev, (arr, value) => {
           val i = ctx.freshName("i")
           val getValue = CodeGenerator.getValue(arr, right.dataType, i)
    -      s"""
    -      for (int $i = 0; $i < $arr.numElements(); $i ++) {
    -        if ($arr.isNullAt($i)) {
    -          ${ev.isNull} = true;
    -        } else if (${ctx.genEqual(right.dataType, value, getValue)}) {
    -          ${ev.isNull} = false;
    -          ${ev.value} = true;
    -          break;
    -        }
    +      def checkAndSetIsNullCode(body: String) = if (nullable) {
    +        s"""
    +           |if ($arr.isNullAt($i)) {
    +           |${ev.isNull} = true;
    +           |} else {
    +           | $body
    +           |}
    +         """.stripMargin
    +      } else {
    +        body
           }
    -      """
    +      val unsetIsNullCode = if (nullable) s"${ev.isNull} = false;" else ""
    +      val code = checkAndSetIsNullCode(
    --- End diff --

    This seems too complicated just to save a few lines of duplicated code. How about:
    ```
    val loopBody = if (nullable) {
      s"""
        |if ($arr.isNullAt($i)) {
        |  ${ev.isNull} = true;
        |} else ...
      """
    } else {
      s"""
        |if (${ctx.genEqual(right.dataType, value, getValue)}) {...
      """
    }
    ```

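A minimal sketch of the `loopBody` idea suggested above, written as a standalone Scala object so it runs without Spark. Plain strings stand in for `ev.isNull`, `ev.value`, and the output of `ctx.genEqual`. The name `LoopBodySketch`, its parameters, and the way the elided branches are filled in are assumptions for illustration only, not the code that was merged.

```scala
object LoopBodySketch {
  // Hypothetical helper: builds the Java snippet placed inside the generated loop.
  // All arguments are plain strings standing in for Spark's codegen variables.
  def loopBody(
      nullable: Boolean,
      arr: String,
      i: String,
      isNull: String,
      value: String,
      equalsExpr: String): String =
    if (nullable) {
      // Nullable case: track whether a null element was seen before a match.
      s"""
         |if ($arr.isNullAt($i)) {
         |  $isNull = true;
         |} else if ($equalsExpr) {
         |  $isNull = false;
         |  $value = true;
         |  break;
         |}
       """.stripMargin
    } else {
      // Non-nullable case: the isNull variable is never referenced.
      s"""
         |if ($equalsExpr) {
         |  $value = true;
         |  break;
         |}
       """.stripMargin
    }

  // Wraps the body in the same for-loop shape as the original template.
  def loop(
      nullable: Boolean,
      arr: String,
      i: String,
      isNull: String,
      value: String,
      equalsExpr: String): String =
    s"""
       |for (int $i = 0; $i < $arr.numElements(); $i ++) {
       |  ${loopBody(nullable, arr, i, isNull, value, equalsExpr).trim}
       |}
     """.stripMargin
}
```

With `nullable = false` the generated text contains no assignment to the `isNull` variable at all, which is the property the review discussion is after.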
Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22315#discussion_r214596427

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -1464,17 +1464,35 @@ case class ArrayContains(left: Expression, right: Expression)
         nullSafeCodeGen(ctx, ev, (arr, value) => {
           val i = ctx.freshName("i")
           val getValue = CodeGenerator.getValue(arr, right.dataType, i)
    -      s"""
    -      for (int $i = 0; $i < $arr.numElements(); $i ++) {
    -        if ($arr.isNullAt($i)) {
    -          ${ev.isNull} = true;
    -        } else if (${ctx.genEqual(right.dataType, value, getValue)}) {
    -          ${ev.isNull} = false;
    -          ${ev.value} = true;
    -          break;
    +      def checkAndSetIsNullCode(body: String) = {
    +        if (nullable) {
    +          s"""
    +             |if ($arr.isNullAt($i)) {
    --- End diff --

    @maropu This should be rejected in checkInputDataTypes(), no?

Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22315#discussion_r214596495

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -1464,17 +1464,35 @@ case class ArrayContains(left: Expression, right: Expression)
         nullSafeCodeGen(ctx, ev, (arr, value) => {
           val i = ctx.freshName("i")
           val getValue = CodeGenerator.getValue(arr, right.dataType, i)
    -      s"""
    -      for (int $i = 0; $i < $arr.numElements(); $i ++) {
    -        if ($arr.isNullAt($i)) {
    -          ${ev.isNull} = true;
    -        } else if (${ctx.genEqual(right.dataType, value, getValue)}) {
    -          ${ev.isNull} = false;
    -          ${ev.value} = true;
    -          break;
    +      def checkAndSetIsNullCode(body: String) = {
    --- End diff --

    @maropu ok..

Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22315#discussion_r214564677

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -1464,17 +1464,35 @@ case class ArrayContains(left: Expression, right: Expression)
         nullSafeCodeGen(ctx, ev, (arr, value) => {
           val i = ctx.freshName("i")
           val getValue = CodeGenerator.getValue(arr, right.dataType, i)
    -      s"""
    -      for (int $i = 0; $i < $arr.numElements(); $i ++) {
    -        if ($arr.isNullAt($i)) {
    -          ${ev.isNull} = true;
    -        } else if (${ctx.genEqual(right.dataType, value, getValue)}) {
    -          ${ev.isNull} = false;
    -          ${ev.value} = true;
    -          break;
    +      def checkAndSetIsNullCode(body: String) = {
    --- End diff --

    nit: How about `def checkAndSetIsNullCode(body: String) = if (nullable) {`?

Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22315#discussion_r214564353

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -1464,17 +1464,35 @@ case class ArrayContains(left: Expression, right: Expression)
         nullSafeCodeGen(ctx, ev, (arr, value) => {
           val i = ctx.freshName("i")
           val getValue = CodeGenerator.getValue(arr, right.dataType, i)
    -      s"""
    -      for (int $i = 0; $i < $arr.numElements(); $i ++) {
    -        if ($arr.isNullAt($i)) {
    -          ${ev.isNull} = true;
    -        } else if (${ctx.genEqual(right.dataType, value, getValue)}) {
    -          ${ev.isNull} = false;
    -          ${ev.value} = true;
    -          break;
    +      def checkAndSetIsNullCode(body: String) = {
    +        if (nullable) {
    +          s"""
    +             |if ($arr.isNullAt($i)) {
    --- End diff --

    How about the case where `right.nullable = true` and `left.nullable = false AND left.dataType.asInstanceOf[ArrayType].containsNull = false`?

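One way to read the case raised above, as a standalone sketch. It assumes the expression's `nullable` flag is the OR of `left.nullable`, `right.nullable`, and the array type's `containsNull`; that rule, and every name below, is an assumption for illustration rather than something stated in this thread. Under that assumption, gating the generated `isNullAt` check on `nullable` would still emit the check for an array whose elements can never be null.

```scala
object NullabilitySketch {
  // Hypothetical, simplified models of the two operands; not Spark's Expression API.
  final case class ArrayOperand(nullable: Boolean, containsNull: Boolean)
  final case class ValueOperand(nullable: Boolean)

  // Assumed rule: the result may be null if either input may be null
  // or the array may hold null elements.
  def resultNullable(left: ArrayOperand, right: ValueOperand): Boolean =
    left.nullable || right.nullable || left.containsNull

  // A narrower condition that only asks whether an element can be null.
  def needsElementNullCheck(left: ArrayOperand): Boolean =
    left.containsNull
}
```

For `right.nullable = true`, `left.nullable = false`, and `containsNull = false`, the first predicate is true while the second is false, which is the gap the question points at.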
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22315#discussion_r214542052

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -1464,17 +1464,27 @@ case class ArrayContains(left: Expression, right: Expression)
         nullSafeCodeGen(ctx, ev, (arr, value) => {
           val i = ctx.freshName("i")
           val getValue = CodeGenerator.getValue(arr, right.dataType, i)
    -      s"""
    -      for (int $i = 0; $i < $arr.numElements(); $i ++) {
    -        if ($arr.isNullAt($i)) {
    -          ${ev.isNull} = true;
    -        } else if (${ctx.genEqual(right.dataType, value, getValue)}) {
    -          ${ev.isNull} = false;
    -          ${ev.value} = true;
    -          break;
    -        }
    +      val checkAndSetIsNullCode = if (nullable) {
    --- End diff --

    Could you update this part based on [this comment](https://github.com/apache/spark/pull/21103#discussion_r205999519) instead of `if (...) { ... } else`?

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22315#discussion_r214532569

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -1464,17 +1464,28 @@ case class ArrayContains(left: Expression, right: Expression)
         nullSafeCodeGen(ctx, ev, (arr, value) => {
           val i = ctx.freshName("i")
           val getValue = CodeGenerator.getValue(arr, right.dataType, i)
    -      s"""
    -      for (int $i = 0; $i < $arr.numElements(); $i ++) {
    -        if ($arr.isNullAt($i)) {
    -          ${ev.isNull} = true;
    -        } else if (${ctx.genEqual(right.dataType, value, getValue)}) {
    -          ${ev.isNull} = false;
    -          ${ev.value} = true;
    -          break;
    -        }
    +      val checkAndSetIsNullCode = if (nullable) {
    +        s"""
    +           |if ($arr.isNullAt($i)) {
    +           |${ev.isNull} = true;
    +           |} else
    +           |
    --- End diff --

    nit: extra newline

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22315#discussion_r214532578

    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala ---
    @@ -383,10 +383,13 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
         val a3 = Literal.create(null, ArrayType(StringType))
         val a4 = Literal.create(Seq(create_row(1)), ArrayType(StructType(Seq(
           StructField("a", IntegerType, true)
    +    // Explicitly mark the array type not nullable (spark-x)
    --- End diff --

    nit: `spark-x`?

GitHub user dilipbiswal opened a pull request:

    https://github.com/apache/spark/pull/22315

    [SPARK-25308][SQL] ArrayContains function may return a error in the code generation phase.

    ## What changes were proposed in this pull request?
    Invoking the ArrayContains function with a non-nullable array type throws the following error in the code generation phase. Below is the error snippet.
    ```SQL
    Code generation of array_contains([1,2,3], 1) failed:
    java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 11: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 11: Expression "isNull_0" is not an rvalue
    java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 11: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 11: Expression "isNull_0" is not an rvalue
        at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
        at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
        at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
        at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410)
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2380)
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
        at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
        at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
        at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1305)
    ```
    ## How was this patch tested?
    Added a test in CollectionExpressionsSuite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dilipbiswal/spark SPARK-25308

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22315.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22315

----
commit 84b135c7176cd1affe3da449ede28adf182ae733
Author: Dilip Biswal
Date:   2018-08-31T02:59:30Z

    [SPARK-25308] ArrayContains function may return a error in the code generation phase.