[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163903552 **[Test build #47578 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47578/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163903666 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47578/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163903664 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47351058 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- ah that's it, the result type may be primitive and we should not assign null value to in, or NPE will happen. Should we create a JIRA for it? I think it's a different bug comparing to the one you fixed in #10259 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163983570 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163984799 **[Test build #2204 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2204/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163972430 Works for me. Thanks, guys! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47374514 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- Seems we are fine because we check `if (!${ev.isNull})` first? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163986683 **[Test build #47584 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47584/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47428568 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- sorry for being vague, I was trying to explain why the NPE happened and @davies has fixed it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163874386 hi @markhamstra , can you add some log in your UDF, to see if the NPE occurred before run into your UDF code or after? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163876482 @cloud-fan @markhamstra They should be all fixed (handling null in arguments and results). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163996602 **[Test build #47584 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47584/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163996672 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47584/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163997363 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163998492 **[Test build #47585 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47585/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163996670 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163997902 **[Test build #2205 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2205/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47391422 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- My understanding is that he was trying to explain the NPE reported by @markhamstra. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-164021742 **[Test build #47585 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47585/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-164022075 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-164022078 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47585/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47392510 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- uh, the value of `${ctx.defaultValue(dataType)}` is not `null` but `-1` when data type is `Integer`. I do not have more questions. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10266 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47388587 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- It will not cause NPE, but a compilation error? For example, if `dataType` is `Integer`, line 1049 will be ```int ev.value = null``` It will trigger a compilation error "incompatible types". Right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-164020361 **[Test build #2205 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2205/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_:\n * `class ExecutorClassLoader(`\n --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47391109 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- I can understand your fix, but I am trying to see what @cloud-fan said above. It sounds like he found another issue? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47391269 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- @gatorsmile At line 1049, we are using the default value. For primitive types, it will not be null. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-164009410 **[Test build #2204 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2204/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_:\n * `class ExecutorClassLoader(`\n * `case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends LeafExpression`\n --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/10266#discussion_r47389510 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -1029,24 +1029,27 @@ case class ScalaUDF( // such as IntegerType, its javaType is `int` and the returned type of user-defined // function is Object. Trying to convert an Object to `int` will cause casting exception. val evalCode = evals.map(_.code).mkString -val funcArguments = converterTerms.zipWithIndex.map { - case (converter, i) => -val eval = evals(i) -val dt = children(i).dataType -s"$converter.apply(${eval.isNull} ? null : (${ctx.boxedType(dt)}) ${eval.value})" -}.mkString(",") -val callFunc = s"${ctx.boxedType(ctx.javaType(dataType))} $resultTerm = " + - s"(${ctx.boxedType(ctx.javaType(dataType))})${catalystConverterTerm}" + -s".apply($funcTerm.apply($funcArguments));" +val (converters, funcArguments) = converterTerms.zipWithIndex.map { case (converter, i) => + val eval = evals(i) + val argTerm = ctx.freshName("arg") + val convert = s"Object $argTerm = ${eval.isNull} ? null : $converter.apply(${eval.value});" + (convert, argTerm) +}.unzip -evalCode + s""" - ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; - Boolean ${ev.isNull}; +val callFunc = s"${ctx.boxedType(dataType)} $resultTerm = " + + s"(${ctx.boxedType(dataType)})${catalystConverterTerm}" + +s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));" +s""" + $evalCode + ${converters.mkString("\n")} $callFunc - ${ev.value} = $resultTerm; - ${ev.isNull} = $resultTerm == null; + boolean ${ev.isNull} = $resultTerm == null; + ${ctx.javaType(dataType)} ${ev.value} = ${ctx.defaultValue(dataType)}; + if (!${ev.isNull}) { +${ev.value} = $resultTerm; --- End diff -- It's `Integer b = null; int a = (Integer) b;` , then NPE --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163879143 **[Test build #47578 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47578/consoleFull)** for PR 10266 at commit [`2125a1b`](https://github.com/apache/spark/commit/2125a1bac7edd75c1602806089cee6eff2e11660). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163792666 @gatorsmile Because your UDF is using primitive types, we have to chance to pass `null` in. In order to got the behavior you expect, you should change the UDF to use boxed type as parameters. Does it work for you? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163794997 Ok, I see your point. This is a possible workaround. It works well when the input values of primitive types are not null. I am just afraid how users know this? We might see more related JIRAs in the future. Is it documented? Or any way to avoid it? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10259 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163787042 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163787044 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47551/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163787017 Thank you @davies ! I guess we might still have a bug in the code. As long as any input variable is Null, the return result is `null`. Is that by design? For example, we have an input value `3`, but the return result is `Row(null)`. ```scala test("SPARK-12258 UDF and Null value") { hiveContext.runSqlHive("CREATE TABLE test (ti TINYINT, si SMALLINT, i INT, bi BIGINT, " + "bo BOOLEAN, f FLOAT, d DOUBLE, s STRING, bin BINARY, t TIMESTAMP, da DATE)" + "STORED AS TEXTFILE") hiveContext.runSqlHive("INSERT INTO TABLE test VALUES(Null, Null, 3, Null, Null, " + "Null, Null, Null, Null, Null, Null)") hiveContext.udf.register("typeNullCheck", (ti: Byte, si: Short, i: Int, bi: Long, bo: Boolean, f: Float, d: Double, s: String, bin: Array[Byte], t: Timestamp, da: Date) => (ti, si, i, bi, bo, f, d, s, bin, t, da)) checkAnswer( sql("SELECT typeNullCheck(ti, si, i, bi, bo, f, d, s, bin, t, da) FROM test"), Row(null, null, null, null, null, null, null, null, null, null, null)) } ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/10259 [SPARK-12258] [SQL] passing null into ScalaUDF Check nullability and passing them into ScalaUDF. Closes #10249 You can merge this pull request into a Git repository by running: $ git pull https://github.com/davies/spark udf_null Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10259.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10259 commit 4636fe3a61f86f2a313505ee771a0d2cda357360 Author: Davies LiuDate: 2015-12-10T23:18:37Z passing null into ScalaUDF --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163786748 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163788875 Yeah. Below is the result of ```scala sql("SELECT * FROM test").show(); ``` ``` +++---+++++++++ | ti| si| i| bi| bo| f| d| s| bin| t| da| +++---+++++++++ |null|null| 3|null|null|null|null|null|null|null|null| +++---+++++++++ ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163790754 the analyzer rule was introduced by https://github.com/apache/spark/pull/9770 ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163797546 LGTM pending jenkins. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163801607 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163787742 **[Test build #47552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47552/consoleFull)** for PR 10259 at commit [`4636fe3`](https://github.com/apache/spark/commit/4636fe3a61f86f2a313505ee771a0d2cda357360). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163796308 I tried use `java.lang` types and the result look good. @davies how about we update our doc (the `Data Types` section) to explain that those primitive types are not nullable. Specially, for those scala primitive types, even if users provide null, scala will convert them to the default value. We can do the doc update in another PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163802039 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163802040 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47552/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163801918 **[Test build #47552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47552/consoleFull)** for PR 10259 at commit [`4636fe3`](https://github.com/apache/spark/commit/4636fe3a61f86f2a313505ee771a0d2cda357360). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163788347 Does `INSERT INTO TABLE test VALUES` actually work? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163861470 @davies The exact same UDF worked fine in 1.5. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163862856 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163862544 **[Test build #47572 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47572/consoleFull)** for PR 10266 at commit [`c0f85bb`](https://github.com/apache/spark/commit/c0f85bb676443cd0d6d73f8beb4139fb062fef8c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163862860 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47572/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163868756 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163868667 **[Test build #47574 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47574/consoleFull)** for PR 10266 at commit [`c96b512`](https://github.com/apache/spark/commit/c96b512a8a30575b24e8e9dbba24e0ac7ae16121). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163858860 Still doesn't work for me. Now it ends up in a different place, but a NPE: ``` ... 2015-12-11 06:48:09,285 INFO org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection: Code generated in 145.67804 ms 2015-12-11 06:48:09,297 INFO org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection: Code generated in 4.438909 ms 2015-12-11 06:48:09,305 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 0.0 (TID 0) java.lang.NullPointerException at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source) at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51) at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.random.SamplingUtils$.reservoirSampleAndCount(SamplingUtils.scala:42) at org.apache.spark.RangePartitioner$$anonfun$9.apply(Partitioner.scala:261) at org.apache.spark.RangePartitioner$$anonfun$9.apply(Partitioner.scala:259) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$22.apply(RDD.scala:745) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$22.apply(RDD.scala:745) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2015-12-11 06:48:09,325 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NullPointerException at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source) at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51) at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.random.SamplingUtils$.reservoirSampleAndCount(SamplingUtils.scala:42) at org.apache.spark.RangePartitioner$$anonfun$9.apply(Partitioner.scala:261) at org.apache.spark.RangePartitioner$$anonfun$9.apply(Partitioner.scala:259) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$22.apply(RDD.scala:745) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$22.apply(RDD.scala:745) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) ... ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163860387 @markhamstra I think it's because of your UDF did not handle null correctly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163867183 **[Test build #2202 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2202/consoleFull)** for PR 10266 at commit [`c96b512`](https://github.com/apache/spark/commit/c96b512a8a30575b24e8e9dbba24e0ac7ae16121). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_:\n * `class ExecutorClassLoader(`\n --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163868758 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47574/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163855637 No problem; I'll cherry-pick another. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163855563 @markhamstra Sorry, just pushed a commit to fix it now, added a regression test. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163855762 @markhamstra Once it works, I will merge this to unblock RC2. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163843237 Sorry, but after this I am now seeing codeGen errors. Like this: ``` testTimestampArithmeticUDFs(com.clearstorydata.dataengine.udfs.TestTimestampArithmeticUDFs) Time elapsed: 3.84 sec <<< FAILURE! org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree: Exchange rangepartitioning(difference#0L ASC,8), None +- ConvertToSafe +- Project [TIMESTAMPDIFF(second,ts#9,TIMESTAMPADD(second,-1,ts#9)) AS difference#0L] +- HiveTableScan [ts#9], MetastoreRelation default, ts_test, None at com.clearstorydata.dataengine.udfs.TestTimestampArithmeticUDFs.testTimestampArithmeticUDFs(TestTimestampArithmeticUDFs.scala:31) Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.util.concurrent.ExecutionException: java.lang.Exception: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 78, Column 154: Incompatible expression types "void" and "int" /* 001 */ /* 002 */ public java.lang.Object generate(org.apache.spark.sql.catalyst.expressions.Expression[] exprs) { /* 003 */ return new SpecificUnsafeProjection(exprs); /* 004 */ } /* 005 */ /* 006 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection { /* 007 */ /* 008 */ private org.apache.spark.sql.catalyst.expressions.Expression[] expressions; /* 009 */ /* 010 */ private scala.Function1 catalystConverter2; /* 011 */ private scala.Function1 converter4; /* 012 */ private scala.Function1 converter5; /* 013 */ private scala.Function1 converter6; /* 014 */ private scala.Function3 udf7; /* 015 */ private scala.Function1 catalystConverter15; /* 016 */ private scala.Function1 converter17; /* 017 */ private scala.Function1 converter18; /* 018 */ private scala.Function1 converter19; /* 019 */ private scala.Function3 udf20; /* 020 */ private UnsafeRow result28; /* 021 */ private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder bufferHolder29; /* 022 */ private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter rowWriter30; /* 023 */ /* 024 */ /* 025 */ /* 026 */ public SpecificUnsafeProjection(org.apache.spark.sql.catalyst.expressions.Expression[] expressions) { /* 027 */ this.expressions = expressions; /* 028 */ this.catalystConverter2 = (scala.Function1)org.apache.spark.sql.catalyst.CatalystTypeConverters$.MODULE$.createToCatalystConverter(((org.apache.spark.sql.catalyst.expressions.ScalaUDF)expressions[0]).dataType()); /* 029 */ this.converter4 = (scala.Function1)org.apache.spark.sql.catalyst.CatalystTypeConverters$.MODULE$.createToScalaConverter(((org.apache.spark.sql.catalyst.expressions.Expression)(((org.apache.spark.sql.catalyst.expressions.ScalaUDF)expressions[0]).getChildren().apply(0))).dataType()); /* 030 */ this.converter5 = (scala.Function1)org.apache.spark.sql.catalyst.CatalystTypeConverters$.MODULE$.createToScalaConverter(((org.apache.spark.sql.catalyst.expressions.Expression)(((org.apache.spark.sql.catalyst.expressions.ScalaUDF)expressions[0]).getChildren().apply(1))).dataType()); /* 031 */ this.converter6 = (scala.Function1)org.apache.spark.sql.catalyst.CatalystTypeConverters$.MODULE$.createToScalaConverter(((org.apache.spark.sql.catalyst.expressions.Expression)(((org.apache.spark.sql.catalyst.expressions.ScalaUDF)expressions[0]).getChildren().apply(2))).dataType()); /* 032 */ this.udf7 = (scala.Function3)(((org.apache.spark.sql.catalyst.expressions.ScalaUDF)expressions[0]).userDefinedFunc()); /* 033 */ this.catalystConverter15 = (scala.Function1)org.apache.spark.sql.catalyst.CatalystTypeConverters$.MODULE$.createToCatalystConverter(((org.apache.spark.sql.catalyst.expressions.ScalaUDF)expressions[2]).dataType()); /* 034 */ this.converter17 = (scala.Function1)org.apache.spark.sql.catalyst.CatalystTypeConverters$.MODULE$.createToScalaConverter(((org.apache.spark.sql.catalyst.expressions.Expression)(((org.apache.spark.sql.catalyst.expressions.ScalaUDF)expressions[2]).getChildren().apply(0))).dataType()); /* 035 */ this.converter18 = (scala.Function1)org.apache.spark.sql.catalyst.CatalystTypeConverters$.MODULE$.createToScalaConverter(((org.apache.spark.sql.catalyst.expressions.Expression)(((org.apache.spark.sql.catalyst.expressions.ScalaUDF)expressions[2]).getChildren().apply(1))).dataType()); /* 036 */ this.converter19 =
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/10266 [SPARK-12258] [SQL] passing null into ScalaUDF (follow-up) This is a follow-up PR for #10259 You can merge this pull request into a Git repository by running: $ git pull https://github.com/davies/spark null_udf2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10266.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10266 commit c0f85bb676443cd0d6d73f8beb4139fb062fef8c Author: Davies LiuDate: 2015-12-11T05:44:23Z fix passing null into ScalaUDF --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163846453 @markhamstra Does https://github.com/apache/spark/pull/10266 fix it? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163848082 Could you try to add `(Integer)` before `null`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163851041 Wenchen, should the type of i be Integer? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163852763 I changed `int` to `Integer` and tried again ,the result is the same. And I also tried `Integer i = (Integer) -1;` which also failed to compile. I think the problem is when we use negative literal with explicit type cast, the `-` are mistakenly parsed and we need to wrap it with `()`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163855486 **[Test build #2202 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2202/consoleFull)** for PR 10266 at commit [`c96b512`](https://github.com/apache/spark/commit/c96b512a8a30575b24e8e9dbba24e0ac7ae16121). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163855323 It's not a Janino bug, `(Integer)-1` does not work in Java, faint :-( --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163855481 @davies This results in a slightly different failure from the one I previously reported: Everything looks the same as the prior post except now: ``` failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 78, Column 193: Expression "java.lang.Integer" is not an rvalue . . . /* 078 */ Long result16 = (Long)catalystConverter15.apply(udf20.apply(converter17.apply(isNull21 ? (UTF8String) null : (UTF8String) primitive22),converter18.apply(false ? (Integer) null : (Integer) -1),converter19.apply(isNull26 ? (Long) null : (Long) primitive27))); . . . ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163855851 LGTM pending tests. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163855980 **[Test build #47574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47574/consoleFull)** for PR 10266 at commit [`c96b512`](https://github.com/apache/spark/commit/c96b512a8a30575b24e8e9dbba24e0ac7ae16121). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163844582 hi @markhamstra , can you share you test code? so that we can reproduce it, thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163847683 **[Test build #47572 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47572/consoleFull)** for PR 10266 at commit [`c0f85bb`](https://github.com/apache/spark/commit/c0f85bb676443cd0d6d73f8beb4139fb062fef8c). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10266#issuecomment-163847721 I tried it locally, here is my findings: * `int i = false ? null : (Integer) -1;` **doesn't compile** * `int i = false ? null : (Integer) 1;` compiles * `int i = false ? null : (Integer) t;` compiles * `int i = false ? null : (Integer) (-1);` **compiles** So I think a simple fix is just adding `()` around `${eval.value}`, but I can't think of a test case to reproduce it... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12258] [SQL] passing null into ScalaUDF
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/10259#issuecomment-163856161 @davies No -- see the other PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org