[jira] [Updated] (SPARK-28441) PythonUDF used in correlated scalar subquery causes UnsupportedOperationException
[ https://issues.apache.org/jira/browse/SPARK-28441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang-Chi Hsieh updated SPARK-28441:
------------------------------------
    Summary: PythonUDF used in correlated scalar subquery causes UnsupportedOperationException  (was: PythonUDF used in correlated scalar subquery causes )

> PythonUDF used in correlated scalar subquery causes UnsupportedOperationException
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-28441
>                 URL: https://issues.apache.org/jira/browse/SPARK-28441
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 3.0.0
>            Reporter: Huaxin Gao
>            Priority: Minor
>
> I found this when doing https://issues.apache.org/jira/browse/SPARK-28277
>
> {code:java}
> >>> from pyspark.sql.functions import pandas_udf, PandasUDFType
> >>> @pandas_udf("string", PandasUDFType.SCALAR)
> ... def noop(x):
> ...     return x.apply(str)
> ...
> >>> spark.udf.register("udf", noop)
>
> >>> spark.sql("CREATE OR REPLACE TEMPORARY VIEW t1 as select * from values (\"one\", 1), (\"two\", 2),(\"three\", 3),(\"one\", NULL) as t1(k, v)")
> DataFrame[]
> >>> spark.sql("CREATE OR REPLACE TEMPORARY VIEW t2 as select * from values (\"one\", 1), (\"two\", 22),(\"one\", 5),(\"one\", NULL), (NULL, 5) as t2(k, v)")
> DataFrame[]
> >>> spark.sql("SELECT t1.k FROM t1 WHERE t1.v <= (SELECT udf(max(udf(t2.v))) FROM t2 WHERE udf(t2.k) = udf(t1.k))").show()
> py4j.protocol.Py4JJavaError: An error occurred while calling o65.showString.
> : java.lang.UnsupportedOperationException: Cannot evaluate expression: udf(null)
>     at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval(Expression.scala:296)
>     at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval$(Expression.scala:295)
>     at org.apache.spark.sql.catalyst.expressions.PythonUDF.eval(PythonUDF.scala:52)
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
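
For readers unfamiliar with correlated scalar subqueries, the following is a minimal plain-Python sketch of what the failing query would be expected to return if the UDF calls were transparent. It is an illustration only, not Spark code: it assumes `udf` behaves as an identity function (the registered `noop` in the report actually returns strings, so real Spark results may involve casts), and the helper names `scalar_subquery` and `expected_keys` are hypothetical. The rows are copied from the `t1` and `t2` views in the report.

```python
# Sample rows copied from the report's temporary views t1(k, v) and t2(k, v).
# None stands in for SQL NULL.
T1 = [("one", 1), ("two", 2), ("three", 3), ("one", None)]
T2 = [("one", 1), ("two", 22), ("one", 5), ("one", None), (None, 5)]


def scalar_subquery(outer_k, inner_rows):
    """Model of: SELECT max(t2.v) FROM t2 WHERE t2.k = <outer_k>.

    Assumption: udf(...) is treated as identity and is therefore omitted.
    """
    if outer_k is None:
        # NULL = anything is never true, so no inner row can match.
        return None
    matches = [
        v for k, v in inner_rows
        if k is not None and k == outer_k and v is not None
    ]
    # max() ignores NULLs and yields NULL when nothing is left to aggregate.
    return max(matches) if matches else None


def expected_keys(outer_rows, inner_rows):
    """Model of: SELECT t1.k FROM t1 WHERE t1.v <= (correlated subquery)."""
    result = []
    for k, v in outer_rows:
        bound = scalar_subquery(k, inner_rows)
        # A comparison with NULL on either side is not true, so the row is dropped.
        if v is not None and bound is not None and v <= bound:
            result.append(k)
    return result


print(expected_keys(T1, T2))
```

The subquery is re-evaluated once per outer row here, which is the naive reading of "correlated". A query engine would normally rewrite it into a join plus aggregate rather than loop this way.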
[jira] [Updated] (SPARK-28441) PythonUDF used in correlated scalar subquery causes UnsupportedOperationException
[ https://issues.apache.org/jira/browse/SPARK-28441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang-Chi Hsieh updated SPARK-28441:
------------------------------------
    Priority: Major  (was: Minor)

> PythonUDF used in correlated scalar subquery causes UnsupportedOperationException
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-28441
>                 URL: https://issues.apache.org/jira/browse/SPARK-28441
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 3.0.0
>            Reporter: Huaxin Gao
>            Priority: Major