[jira] [Updated] (SPARK-28441) PythonUDF used in correlated scalar subquery causes UnsupportedOperationException

2019-07-19 Thread Liang-Chi Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang-Chi Hsieh updated SPARK-28441:

Summary: PythonUDF used in correlated scalar subquery causes UnsupportedOperationException
(was: PythonUDF used in correlated scalar subquery causes )

> PythonUDF used in correlated scalar subquery causes 
> UnsupportedOperationException 
> --
>
>              Key: SPARK-28441
>              URL: https://issues.apache.org/jira/browse/SPARK-28441
>          Project: Spark
>       Issue Type: Bug
>       Components: PySpark, SQL
> Affects Versions: 3.0.0
>         Reporter: Huaxin Gao
>         Priority: Minor
>
> I found this while working on https://issues.apache.org/jira/browse/SPARK-28277
>  
> {code:java}
> >>> from pyspark.sql.functions import pandas_udf, PandasUDFType
> >>> @pandas_udf("string", PandasUDFType.SCALAR)
> ... def noop(x):
> ...     return x.apply(str)
> ... 
> >>> spark.udf.register("udf", noop)
> 
> >>> spark.sql("CREATE OR REPLACE TEMPORARY VIEW t1 as select * from values (\"one\", 1), (\"two\", 2), (\"three\", 3), (\"one\", NULL) as t1(k, v)")
> DataFrame[]
> >>> spark.sql("CREATE OR REPLACE TEMPORARY VIEW t2 as select * from values (\"one\", 1), (\"two\", 22), (\"one\", 5), (\"one\", NULL), (NULL, 5) as t2(k, v)")
> DataFrame[]
> >>> spark.sql("SELECT t1.k FROM t1 WHERE t1.v <= (SELECT udf(max(udf(t2.v))) FROM t2 WHERE udf(t2.k) = udf(t1.k))").show()
> py4j.protocol.Py4JJavaError: An error occurred while calling o65.showString.
> : java.lang.UnsupportedOperationException: Cannot evaluate expression: udf(null)
>  at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval(Expression.scala:296)
>  at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval$(Expression.scala:295)
>  at org.apache.spark.sql.catalyst.expressions.PythonUDF.eval(PythonUDF.scala:52)
> {code}
>  
>  
>  
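
For readers unfamiliar with correlated scalar subqueries, the following is a rough pure-Python model of what the failing query is meant to compute. It is only a sketch under stated assumptions: the udf(...) wrappers are dropped entirely (so the string conversion done by noop is not modelled), Spark is not involved, and the helper names scalar_subquery and run_query are hypothetical. SQL NULL is represented as None.

```python
# Rows of the two temporary views from the report; None stands for SQL NULL.
t1 = [("one", 1), ("two", 2), ("three", 3), ("one", None)]
t2 = [("one", 1), ("two", 22), ("one", 5), ("one", None), (None, 5)]


def scalar_subquery(outer_k):
    """Model of (SELECT max(t2.v) FROM t2 WHERE t2.k = outer_k).

    Returns None (NULL) when no non-NULL value qualifies. A NULL key never
    matches, because NULL = NULL does not evaluate to true in SQL.
    """
    if outer_k is None:
        return None
    values = [v for k, v in t2
              if k is not None and k == outer_k and v is not None]
    return max(values) if values else None


def run_query():
    """Model of SELECT t1.k FROM t1 WHERE t1.v <= (<scalar subquery>)."""
    result = []
    for k, v in t1:
        bound = scalar_subquery(k)  # re-evaluated per outer row (correlated)
        # A comparison involving NULL is not true, so the row is filtered out.
        if v is not None and bound is not None and v <= bound:
            result.append(k)
    return result


print(run_query())  # ['one', 'two']
```

The subquery is evaluated once per outer row, keyed on t1.k. This is the correlation that Spark has to rewrite before the Python UDFs can be executed.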



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28441) PythonUDF used in correlated scalar subquery causes UnsupportedOperationException

2019-07-19 Thread Liang-Chi Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang-Chi Hsieh updated SPARK-28441:

Priority: Major  (was: Minor)

> PythonUDF used in correlated scalar subquery causes 
> UnsupportedOperationException 
> --
>
>              Key: SPARK-28441
>              URL: https://issues.apache.org/jira/browse/SPARK-28441
>          Project: Spark
>       Issue Type: Bug
>       Components: PySpark, SQL
> Affects Versions: 3.0.0
>         Reporter: Huaxin Gao
>         Priority: Major
>


