[ https://issues.apache.org/jira/browse/SPARK-48667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
angerszhu updated SPARK-48667:
------------------------------
Description:
{code:python}
df.select(udf(lambda x: x, returnType=ExamplePointUDT(),
              useArrow=useArrow)("point"))
{code}
{code:java}
java.lang.AssertionError: assertion failed: Invalid schema from pandas_udf:
expected org.apache.spark.sql.test.ExamplePointUDT@49ccc723,
StructType(StructField(st,StructType(StructField(tt,TimestampType,true)),true)),
got ArrayType(DoubleType,false)
{code}
was:
{code:python}
df.select(udf(lambda x: x, returnType=ExamplePointUDT(),
              useArrow=useArrow)("point"))
{code}
{code:java}
java.lang.AssertionError: assertion failed: Invalid schema from pandas_udf:
expected BooleanType, LongType, StringType, StringType, DateType,
TimestampType, DayTimeIntervalType(0,3), DoubleType, ArrayType(LongType,true),
BinaryType, StructType(StructField(x,LongType,true)),
org.apache.spark.sql.test.ExamplePointUDT@49ccc723,
StructType(StructField(st,StructType(StructField(tt,TimestampType,true)),true)),
got BooleanType, LongType, StringType, StringType, DateType, TimestampType,
DayTimeIntervalType(0,3), DoubleType, ArrayType(LongType,true), BinaryType,
StructType(StructField(x,LongType,true)), ArrayType(DoubleType,false),
StructType(StructField(st,StructType(StructField(tt,TimestampType,true)),true))
{code}
> Arrow python UDFS didn't support UDT as output type
> ---------------------------------------------------
>
> Key: SPARK-48667
> URL: https://issues.apache.org/jira/browse/SPARK-48667
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.5.1, 3.4.3
> Reporter: angerszhu
> Priority: Major
>
> {code:python}
> df.select(udf(lambda x: x, returnType=ExamplePointUDT(),
>               useArrow=useArrow)("point"))
> {code}
>
> {code:java}
> java.lang.AssertionError: assertion failed: Invalid schema from pandas_udf:
> expected org.apache.spark.sql.test.ExamplePointUDT@49ccc723,
> StructType(StructField(st,StructType(StructField(tt,TimestampType,true)),true)),
> got ArrayType(DoubleType,false)
> {code}