[ 
https://issues.apache.org/jira/browse/SPARK-48667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angerszhu updated SPARK-48667:
------------------------------
    Description: 
{code:python}
df.select(udf(lambda x: x, returnType=ExamplePointUDT(), useArrow=useArrow)("point"))
{code}
 
{code:java}
 java.lang.AssertionError: assertion failed: Invalid schema from pandas_udf: 
expected BooleanType, LongType, StringType, StringType, DateType, 
TimestampType, DayTimeIntervalType(0,3), DoubleType, ArrayType(LongType,true), 
BinaryType, StructType(StructField(x,LongType,true)), 
org.apache.spark.sql.test.ExamplePointUDT@49ccc723, 
StructType(StructField(st,StructType(StructField(tt,TimestampType,true)),true)),
 got BooleanType, LongType, StringType, StringType, DateType, TimestampType, 
DayTimeIntervalType(0,3), DoubleType, ArrayType(LongType,true), BinaryType, 
StructType(StructField(x,LongType,true)), ArrayType(DoubleType,false), 
StructType(StructField(st,StructType(StructField(tt,TimestampType,true)),true))
 {code}
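
In the trace above, the only position where "expected" and "got" differ is the UDT column: the declared return type is ExamplePointUDT, while the value comes back as ArrayType(DoubleType,false), which appears to be the UDT's underlying storage type. The following is a minimal, Spark-free sketch of that kind of mismatch. Every name in it (the stand-in type classes, sql_type, strict_match, udt_aware_match) is hypothetical and chosen for illustration only; it is not Spark's actual implementation and not a proposed fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoubleType:
    """Hypothetical stand-in for a primitive SQL type."""


@dataclass(frozen=True)
class ArrayType:
    """Hypothetical stand-in for an array SQL type."""
    element_type: object
    contains_null: bool


@dataclass(frozen=True)
class ExamplePointUDT:
    """Hypothetical stand-in for a user-defined type (UDT)."""

    def sql_type(self):
        # Storage type the UDT is assumed to serialize to.
        return ArrayType(DoubleType(), False)


def strict_match(expected, actual):
    # Direct comparison: a UDT never equals its storage type.
    return expected == actual


def udt_aware_match(expected, actual):
    # Unwrap a UDT to its storage type before comparing.
    if hasattr(expected, "sql_type"):
        expected = expected.sql_type()
    return expected == actual


declared = ExamplePointUDT()               # declared return type of the UDF
returned = ArrayType(DoubleType(), False)  # type actually seen in the result

print(strict_match(declared, returned))     # False
print(udt_aware_match(declared, returned))  # True
```

Under these assumptions, a strict comparison rejects the column even though the returned data is a valid serialized form of the UDT, which matches the shape of the assertion failure reported here.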

> Arrow Python UDFs don't support UDT as output type
> --------------------------------------------------
>
>                 Key: SPARK-48667
>                 URL: https://issues.apache.org/jira/browse/SPARK-48667
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.5.1, 3.4.3
>            Reporter: angerszhu
>            Priority: Major
>


