haiyangsun-db commented on code in PR #55611:
URL: https://github.com/apache/spark/pull/55611#discussion_r3192202608
##########
sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala:
##########
@@ -1461,14 +1463,36 @@ class Dataset[T] private[sql](
isBarrier: Boolean = false,
profile: ResourceProfile = null): DataFrame = {
val func = funcCol.expr
- Dataset.ofRows(
- sparkSession,
+ val output = toAttributes(func.dataType.asInstanceOf[StructType])
Review Comment:
mapInArrow (below) can reuse this code as well. The only difference is the
eval type.
This is a reminder that the eval type needs to be wired into
ExternalUserDefinedFunction.
Currently, the proto explicitly supports the eval type as part of UdfPayload:
https://github.com/apache/spark/pull/55657/changes#diff-9df5dd8319af6bd7238d49a6fd8f082e5b46cb92758dfe48ec35f1b085b16ef1R410
We should add an explicit, optional eval_type field to
ExternalUserDefinedFunction, to be set when the payload alone is not
enough.
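
A rough, self-contained sketch of the suggestion, not the actual Spark classes. Everything here is a hypothetical stand-in: the `EvalType` values, the simplified `UdfPayload` and `ExternalUserDefinedFunction` case classes, and the `resolveEvalType` and `planMapInBatches` helpers. It assumes that an explicit eval type on the function takes precedence and that the payload's eval type is the fallback, and it shows how one shared code path could serve both mapInPandas and mapInArrow with only the eval type differing.

```scala
// Hypothetical stand-ins for illustration only; none of these mirror Spark's real definitions.
sealed trait EvalType
case object MapPandasIter extends EvalType
case object MapArrowIter extends EvalType

// Simplified payload: the eval type may or may not be carried inside it.
final case class UdfPayload(bytes: Vector[Byte], evalType: Option[EvalType] = None)

// Proposed shape: an optional, explicit eval type next to the payload.
final case class ExternalUserDefinedFunction(
    payload: UdfPayload,
    evalType: Option[EvalType] = None)

object EvalTypeResolution {
  // Assumed precedence: the explicit field wins, then the payload's value.
  def resolveEvalType(udf: ExternalUserDefinedFunction): Option[EvalType] =
    udf.evalType.orElse(udf.payload.evalType)

  // Shared path for both map-style operators; only the eval type differs.
  // Returns a description string instead of building a real logical plan.
  def planMapInBatches(
      udf: ExternalUserDefinedFunction,
      isBarrier: Boolean = false): Either[String, String] =
    resolveEvalType(udf) match {
      case Some(et) => Right(s"MapInBatches(evalType=$et, isBarrier=$isBarrier)")
      case None     => Left("eval type is neither set explicitly nor present in the payload")
    }
}
```

With this shape, mapInPandas and mapInArrow would both call the shared helper and differ only in the eval type they pass, and a caller whose payload is opaque can still state the eval type explicitly.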
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]