zero323 commented on a change in pull request #33399: URL: https://github.com/apache/spark/pull/33399#discussion_r671987962
##########
File path: python/pyspark/sql/functions.pyi
##########
@@ -359,13 +360,13 @@ def variance(col: ColumnOrName) -> Column: ...
 @overload
 def udf(
     f: Callable[..., Any], returnType: DataTypeOrString = ...
-) -> Callable[..., Column]: ...
+) -> UserDefinedFunction: ...

Review comment:
Yeah, this should be addressed, but I am not sure what the best approach is here. We'll definitely need a protocol. At first glance, something like this should be sufficient:

```python
class UserDefinedFunctionWrapper(Protocol):
    def __call__(self, *args: Column) -> Column: ...
    def asNondeterministic(self) -> UserDefinedFunctionWrapper: ...
```

It shouldn't break any existing annotations, but I haven't tested it thoroughly.

@luranhe Thanks for your proposal. Would you mind creating a [JIRA ticket](https://issues.apache.org/jira/projects/SPARK/issues/) and updating the title accordingly (it should be of the form `[SPARK-XXXXX][PYTHON] ...`)? Thanks in advance.
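
For illustration only, here is a minimal, untested-against-Spark sketch of how a structural protocol of this kind behaves. It assumes a stand-in `Column` class and a toy `FakeUdf` object (both hypothetical, not part of PySpark), and it uses the spelling `asNondeterministic`, which is the method name on PySpark's `UserDefinedFunction`. It is not the annotation that was eventually adopted, just a way to see why a `Protocol` accepts the wrapper while rejecting an arbitrary callable.

```python
from typing import Protocol, runtime_checkable


class Column:
    """Stand-in for pyspark.sql.Column (hypothetical, illustration only)."""

    def __init__(self, expr: str) -> None:
        self.expr = expr


@runtime_checkable
class UserDefinedFunctionWrapper(Protocol):
    """Structural type: callable on Columns and offering asNondeterministic."""

    def __call__(self, *args: Column) -> Column: ...
    def asNondeterministic(self) -> "UserDefinedFunctionWrapper": ...


class FakeUdf:
    """Toy object with the same shape as the wrapper proposed above."""

    def __init__(self, name: str, deterministic: bool = True) -> None:
        self.name = name
        self.deterministic = deterministic

    def __call__(self, *args: Column) -> Column:
        # Build a textual expression instead of a real Spark expression.
        return Column(f"{self.name}({', '.join(a.expr for a in args)})")

    def asNondeterministic(self) -> "FakeUdf":
        # Return a new wrapper rather than mutating this one.
        return FakeUdf(self.name, deterministic=False)


def apply_udf(f: UserDefinedFunctionWrapper, *cols: Column) -> Column:
    """Accepts anything that structurally matches the protocol."""
    return f(*cols)


if __name__ == "__main__":
    plus_one = FakeUdf("plus_one")
    print(apply_udf(plus_one, Column("a")).expr)
    # A plain function is callable but lacks asNondeterministic.
    print(isinstance(lambda c: c, UserDefinedFunctionWrapper))
```

Because the protocol is structural, `FakeUdf` never has to inherit from it, which is why this approach should leave existing `Callable[..., Column]` call sites type-checking while also exposing the extra method.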