Andrew Grigorev created SPARK-43189:
---------------------------------------
             Summary: No overload variant of "pandas_udf" matches argument type "str"
                 Key: SPARK-43189
                 URL: https://issues.apache.org/jira/browse/SPARK-43189
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 3.4.0, 3.3.2, 3.2.4
            Reporter: Andrew Grigorev

h2. Issue

Users who run mypy in their IDE or CI environment see very verbose error messages when using the {{pandas_udf}} function in PySpark, because of how {{pandas_udf}} is currently typed. As a workaround, the official documentation provides examples that suppress the error with {{# type: ignore[call-overload]}} (see the first snippet at the end of this description), but this is not an ideal solution.

h2. Example

Here is a code snippet that triggers the error when mypy is enabled:

{code:python}
from pyspark.sql.functions import pandas_udf
import pandas as pd

@pandas_udf("string")
def f(s: pd.Series) -> pd.Series:
    return pd.Series(["a"] * len(s), index=s.index)
{code}

Running mypy on this code produces a long and verbose error message, which makes it difficult for users to understand the actual issue and how to resolve it.

h2. Proposed Solution

We ask the PySpark development team to review and improve the typing of the {{pandas_udf}} function so that these verbose error messages no longer appear; an illustrative sketch of one possible direction is included at the end of this description. This improvement will give users who run mypy in their development environment a better experience when using PySpark.

Furthermore, we suggest updating the official documentation to provide examples that do not rely on {{# type: ignore[call-overload]}} to suppress these errors.

h2. Impact

Addressing this issue will let PySpark users with mypy enabled in their development environment write and verify their code more efficiently, without being overwhelmed by verbose error messages. This will lead to a more enjoyable and productive experience when working with PySpark and pandas UDFs.
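For reference, below is a minimal sketch of the workaround mentioned in the Issue section, applied to the example above. It assumes the suppression comment is placed on the decorator line, since that is where the failing overload resolution is reported:

{code:python}
from pyspark.sql.functions import pandas_udf
import pandas as pd

# Suppress the overload-resolution error on the decorator call itself;
# this is the suppression style the documentation currently relies on.
@pandas_udf("string")  # type: ignore[call-overload]
def f(s: pd.Series) -> pd.Series:
    return pd.Series(["a"] * len(s), index=s.index)
{code}

This silences mypy, but it also hides any genuine typing mistakes on that line, which is why fixing the overloads themselves is preferable.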
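As a purely illustrative sketch of the direction mentioned in the Proposed Solution section (these are not PySpark's actual stubs, and the signatures below are simplified assumptions that omit {{functionType}} and other combinations), the decorator-with-argument form could be covered by an overload that accepts a string or {{DataType}} as the first argument:

{code:python}
from typing import Any, Callable, Union, overload

from pyspark.sql.types import DataType

# Decorator form: pandas_udf("string") returns a decorator that wraps the
# user's function.
@overload
def pandas_udf(f: Union[str, DataType]) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    ...

# Direct form: pandas_udf(func, "string") returns the wrapped function.
@overload
def pandas_udf(f: Callable[..., Any], returnType: Union[str, DataType]) -> Callable[..., Any]:
    ...

def pandas_udf(f: Any = None, returnType: Any = None) -> Any:
    ...  # implementation elided; sketch only
{code}

The point is only that a decorator call with a plain string return type should match some overload, so the documented examples would type-check without {{# type: ignore[call-overload]}}.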