Vicky Lu created SPARK-55990:
--------------------------------
Summary: [PySpark] Improve iterator UDF single-UDF assertion
messages in worker.py
Key: SPARK-55990
URL: https://issues.apache.org/jira/browse/SPARK-55990
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 4.1.1
Reporter: Vicky Lu
### What changes were proposed in this issue?
Improve assertion messages in `python/pyspark/worker.py` for iterator UDF
single-UDF checks.
Currently, the messages are generic (e.g., "One ... UDF expected here.") and do
not show the actual received count.
This issue proposes to make the messages more actionable by including the
actual `num_udfs` value.
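
As a rough illustration of the proposal, the sketch below shows an assertion whose message carries the received count. The helper name `check_single_udf`, the `eval_type_name` parameter, and the exact wording are assumptions made for this example; they are not the code or the message text in `worker.py`.

```python
def check_single_udf(eval_type_name: str, num_udfs: int) -> None:
    """Hypothetical helper, not taken from worker.py."""
    # A generic message would only say that one UDF is expected.
    # Including num_udfs shows the mismatch directly in the traceback.
    assert num_udfs == 1, (
        f"One {eval_type_name} UDF expected here, but got {num_udfs}."
    )


if __name__ == "__main__":
    try:
        check_single_udf("SQL_MAP_ARROW_ITER_UDF", 2)
    except AssertionError as exc:
        print(exc)
```

Only the message string differs from a generic assertion; the condition `num_udfs == 1` is the same, which is consistent with the message-only scope described in this issue.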
### Why are the changes needed?
When iterator UDF execution gets an unexpected number of UDFs, the current
message is not specific enough for debugging.
Including the actual count helps users quickly identify mismatches in UDF setup.
### Scope
This is a message-only improvement and does not change runtime
behavior or execution logic.
Affected checks:
- `SQL_MAP_ARROW_ITER_UDF`
- `SQL_SCALAR_PANDAS_ITER_UDF` / `SQL_SCALAR_ARROW_ITER_UDF` (SCALAR_ITER path)
- `SQL_MAP_PANDAS_ITER_UDF`
### Related work
Possibly related to SPARK-55579, but this issue is limited to assertion message
clarity in `worker.py`.
### Test plan
- Run a Python syntax check on the updated file.
- Confirm there is no behavior change other than the improved assertion message text.
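
One possible way to carry out the syntax check is sketched below using the standard-library `ast` module. The function name `syntax_ok` is an assumption for this example and is not part of Spark's test tooling.

```python
import ast


def syntax_ok(source: str) -> bool:
    """Return True if `source` parses as valid Python (hypothetical helper)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    # A well-formed assertion with a formatted message parses cleanly.
    print(syntax_ok('assert num_udfs == 1, f"got {num_udfs}"'))
    # An unclosed parenthesis is rejected.
    print(syntax_ok("assert num_udfs == 1, ("))
```

From a shell, `python -m py_compile python/pyspark/worker.py` performs an equivalent check on the file itself.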
--
This message was sent by Atlassian Jira
(v8.20.10#820010)