Vicky Lu created SPARK-55990:
--------------------------------

             Summary: [PySpark] Improve iterator UDF single-UDF assertion 
messages in worker.py
                 Key: SPARK-55990
                 URL: https://issues.apache.org/jira/browse/SPARK-55990
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 4.1.1
            Reporter: Vicky Lu


### What changes were proposed in this issue?

Improve assertion messages in `python/pyspark/worker.py` for iterator UDF 
single-UDF checks.
Currently, the messages are generic (e.g., "One ... UDF expected here.") and do 
not show the actual received count.

This issue proposes to make the messages more actionable by including the 
actual `num_udfs` value.
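
A minimal sketch of the kind of message this proposes. The helper name `check_single_udf` and the exact wording are assumptions made for illustration; they are not taken from `worker.py`, which may structure the check differently.

```python
def check_single_udf(eval_type_name: str, num_udfs: int) -> None:
    """Hypothetical helper: fail with a message that reports the actual count."""
    # The message names the eval type and the received count, instead of a
    # generic "One ... UDF expected here." text.
    assert num_udfs == 1, (
        f"Expected exactly one UDF for {eval_type_name}, but got {num_udfs}."
    )


# A count of 1 passes silently; any other count reports the mismatch.
check_single_udf("SQL_MAP_ARROW_ITER_UDF", 1)
try:
    check_single_udf("SQL_MAP_ARROW_ITER_UDF", 2)
except AssertionError as exc:
    print(exc)
```

Because only the message string differs, the condition being asserted (`num_udfs == 1`) stays the same as in a generic assertion.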

### Why are the changes needed?

When iterator UDF execution gets an unexpected number of UDFs, the current 
message is not specific enough for debugging.
Including the actual count helps users quickly identify mismatches in UDF setup.

### Scope

This is a message-only improvement to the assertions listed below and does not 
change runtime behavior or execution logic.

Affected checks:
- `SQL_MAP_ARROW_ITER_UDF`
- `SQL_SCALAR_PANDAS_ITER_UDF` / `SQL_SCALAR_ARROW_ITER_UDF` (SCALAR_ITER path)
- `SQL_MAP_PANDAS_ITER_UDF`

### Related work

Possibly related to SPARK-55579, but this issue is limited to assertion message 
clarity in `worker.py`.

### Test plan

- Run a Python syntax check on the updated file.
- Ensure no behavior change besides improved assertion message text.
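
One way the syntax check could be done, as a sketch: the built-in `compile()` parses source without executing it, so it catches syntax errors such as a malformed f-string in a new message. The helper name `syntax_ok` and the default filename label are assumptions for illustration; running `python -m py_compile python/pyspark/worker.py` from the repository root would be an equivalent command-line check.

```python
def syntax_ok(source: str, filename: str = "<worker.py>") -> bool:
    """Hypothetical helper: return True if `source` parses as valid Python."""
    try:
        # "exec" mode compiles a whole module; nothing is executed.
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


# A well-formed assertion with an f-string message parses cleanly.
print(syntax_ok("assert num_udfs == 1, f'got {num_udfs}'"))
```

A syntax check only confirms that the file parses; it does not exercise the assertion paths themselves.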


