Yicong-Huang commented on code in PR #55530:
URL: https://github.com/apache/spark/pull/55530#discussion_r3270703811
##########
python/pyspark/sql/conversion.py:
##########
@@ -145,11 +146,26 @@ def enforce_schema(
If False, raise an error on type mismatch instead of casting.
safecheck : bool, default True
If True, use safe casting (fails on overflow/truncation).
+ reorder_by_name : bool, default True
+ If True, match columns by name and reorder to the target order; any
+ missing or extra names raise ``RESULT_COLUMN_NAMES_MISMATCH``. Output
+ columns are renamed to target names.
+ If False, match columns by position (ignore names) and preserve the
+ original column names in the output.
Returns
-------
- pa.RecordBatch
- RecordBatch with columns reordered and types coerced to match target schema.
+ pa.RecordBatch or pa.Table
+ Same container type as ``batch``, with columns matched (and possibly
+ reordered/cast) per the target schema.
+
+ Raises
+ ------
+ PySparkRuntimeError
+ ``RESULT_COLUMN_NAMES_MISMATCH`` when ``reorder_by_name=True`` and the
+ batch has missing or extra column names.
+ ``RESULT_COLUMN_TYPES_MISMATCH`` when any column's type does not match
+ the target (and either ``arrow_cast=False`` or the cast itself fails).
Review Comment:
added doc
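
For readers of this thread, a minimal sketch of the two matching modes the docstring describes. This is not the actual `enforce_schema` implementation. Assumptions: `match_columns` is a hypothetical helper, the batch is modeled as a plain list of (name, values) pairs instead of a `pa.RecordBatch`, a `ValueError` stands in for `PySparkRuntimeError`, and type coercion is left out entirely. The column-count check in positional mode is also an assumption, since the docstring does not say what happens there.

```python
from typing import Any, List, Sequence, Tuple

# (column name, values); a stand-in for an Arrow column, not a real pyarrow type.
Column = Tuple[str, List[Any]]


def match_columns(
    batch: Sequence[Column],
    target_names: Sequence[str],
    reorder_by_name: bool = True,
) -> List[Column]:
    """Hypothetical helper mirroring the documented ``reorder_by_name`` modes."""
    if reorder_by_name:
        actual = [name for name, _ in batch]
        missing = [n for n in target_names if n not in actual]
        extra = [n for n in actual if n not in target_names]
        if missing or extra:
            # Stand-in for the RESULT_COLUMN_NAMES_MISMATCH error class.
            raise ValueError(
                f"RESULT_COLUMN_NAMES_MISMATCH: missing={missing}, extra={extra}"
            )
        by_name = dict(batch)
        # Emit columns in target order, looked up by name.
        return [(n, by_name[n]) for n in target_names]

    # Positional mode: names are ignored and the original names are preserved.
    if len(batch) != len(target_names):
        # Assumption: the docstring does not specify this case.
        raise ValueError("column count does not match the target schema")
    return list(batch)


batch = [("b", [1, 2]), ("a", ["x", "y"])]
by_name = match_columns(batch, ["a", "b"])
by_position = match_columns(batch, ["a", "b"], reorder_by_name=False)
```

With `reorder_by_name=True` the columns come back in target order (`a`, then `b`); with `reorder_by_name=False` the batch keeps its original order and names (`b`, then `a`), and only the position decides which target field each column is paired with.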
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]