This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new f378c7f  [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields
f378c7f is described below

commit f378c7fba29368ca32142a3b7fc169dabe6cb37f
Author: Liang-Chi Hsieh <vii...@gmail.com>
AuthorDate: Mon Mar 9 11:06:45 2020 -0700

    [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields

    ### What changes were proposed in this pull request?

    Adding a note to document `Row.asDict` behavior when there are duplicate fields.

    ### Why are the changes needed?

    When a row contains duplicate fields, `asDict` and `__getitem__` behave differently. We should document it to let users know the difference explicitly.

    ### Does this PR introduce any user-facing change?

    No. Only document change.

    ### How was this patch tested?

    Existing test.

    Closes #27853 from viirya/SPARK-30941.

    Authored-by: Liang-Chi Hsieh <vii...@gmail.com>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
    (cherry picked from commit d21aab403a0a32e8b705b38874c0b335e703bd5d)
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 python/pyspark/sql/types.py | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py
index 1d24c40..0d73963 100644
--- a/python/pyspark/sql/types.py
+++ b/python/pyspark/sql/types.py
@@ -1466,6 +1466,12 @@ class Row(tuple):
 
         :param recursive: turns the nested Row as dict (default: False).
 
+        .. note:: If a row contains duplicate field names, e.g., the rows of a join
+            between two :class:`DataFrame` that both have the fields of same names,
+            one of the duplicate fields will be selected by ``asDict``. ``__getitem__``
+            will also return one of the duplicate fields, however returned value might
+            be different to ``asDict``.
+
         >>> Row(name="Alice", age=11).asDict() == {'name': 'Alice', 'age': 11}
         True
         >>> row = Row(key=1, value=Row(name='a', age=2))

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
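
The difference described in the note can be illustrated without a Spark installation. The sketch below is not PySpark code: `MiniRow`, its `fields` attribute, and `as_dict` are hypothetical stand-ins. It assumes the dict is built by zipping field names with values (so a later duplicate overwrites an earlier one) while lookup by name uses the first matching position. The real `Row` may differ between Spark versions, so treat this only as a model of why the two results can disagree.

```python
class MiniRow(tuple):
    """Hypothetical stand-in for pyspark.sql.Row; not the real class."""

    def __new__(cls, fields, values):
        row = super().__new__(cls, values)
        row.fields = list(fields)  # field names, duplicates allowed
        return row

    def as_dict(self):
        # Assumption: names are zipped with values, so when a name
        # repeats, the later value overwrites the earlier one.
        return dict(zip(self.fields, self))

    def __getitem__(self, item):
        if isinstance(item, (int, slice)):
            return super().__getitem__(item)
        # Assumption: lookup by name uses the first matching position,
        # so the earliest duplicate wins.
        return super().__getitem__(self.fields.index(item))


# Models a joined row where both inputs contributed an "id" field.
row = MiniRow(["id", "name", "id"], [1, "Alice", 2])

by_name = row["id"]              # first "id" in this model
from_dict = row.as_dict()["id"]  # last "id" in this model
print(by_name, from_dict)
```

In this model the two lookups return different values for the same field name. Renaming or aliasing the clashing columns before the join avoids the ambiguity.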