HyukjinKwon commented on a change in pull request #33634:
URL: https://github.com/apache/spark/pull/33634#discussion_r682429156



##########
File path: python/pyspark/pandas/indexes/base.py
##########
@@ -2235,6 +2235,24 @@ def union(
         """
         Form the union of two Index objects.
 
+        .. note:: For duplicated values, pandas chooses the number of duplicates of self or other
+            with more duplicates. But counting all duplicates is very expensive for large data,
+            so pandas-on-Spark always chooses the number of duplicates in self.

Review comment:
       Is there a pandas release note that documents this behavior?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


