This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new e4bb341d376 Revert "[SPARK-34827][PYTHON][DOC] Remove outdated statements on distributed-sequence default index"
e4bb341d376 is described below

commit e4bb341d37661e93097e56e0087699bca60825fb
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Thu May 12 09:12:13 2022 +0900

    Revert "[SPARK-34827][PYTHON][DOC] Remove outdated statements on distributed-sequence default index"

    This reverts commit f75c00da3cf01e63d93cedbe480198413af41455.
---
 python/docs/source/user_guide/pandas_on_spark/options.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/python/docs/source/user_guide/pandas_on_spark/options.rst b/python/docs/source/user_guide/pandas_on_spark/options.rst
index 67b8f6841f5..c0d9b18c085 100644
--- a/python/docs/source/user_guide/pandas_on_spark/options.rst
+++ b/python/docs/source/user_guide/pandas_on_spark/options.rst
@@ -186,7 +186,9 @@ This is conceptually equivalent to the PySpark example as below:
 **distributed-sequence** (default): It implements a sequence that increases one by one, by group-by and
 group-map approach in a distributed manner. It still generates the sequential index globally.
 If the default index must be the sequence in a large dataset, this
-index has to be used. See the example below:
+index has to be used.
+Note that if more data are added to the data source after creating this index,
+then it does not guarantee the sequential index. See the example below:
 
 .. code-block:: python
 
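
The "group-by and group-map approach" named in the changed paragraph can be illustrated without Spark. What follows is a minimal pure-Python sketch under stated assumptions, not the pandas-on-Spark implementation: partitions are modeled as plain lists, and the helper name `attach_sequence_index` is hypothetical. It counts the rows in each partition, turns the counts into starting offsets, and numbers each row as its partition's offset plus its local position.

```python
from itertools import accumulate


def attach_sequence_index(partitions):
    """Assign a global 0-based sequential index to rows split across partitions.

    `partitions` is a list of lists standing in for distributed partitions
    (an assumption made purely for illustration).
    """
    # Step 1 (group-by): count the rows held by each partition.
    counts = [len(part) for part in partitions]
    # Step 2: a partition's starting offset is the number of rows before it.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Step 3 (group-map): number each partition's rows independently,
    # starting from that partition's offset.
    return [
        [(offset + i, row) for i, row in enumerate(part)]
        for offset, part in zip(offsets, partitions)
    ]


partitions = [["a", "b"], ["c"], ["d", "e", "f"]]
indexed = attach_sequence_index(partitions)
flat_index = [idx for part in indexed for idx, _ in part]
```

In this sketch the offsets depend on the row counts observed when the index is computed. Rows that arrive in the source afterward would change those counts, which is one way to read the note restored by this revert.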