HyukjinKwon commented on a change in pull request #32835: URL: https://github.com/apache/spark/pull/32835#discussion_r648794071
##########
File path: python/docs/source/user_guide/pandas_on_spark/best_practices.rst
##########
@@ -227,19 +227,19 @@ See `Default Index Type <options.rst#default-index-type>`_ for more details abou
 Reduce the operations on different DataFrame/Series
 ---------------------------------------------------

-Koalas disallows the operations on different DataFrames (or Series) by default to prevent expensive operations.
+Pandas APIs on Spark disallows the operations on different DataFrames (or Series) by default to prevent expensive operations.
 It internally performs a join operation which can be expensive in general, which is discouraged. Whenever possible, this operation should be avoided.

 See `Operations on different DataFrames <options.rst#operations-on-different-dataframes>`_ for more details.

-Use Koalas APIs directly whenever possible
+Use pandas APIs on Spark APIs directly whenever possible
 ------------------------------------------

-Although Koalas has most of the pandas-equivalent APIs, there are several APIs not implemented yet or explicitly unsupported.
+Although pandas APIs on Spark has most of the pandas-equivalent APIs, there are several APIs not implemented yet or explicitly unsupported.

-As an example, Koalas does not implement ``__iter__()`` to prevent users from collecting all data into the client (driver) side from the whole cluster.
+As an example, pandas APIs on Spark does not implement ``__iter__()`` to prevent users from collecting all data into the client (driver) side from the whole cluster.

Review comment:
```suggestion
As an example, pandas APIs on Spark do not implement ``__iter__()`` to prevent users from collecting all data into the client (driver) side from the whole cluster.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.