[GitHub] [spark] HyukjinKwon commented on a change in pull request #32835: [SPARK-35591][PYTHON][DOCS] Rename "Koalas" to "pandas API on Spark" in the documents

GitBox Wed, 09 Jun 2021 18:53:12 -0700


HyukjinKwon commented on a change in pull request #32835:
URL: https://github.com/apache/spark/pull/32835#discussion_r648794071




##########
File path: python/docs/source/user_guide/pandas_on_spark/best_practices.rst
##########
@@ -227,19 +227,19 @@ See `Default Index Type <options.rst#default-index-type>`_ for more details abou
 Reduce the operations on different DataFrame/Series
 ---------------------------------------------------
 
-Koalas disallows the operations on different DataFrames (or Series) by default to prevent expensive operations.
+Pandas APIs on Spark disallows the operations on different DataFrames (or Series) by default to prevent expensive operations.
 It internally performs a join operation which can be expensive in general, which is discouraged. Whenever possible,
 this operation should be avoided.
 
 See `Operations on different DataFrames <options.rst#operations-on-different-dataframes>`_ for more details.
 
 
-Use Koalas APIs directly whenever possible
+Use pandas APIs on Spark APIs directly whenever possible
 ------------------------------------------
 
-Although Koalas has most of the pandas-equivalent APIs, there are several APIs not implemented yet or explicitly unsupported.
+Although pandas APIs on Spark has most of the pandas-equivalent APIs, there are several APIs not implemented yet or explicitly unsupported.
 
-As an example, Koalas does not implement ``__iter__()`` to prevent users from collecting all data into the client (driver) side from the whole cluster.
+As an example, pandas APIs on Spark does not implement ``__iter__()`` to prevent users from collecting all data into the client (driver) side from the whole cluster.

Review comment:
       ```suggestion
   As an example, pandas APIs on Spark do not implement ``__iter__()`` to prevent users from collecting all data into the client (driver) side from the whole cluster.
   ```
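
For readers of the paragraph under review, the design idea can be sketched in plain Python. This is only an illustration: `PartitionedSeries`, `to_local()`, and the error message are hypothetical names, not Spark's actual classes or implementation. The container refuses implicit iteration, so pulling every value to one place has to be an explicit call.

```python
from typing import List


class PartitionedSeries:
    """Hypothetical stand-in for a distributed Series; not Spark code."""

    def __init__(self, partitions: List[List[int]]) -> None:
        # Each inner list plays the role of data held on a separate worker.
        self._partitions = partitions

    def __iter__(self):
        # Iteration is deliberately unsupported: a plain for-loop would
        # silently gather every partition onto the client side.
        raise NotImplementedError(
            "iteration is not supported; call to_local() to collect explicitly"
        )

    def sum(self) -> int:
        # Aggregations can run per partition and then combine partial results.
        return sum(sum(part) for part in self._partitions)

    def to_local(self) -> List[int]:
        # Explicit opt-in that gathers all values into one local list.
        return [value for part in self._partitions for value in part]


series = PartitionedSeries([[1, 2], [3, 4], [5]])
total = series.sum()

try:
    for _ in series:
        pass
    iterated = True
except NotImplementedError:
    iterated = False

local_values = series.to_local()
```

In this sketch the aggregation works without gathering the data, the for-loop fails fast, and collecting everything requires the explicit `to_local()` call.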




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
