HyukjinKwon commented on a change in pull request #32835:
URL: https://github.com/apache/spark/pull/32835#discussion_r648800130
##########
File path: python/docs/source/user_guide/pandas_on_spark/typehints.rst
##########
@@ -1,36 +1,36 @@
-====================
-Type Hints In Koalas
-====================
+==================================
+Type Hints In pandas APIs on Spark
+==================================

 .. currentmodule:: pyspark.pandas

-Koalas, by default, infers the schema by taking some top records from the output,
-in particular, when you use APIs that allow users to apply a function against Koalas DataFrame
+Pandas APIs on Spark, by default, infers the schema by taking some top records from the output,
+in particular, when you use APIs that allow users to apply a function against pandas APIs on Spark DataFrame
 such as :func:`DataFrame.transform`, :func:`DataFrame.apply`, :func:`DataFrame.koalas.apply_batch`,
 :func:`DataFrame.koalas.apply_batch`, :func:`Series.koalas.apply_batch`, etc.

 However, this is potentially expensive. If there are several expensive operations such as a shuffle
-in the upstream of the execution plan, Koalas will end up with executing the Spark job twice, once
+in the upstream of the execution plan, pandas APIs on Spark will end up with executing the Spark job twice, once
 for schema inference, and once for processing actual data with the schema.

-To avoid the consequences, Koalas has its own type hinting style to specify the schema to avoid
-schema inference. Koalas understands the type hints specified in the return type and converts it
+To avoid the consequences, pandas APIs on Spark has its own type hinting style to specify the schema to avoid

Review comment:
```suggestion
To avoid the consequences, pandas APIs on Spark have its own type hinting style to specify the schema to avoid
```

##########
File path: python/docs/source/user_guide/pandas_on_spark/typehints.rst
##########
@@ -1,36 +1,36 @@
-====================
-Type Hints In Koalas
-====================
+==================================
+Type Hints In pandas APIs on Spark
+==================================

 .. currentmodule:: pyspark.pandas

-Koalas, by default, infers the schema by taking some top records from the output,
-in particular, when you use APIs that allow users to apply a function against Koalas DataFrame
+Pandas APIs on Spark, by default, infers the schema by taking some top records from the output,
+in particular, when you use APIs that allow users to apply a function against pandas APIs on Spark DataFrame
 such as :func:`DataFrame.transform`, :func:`DataFrame.apply`, :func:`DataFrame.koalas.apply_batch`,
 :func:`DataFrame.koalas.apply_batch`, :func:`Series.koalas.apply_batch`, etc.

 However, this is potentially expensive. If there are several expensive operations such as a shuffle
-in the upstream of the execution plan, Koalas will end up with executing the Spark job twice, once
+in the upstream of the execution plan, pandas APIs on Spark will end up with executing the Spark job twice, once
 for schema inference, and once for processing actual data with the schema.

-To avoid the consequences, Koalas has its own type hinting style to specify the schema to avoid
-schema inference. Koalas understands the type hints specified in the return type and converts it
+To avoid the consequences, pandas APIs on Spark has its own type hinting style to specify the schema to avoid
+schema inference. Pandas APIs on Spark understands the type hints specified in the return type and converts it

Review comment:
```suggestion
schema inference. Pandas APIs on Spark understand the type hints specified in the return type and converts it
```