[ 
https://issues.apache.org/jira/browse/SPARK-47891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinrong Meng updated SPARK-47891:
---------------------------------
    Description: 
Improve docstring of mapInPandas
 * "using a Python native function that takes and outputs a pandas DataFrame" 
is confusing because the function actually takes and outputs an "ITERATOR of 
pandas DataFrames".
 * "All columns are passed together as an iterator of pandas DataFrames" can 
easily mislead users into thinking the entire DataFrame is passed at once; "a 
batch of rows" is used instead.

  was:Improve docstring of mapInPandas


> Improve docstring of mapInPandas
> --------------------------------
>
>                 Key: SPARK-47891
>                 URL: https://issues.apache.org/jira/browse/SPARK-47891
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Documentation, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Xinrong Meng
>            Priority: Major
>              Labels: pull-request-available
>
> Improve docstring of mapInPandas
>  * "using a Python native function that takes and outputs a pandas DataFrame" 
> is confusing because the function actually takes and outputs an "ITERATOR of 
> pandas DataFrames".
>  * "All columns are passed together as an iterator of pandas DataFrames" can 
> easily mislead users into thinking the entire DataFrame is passed at once; 
> "a batch of rows" is used instead.
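
To illustrate the two points in the description, here is a minimal sketch of the kind of function mapInPandas expects. The function name keep_adults and the columns "id" and "age" are made up for illustration, and the two hand-built batches only stand in for whatever batches Spark would actually feed the function; no Spark session is started.

```python
from typing import Iterator

import pandas as pd


def keep_adults(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
    # The argument is an ITERATOR of pandas DataFrames, not a single DataFrame.
    # Each item is one batch of rows, never the entire Spark DataFrame.
    for pdf in batches:
        # Yield one output DataFrame per input batch (hypothetical "age" column).
        yield pdf[pdf["age"] >= 18]


# Local simulation: two small batches standing in for Spark-provided input.
batches = [
    pd.DataFrame({"id": [1, 2], "age": [21, 15]}),
    pd.DataFrame({"id": [3], "age": [30]}),
]
result = list(keep_adults(iter(batches)))
for out in result:
    print(out)
```

With a live SparkSession and a DataFrame df that has these columns, the same function would presumably be passed as df.mapInPandas(keep_adults, schema=df.schema).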



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

