Nicholas Chammas created SPARK-5865:
---------------------------------------

             Summary: Add doc warnings for methods that collect an RDD to the driver
                 Key: SPARK-5865
                 URL: https://issues.apache.org/jira/browse/SPARK-5865
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core, SQL
            Reporter: Nicholas Chammas
            Priority: Minor


We should include a note in the docstring of every method that collects an RDD 
to the driver, so that users have some hint of why their call might be running 
out of memory (OOM).

{{RDD.collect()}}
* [Scala|https://github.com/apache/spark/blob/d8adefefcc2a4af32295440ed1d4917a6968f017/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L803-L806]
* [Python|https://github.com/apache/spark/blob/d8adefefcc2a4af32295440ed1d4917a6968f017/python/pyspark/rdd.py#L680-L683]

{{DataFrame.toPandas()}}
* [Python|https://github.com/apache/spark/blob/c76da36c2163276b5c34e59fbb139eeb34ed0faa/python/pyspark/sql/dataframe.py#L637-L645]

{{Column.toPandas()}}
* [Python|https://github.com/apache/spark/blob/c76da36c2163276b5c34e59fbb139eeb34ed0faa/python/pyspark/sql/dataframe.py#L965-L973]
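
To illustrate the kind of note being proposed, here is a minimal sketch in Python. {{TinyRDD}} is a hypothetical stand-in class, not part of Spark, and the wording of the note is only a suggestion, not the text that is in (or will necessarily go into) the Spark docs.

```python
class TinyRDD:
    """Hypothetical stand-in for an RDD; not part of Spark."""

    def __init__(self, partitions):
        # Each inner list plays the role of one partition held on an executor.
        self._partitions = partitions

    def collect(self):
        """
        Return a list that contains all of the elements in this RDD.

        .. note:: This method should only be used if the resulting list is
            expected to be small, because all of the data is loaded into the
            driver's memory.
        """
        # Flatten every partition into one list, which is what makes
        # collecting expensive: everything ends up in a single process.
        return [item for part in self._partitions for item in part]


rdd = TinyRDD([[1, 2], [3], [4, 5]])
print(rdd.collect())
print(TinyRDD.collect.__doc__)
```

The same note, adapted to the return type, could be added to each of the methods listed above.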




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
