Repository: spark
Updated Branches:
  refs/heads/branch-1.4 5b468cf0c -> e33c0f0a4


[SPARK-8769] [TRIVIAL] [DOCS] toLocalIterator should mention it results in many jobs

Author: Holden Karau <hol...@pigscanfly.ca>

Closes #7171 from holdenk/SPARK-8769-toLocalIterator-documentation-improvement and squashes the following commits:

97ddd99 [Holden Karau] Add note

(cherry picked from commit 15d41cc501f5fa7ac82c4a6741e416bb557f610a)
Signed-off-by: Andrew Or <and...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e33c0f0a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e33c0f0a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e33c0f0a

Branch: refs/heads/branch-1.4
Commit: e33c0f0a497194d93b3c034502a9a49dc22c0cdf
Parents: 5b468cf
Author: Holden Karau <hol...@pigscanfly.ca>
Authored: Wed Jul 1 23:05:45 2015 -0700
Committer: Andrew Or <and...@databricks.com>
Committed: Wed Jul 1 23:05:57 2015 -0700

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/rdd/RDD.scala | 4 ++++
 1 file changed, 4 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/e33c0f0a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index 10610f4..cac6e3b 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -890,6 +890,10 @@ abstract class RDD[T: ClassTag](
    * Return an iterator that contains all of the elements in this RDD.
    *
   * The iterator will consume as much memory as the largest partition in this RDD.
+   *
+   * Note: this results in multiple Spark jobs, and if the input RDD is the result
+   * of a wide transformation (e.g. join with different partitioners), to avoid
+   * recomputing, the input RDD should be cached first.
    */
   def toLocalIterator: Iterator[T] = withScope {
     def collectPartition(p: Int): Array[T] = {


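As context for the doc note above, here is a minimal sketch of the recommended pattern. It is not part of the commit; the RDD names and data are illustrative, and it assumes an existing `SparkContext` named `sc`:

```scala
import org.apache.spark.SparkContext

// Illustrative example (hypothetical names): cache a wide-transformation
// result before iterating over it locally.
def iterateLocally(sc: SparkContext): Unit = {
  val left  = sc.parallelize(1 to 100).map(i => (i % 10, i))
  val right = sc.parallelize(1 to 100).map(i => (i % 5, i))

  // The join is a wide transformation; without cache(), each of the
  // per-partition jobs run by toLocalIterator would recompute it.
  val joined = left.join(right).cache()

  // toLocalIterator runs one Spark job per partition and fetches the
  // partitions one at a time, so the driver only ever holds the largest
  // partition in memory (as the doc comment notes).
  joined.toLocalIterator.foreach { case (k, (a, b)) =>
    println(s"$k -> ($a, $b)")
  }
}
```

This is exactly the scenario the added note warns about: each call `toLocalIterator` makes per partition re-triggers the lineage unless the joined RDD is cached.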