GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/1418
[SPARK-2490] Change recursive visiting on RDD dependencies to iterative approach

When many transformations are applied to an RDD over many iterations, its chain of dependencies can become very long. Recursively visiting these dependencies in Spark core can then easily cause a StackOverflowError. For example:

    var rdd = sc.makeRDD(Array(1))
    for (i <- 1 to 1000) {
      rdd = rdd.coalesce(1).cache()
      rdd.collect()
    }

This PR changes the recursive visiting of an RDD's dependencies to an iterative approach to avoid the StackOverflowError.

Besides the recursive visiting, the Java serializer has a known [bug](http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4152790) that also causes StackOverflowError when serializing or deserializing a large graph of objects, so applying this PR solves only part of the problem. Replacing the Java serializer with KryoSerializer might help. However, since KryoSerializer is not currently supported for `spark.closure.serializer`, I cannot test whether KryoSerializer completely solves the Java serializer's problem.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 remove_recursive_visit

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1418.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1418

----
commit 900538bbcb61683bf1418534c2466463a630569f
Author: Liang-Chi Hsieh <vii...@gmail.com>
Date:   2014-07-15T10:58:45Z

    change recursive visiting on rdd's dependencies to iterative approach to avoid stackoverflowerror.

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
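
For readers unfamiliar with the technique, here is a minimal sketch of moving a dependency traversal from recursion to an explicit stack. It is not the code in this pull request and does not use Spark: `Node`, `buildChain`, `countRecursive` and `countIterative` are hypothetical names invented for illustration, and `Node` is only a stand-in for an RDD with a list of parent dependencies.

```scala
import scala.collection.mutable

// Hypothetical stand-in for an RDD: a node that only knows its parents.
// A plain class (not a case class) keeps equality and hashCode reference-based,
// so putting a node into a set never walks the whole chain.
final class Node(val id: Int, val deps: List[Node])

object IterativeVisit {
  // Build a linear lineage of n + 1 nodes without recursion.
  def buildChain(n: Int): Node =
    (1 to n).foldLeft(new Node(0, Nil))((parent, i) => new Node(i, List(parent)))

  // Recursive traversal: uses one JVM stack frame per level of lineage,
  // so a sufficiently long chain can end in StackOverflowError.
  def countRecursive(node: Node, visited: mutable.HashSet[Node]): Int =
    if (!visited.add(node)) 0
    else 1 + node.deps.map(countRecursive(_, visited)).sum

  // Iterative traversal: nodes still to be visited are kept in a heap-allocated
  // list used as a stack, so lineage depth no longer grows the JVM call stack.
  def countIterative(root: Node): Int = {
    val visited = mutable.HashSet.empty[Node]
    var waiting: List[Node] = List(root)
    var count = 0
    while (waiting.nonEmpty) {
      val node = waiting.head
      waiting = waiting.tail
      if (visited.add(node)) {   // add returns true only on first visit
        count += 1
        waiting = node.deps ::: waiting
      }
    }
    count
  }

  def main(args: Array[String]): Unit = {
    val chain = buildChain(100000)
    println(countIterative(chain))
  }
}
```

Both functions visit the same set of nodes. The only difference is where the pending work is stored, which is why the iterative form is unaffected by the length of the dependency chain. Whether the recursive form actually overflows for a given chain depends on the JVM thread stack size.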