[ https://issues.apache.org/jira/browse/SPARK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298612#comment-14298612 ]

Sean Owen commented on SPARK-5499:
----------------------------------

Ah, that may be right. persist() should also break the lineage, but here you'd 
still be computing the whole lineage all at once from the start before anything 
can persist. Yes, how about checkpoint()?
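A minimal sketch of that suggestion, adapted from the reporter's snippet below: calling checkpoint() periodically inside the loop, followed by an action to force materialization, truncates the lineage before it grows deep enough to overflow the stack. The local master, checkpoint directory, and the interval of 100 are illustrative choices, not part of the original report:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CheckpointDemo {
  def main(args: Array[String]): Unit = {
    // Local context for illustration; the original snippet assumed a spark-shell `sc`.
    val sc = new SparkContext(
      new SparkConf().setAppName("checkpoint-demo").setMaster("local[1]"))
    // checkpoint() needs a directory to write truncated RDDs to.
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    var pair: RDD[(Long, Long)] = sc.parallelize(Array((1L, 2L)))
    for (i <- 1 to 1000) {
      pair = pair.map(_.swap)
      if (i % 100 == 0) {
        pair.checkpoint() // mark this RDD to be saved, cutting the lineage here
        pair.count()      // run an action so the checkpoint actually happens now
      }
    }
    println("Count = " + pair.count())
    sc.stop()
  }
}
```

Without the periodic count(), checkpoint() only takes effect at the first action, so the full 1000-deep lineage would still be serialized once before any truncation.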

> iterative computing with 1000 iterations causes stage failure
> -------------------------------------------------------------
>
>                 Key: SPARK-5499
>                 URL: https://issues.apache.org/jira/browse/SPARK-5499
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Tien-Dung LE
>
> I got an error "org.apache.spark.SparkException: Job aborted due to stage 
> failure: Task serialization failed: java.lang.StackOverflowError" when 
> executing an action with 1000 transformations.
> Here is a code snippet to reproduce the error:
> {code}
> import org.apache.spark.rdd.RDD
> var pair: RDD[(Long, Long)] = sc.parallelize(Array((1L, 2L)))
> var newPair: RDD[(Long, Long)] = null
> for (i <- 1 to 1000) {
>   newPair = pair.map(_.swap)
>   pair = newPair
> }
> println("Count = " + pair.count())
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
