Hi all! Here's part of a recursive Scala function that throws a stack overflow error after many recursive calls. I've tried many things but haven't managed to solve it.

    val eRDD: RDD[(Int, Int)] = ...
    val oldRDD: RDD[(Int, Int)] = ...
    val result = Algorithm(eRDD, oldRDD)

    def Algorithm(eRDD: RDD[(Int, Int)], oldRDD: RDD[(Int, Int)]): RDD[(Int, Int)] = {
      val newRDD = Transformation(eRDD, oldRDD) // only transformations
      if (Compare(oldRDD, newRDD)) // Compare uses the "take" action!
        Algorithm(eRDD, newRDD)
      else
        newRDD
    }

The above code is recursive and performs many iterations (until Compare returns false). After some iterations I get a stack overflow error, probably because the lineage chain has become too long. Is there any way to solve this problem (persist/unpersist, checkpoint, saveAsObjectFile)?

Note 1: Only the Compare function performs actions on the RDDs.

Note 2: I tried some combinations of persist/unpersist, but none of them worked. I tried checkpointing from spark.streaming: I put a checkpoint at every recursion but still got an overflow error. I also tried calling saveAsObjectFile on each iteration and then reading the result back from the file (sc.objectFile) during the next iteration. Unfortunately, I noticed that the folders created per iteration keep growing in size, while I expected them to be the same size on every iteration.

Please help!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Stack-overflow-error-caused-by-long-lineage-RDD-created-after-many-recursions-tp25240.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
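
For anyone reading this thread later: below is a rough sketch of the loop-plus-checkpoint shape that the question is asking about, not a confirmed fix for the code above. It assumes the checkpoint directory has already been set by the caller with sc.setCheckpointDir, and that the checkpoint used is RDD.checkpoint() from core Spark. The names IterativeSketch, iterate, transformation, compare and checkpointEvery are hypothetical; transformation and compare stand in for the post's Transformation and Compare.

```scala
import org.apache.spark.rdd.RDD

object IterativeSketch {
  type Pairs = RDD[(Int, Int)]

  // Loop version of the recursive Algorithm from the post.
  // `transformation` and `compare` are hypothetical stand-ins supplied by the caller.
  // Assumes sc.setCheckpointDir(...) was called before this function runs.
  def iterate(
      eRDD: Pairs,
      initial: Pairs,
      transformation: (Pairs, Pairs) => Pairs,
      compare: (Pairs, Pairs) => Boolean,
      checkpointEvery: Int = 10): Pairs = {
    var current = initial
    var iteration = 0
    var keepGoing = true
    while (keepGoing) {
      iteration += 1
      // Cache the new RDD so the checkpoint job and compare do not recompute it.
      val next = transformation(eRDD, current).persist()
      if (iteration % checkpointEvery == 0) {
        // checkpoint() has to be requested before the first action on `next`;
        // count() then forces materialization, after which the lineage is cut.
        next.checkpoint()
        next.count()
      }
      keepGoing = compare(current, next) // same meaning as in the post: true = iterate again
      current.unpersist()                // release the previous iteration's cached blocks
      current = next
    }
    current
  }
}
```

Checkpointing every few iterations rather than on every one is only a design guess to limit writes to the checkpoint directory; checkpointEvery = 1 would checkpoint on every pass.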