If we do cache() + count() after say every 50 iterations. The whole process becomes very slow. I have tried checkpoint() , cache() + count(), saveAsObjectFiles(). Nothing works. Materializing RDD's lead to drastic decrease in performance & if we don't materialize, we face stackoverflowerror.
On Wed, May 14, 2014 at 10:25 AM, Nick Chammas [via Apache Spark User List] <ml-node+s1001560n5683...@n3.nabble.com> wrote: > Would cache() + count() every N iterations work just as well as > checkPoint() + count() to get around this issue? > > We're basically trying to get Spark to avoid working on too lengthy a > lineage at once, right? > > Nick > > > On Tue, May 13, 2014 at 12:04 PM, Xiangrui Meng <[hidden > email]<http://user/SendEmail.jtp?type=node&node=5683&i=0> > > wrote: > >> After checkPoint, call count directly to materialize it. -Xiangrui >> >> On Tue, May 13, 2014 at 4:20 AM, Mayur Rustagi <[hidden >> email]<http://user/SendEmail.jtp?type=node&node=5683&i=1>> >> wrote: >> > We are running into same issue. After 700 or so files the stack >> overflows, >> > cache, persist & checkpointing dont help. >> > Basically checkpointing only saves the RDD when it is materialized & it >> only >> > materializes in the end, then it runs out of stack. >> > >> > Regards >> > Mayur >> > >> > Mayur Rustagi >> > Ph: <a href="tel:%2B1%20%28760%29%20203%203257" value="+17602033257">+1 >> (760) 203 3257 >> > http://www.sigmoidanalytics.com >> > @mayur_rustagi >> >> > >> > >> > >> > On Tue, May 13, 2014 at 11:40 AM, Xiangrui Meng <[hidden >> > email]<http://user/SendEmail.jtp?type=node&node=5683&i=2>> >> wrote: >> >> >> >> You have a long lineage that causes the StackOverflow error. Try >> >> rdd.checkPoint() and rdd.count() for every 20~30 iterations. >> >> checkPoint can cut the lineage. -Xiangrui >> >> >> >> On Mon, May 12, 2014 at 3:42 PM, Guanhua Yan <[hidden >> >> email]<http://user/SendEmail.jtp?type=node&node=5683&i=3>> >> wrote: >> >> > Dear Sparkers: >> >> > >> >> > I am using Python spark of version 0.9.0 to implement some iterative >> >> > algorithm. I got some errors shown at the end of this email. It seems >> >> > that >> >> > it's due to the Java Stack Overflow error. The same error has been >> >> > duplicated on a mac desktop and a linux workstation, both running the >> >> > same >> >> > version of Spark. >> >> > >> >> > The same line of code works correctly after quite some iterations. At >> >> > the >> >> > line of error, rdd__new.count() could be 0. (In some previous rounds, >> >> > this >> >> > was also 0 without any problem). >> >> > >> >> > Any thoughts on this? >> >> > >> >> > Thank you very much, >> >> > - Guanhua >> >> > >> >> > >> >> > ======================================== >> >> > CODE: print "round", round, rdd__new.count() >> >> > ======================================== >> >> > File >> >> > >> >> > >> "/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/rdd.py", >> >> > line 542, in count >> >> > 14/05/12 16:20:28 INFO TaskSetManager: Loss was due to >> >> > java.lang.StackOverflowError [duplicate 1] >> >> > return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum() >> >> > 14/05/12 16:20:28 ERROR TaskSetManager: Task 8419.0:0 failed 1 times; >> >> > aborting job >> >> > File >> >> > >> >> > >> "/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/rdd.py", >> >> > line 533, in sum >> >> > 14/05/12 16:20:28 INFO TaskSchedulerImpl: Ignoring update with state >> >> > FAILED >> >> > from TID 1774 because its task set is gone >> >> > return self.mapPartitions(lambda x: >> [sum(x)]).reduce(operator.add) >> >> > File >> >> > >> >> > >> "/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/rdd.py", >> >> > line 499, in reduce >> >> > vals = self.mapPartitions(func).collect() >> >> > File >> >> > >> >> > >> "/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/rdd.py", >> >> > line 463, in collect >> >> > bytesInJava = self._jrdd.collect().iterator() >> >> > File >> >> > >> >> > >> "/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py", >> >> > line 537, in __call__ >> >> > File >> >> > >> >> > >> "/home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", >> >> > line 300, in get_return_value >> >> > py4j.protocol.Py4JJavaError: An error occurred while calling >> >> > o4317.collect. >> >> > : org.apache.spark.SparkException: Job aborted: Task 8419.0:1 failed >> 1 >> >> > times >> >> > (most recent failure: Exception failure: >> java.lang.StackOverflowError) >> >> > at >> >> > >> >> > >> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028) >> >> > at >> >> > >> >> > >> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026) >> >> > at >> >> > >> >> > >> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) >> >> > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) >> >> > at >> >> > >> >> > org.apache.spark.scheduler.DAGScheduler.org >> $apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1026) >> >> > at >> >> > >> >> > >> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619) >> >> > at >> >> > >> >> > >> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619) >> >> > at scala.Option.foreach(Option.scala:236) >> >> > at >> >> > >> >> > >> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619) >> >> > at >> >> > >> >> > >> org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207) >> >> > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) >> >> > at akka.actor.ActorCell.invoke(ActorCell.scala:456) >> >> > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) >> >> > at akka.dispatch.Mailbox.run(Mailbox.scala:219) >> >> > at >> >> > >> >> > >> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) >> >> > at >> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) >> >> > at >> >> > >> >> > >> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) >> >> > at >> >> > >> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) >> >> > at >> >> > >> >> > >> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) >> >> > >> >> > ====================================== >> >> > The stack overflow error is shown as follows: >> >> > ====================================== >> >> > >> >> > 14/05/12 16:20:28 ERROR Executor: Exception in task ID 1774 >> >> > java.lang.StackOverflowError >> >> > at java.util.zip.Inflater.inflate(Inflater.java:259) >> >> > at >> java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152) >> >> > at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116) >> >> > at >> >> > >> >> > >> java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2310) >> >> > at >> >> > >> >> > >> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323) >> >> > at >> >> > >> >> > >> java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:2818) >> >> > at java.io.ObjectInputStream.readHandle(ObjectInputStream.java:1452) >> >> > at >> java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1511) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at >> >> > >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344) >> >> > at >> >> > >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at >> >> > >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at >> >> > >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) >> >> > at scala.collection.immutable.$colon$colon.readObject(List.scala:362) >> >> > at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) >> >> > at >> >> > >> >> > >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> >> > at java.lang.reflect.Method.invoke(Method.java:606) >> >> > at >> >> > >> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at >> >> > >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at >> >> > >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) >> >> > at scala.collection.immutable.$colon$colon.readObject(List.scala:362) >> >> > at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) >> >> > at >> >> > >> >> > >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> >> > at java.lang.reflect.Method.invoke(Method.java:606) >> >> > at >> >> > >> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at >> >> > >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at >> >> > >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) >> >> > at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> > at >> >> > >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) >> >> > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) >> >> > at scala.collection.immutable.$colon$colon.readObject(List.scala:362) >> >> > at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) >> >> > The above replicated many times after this … >> >> > ====================================== >> > >> > >> > > > > ------------------------------ > If you reply to this email, your message will be added to the discussion > below: > > http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-StackOverflowError-when-calling-count-tp5649p5683.html > To start a new topic under Apache Spark User List, email > ml-node+s1001560n...@n3.nabble.com > To unsubscribe from Apache Spark User List, click > here<http://apache-spark-user-list.1001560.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=1&code=bGFsaXRAc2lnbW9pZGFuYWx5dGljcy5jb218MXwtMTIwMzUwMjA2MQ==> > . > NAML<http://apache-spark-user-list.1001560.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> > -- -- Thanks & Regards, Lalit Yadav +91-9901007692 ----- Lalit Yadav la...@sigmoidanalytics.com -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-StackOverflowError-when-calling-count-tp5649p5698.html Sent from the Apache Spark User List mailing list archive at Nabble.com.