Could you try `println(result.toDebugString)` right after `val result = ...` and attach the result? -Xiangrui
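[Editor's note: a minimal sketch of what that debug call looks like in context, following the names used in the thread (`model`, a trained MLlib `MatrixFactorizationModel`, and `prdctpairs`, an RDD of (user, product) pairs); this is illustrative, not the poster's actual code.]

```scala
// Assumes `model` is a trained MatrixFactorizationModel and
// `prdctpairs` is an RDD[(Int, Int)] of (user, product) pairs,
// as in the code quoted below.
val result = model.predict(prdctpairs)

// toDebugString prints the RDD's lineage: the full chain of
// transformations Spark will replay to compute `result`. A very deep
// lineage here is the usual suspect for StackOverflowError during
// task (de)serialization.
println(result.toDebugString)
```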
On Fri, May 9, 2014 at 8:20 AM, phoenix bai <mingzhi...@gmail.com> wrote:
> After a couple of tests, I find that if I use:
>
> val result = model.predict(prdctpairs)
> result.map(x =>
> x.user+","+x.product+","+x.rating).saveAsTextFile(output)
>
> it always fails with the above error, and the exception trace looks repetitive.
>
> But if I do:
>
> val result = model.predict(prdctpairs)
> result.cache()
> result.map(x =>
> x.user+","+x.product+","+x.rating).saveAsTextFile(output)
>
> it succeeds.
>
> Could anyone help explain why the cache() is necessary?
>
> Thanks
>
>
> On Fri, May 9, 2014 at 6:45 PM, phoenix bai <mingzhi...@gmail.com> wrote:
>>
>> Hi all,
>>
>> My Spark code is running on yarn-standalone.
>>
>> The last three lines of the code are as below:
>>
>> val result = model.predict(prdctpairs)
>> result.map(x =>
>> x.user+","+x.product+","+x.rating).saveAsTextFile(output)
>> sc.stop()
>>
>> The same code is sometimes able to run successfully and give the
>> right result, while from time to time it throws a StackOverflowError and
>> fails.
>>
>> I don't have a clue how I should debug it.
>>
>> Below is the error (the start and end portions, to be exact):
>>
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxxxxxx43.mc10.site.net:43885
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] MapOutputTrackerMaster: Size of output statuses for shuffle 44 is 148 bytes
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxxxxxx43.mc10.site.net:43885
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35] MapOutputTrackerMaster: Size of output statuses for shuffle 45 is 453 bytes
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-20] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxxxxxx43.mc10.site.net:56767
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxxxxxx43.mc10.site.net:56767
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxxxxxx43.mc10.site.net:49879
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxxxxxx43.mc10.site.net:49879
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] TaskSetManager: Starting task 946.0:17 as TID 146 on executor 6: rxxxxx15.mc10.site.net (PROCESS_LOCAL)
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] TaskSetManager: Serialized task 946.0:17 as 6414 bytes in 0 ms
>> 14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Lost TID 133 (task 946.0:4)
>> 14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Loss was due to java.lang.StackOverflowError
>> java.lang.StackOverflowError
>> at java.lang.ClassLoader.defineClass1(Native Method)
>> at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
>> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
>> at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>> at java.lang.ClassLoader.defineClass1(Native Method)
>> at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
>>
>> ............................................
>>
>> at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5] TaskSetManager: Starting task 946.0:4 as TID 147 on executor 6: rxxxx15.mc10.site.net (PROCESS_LOCAL)
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5] TaskSetManager: Serialized task 946.0:4 as 6414 bytes in 0 ms
>> 14-05-09 17:55:51 WARN [Result resolver thread-1] TaskSetManager: Lost TID 139 (task 946.0:10)
>> 14-05-09 17:55:51 INFO [Result resolver thread-1] TaskSetManager: Loss was due to java.lang.StackOverflowError [duplicate 1]
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5] CoarseGrainedSchedulerBackend: Executor 4 disconnected, so removing it
>> 14-05-09 17:55:51 ERROR [spark-akka.actor.default-dispatcher-5] YarnClusterScheduler: Lost executor 4 on rxxxxx01.mc10.site.net: remote Akka client disassociated
>> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5] TaskSetManager: Re-queueing tasks for 4 from TaskSet 992.0
>>
>> Did anyone have a similar issue? Or could anyone provide a clue about where I should start looking?
>>
>> Thanks in advance!
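[Editor's note: the stack trace above shows a StackOverflowError inside `ObjectInputStream`, which is characteristic of a very deep RDD lineage being (de)serialized with each task. The standard way to cut a lineage short in the RDD API is checkpointing. Below is a hedged sketch, not the poster's code: `model`, `prdctpairs`, and `output` are the names from the thread, and the checkpoint directory path is hypothetical.]

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only. Iterative jobs (such as ALS training behind
// model.predict) can build very deep lineages; checkpoint() writes the
// RDD to stable storage and truncates its lineage so tasks no longer
// carry the whole dependency chain.
val sc = new SparkContext(new SparkConf().setAppName("als-predict"))
sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints") // hypothetical path

val result = model.predict(prdctpairs)
result.cache()       // keep it in memory so the checkpoint doesn't trigger a full recompute
result.checkpoint()  // truncate the lineage at this RDD
result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)
```

This does not contradict the cache() workaround reported above; caching materializes the result once, while checkpointing additionally severs the lineage, which is the more direct fix when the overflow occurs during task serialization.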