Re: Spark on yarn-standalone throws StackOverflowError and fails sometimes, and succeeds the rest of the time
Could you try `println(result.toDebugString)` right after `val result = ...` and attach the result?

-Xiangrui

On Fri, May 9, 2014 at 8:20 AM, phoenix bai mingzhi...@gmail.com wrote:

After a couple of tests, I find that if I use:

    val result = model.predict(prdctpairs)
    result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)

it always fails with the above error, and the exception seems iterative. But if I do:

    val result = model.predict(prdctpairs)
    result.cache()
    result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)

it succeeds. Could anyone help explain why the cache() is necessary?

Thanks

On Fri, May 9, 2014 at 6:45 PM, phoenix bai mingzhi...@gmail.com wrote:

Hi all,

My Spark code is running on yarn-standalone. The last three lines of the code are as below:

    val result = model.predict(prdctpairs)
    result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)
    sc.stop()

The same code sometimes runs successfully and gives the right result, while from time to time it throws StackOverflowError and fails, and I don't have a clue how I should debug it.
Below is the error (the start and end portion, to be exact):

14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxx43.mc10.site.net:43885
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] MapOutputTrackerMaster: Size of output statuses for shuffle 44 is 148 bytes
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxx43.mc10.site.net:43885
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35] MapOutputTrackerMaster: Size of output statuses for shuffle 45 is 453 bytes
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-20] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxx43.mc10.site.net:56767
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxx43.mc10.site.net:56767
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxx43.mc10.site.net:49879
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxx43.mc10.site.net:49879
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] TaskSetManager: Starting task 946.0:17 as TID 146 on executor 6: rx15.mc10.site.net (PROCESS_LOCAL)
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] TaskSetManager: Serialized task 946.0:17 as 6414 bytes in 0 ms
14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Lost TID 133 (task 946.0:4)
14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Loss was due to java.lang.StackOverflowError
java.lang.StackOverflowError
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
Spark on yarn-standalone throws StackOverflowError and fails sometimes, and succeeds the rest of the time
Hi all,

My Spark code is running on yarn-standalone. The last three lines of the code are as below:

    val result = model.predict(prdctpairs)
    result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)
    sc.stop()

The same code sometimes runs successfully and gives the right result, while from time to time it throws StackOverflowError and fails, and I don't have a clue how I should debug it.

Below is the error (the start and end portion, to be exact):

14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxx43.mc10.site.net:43885
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] MapOutputTrackerMaster: Size of output statuses for shuffle 44 is 148 bytes
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxx43.mc10.site.net:43885
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35] MapOutputTrackerMaster: Size of output statuses for shuffle 45 is 453 bytes
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-20] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxx43.mc10.site.net:56767
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxx43.mc10.site.net:56767
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 44 to sp...@rxx43.mc10.site.net:49879
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29] MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 45 to sp...@rxx43.mc10.site.net:49879
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] TaskSetManager: Starting task 946.0:17 as TID 146 on executor 6: rx15.mc10.site.net (PROCESS_LOCAL)
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17] TaskSetManager: Serialized task 946.0:17 as 6414 bytes in 0 ms
14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Lost TID 133 (task 946.0:4)
14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Loss was due to java.lang.StackOverflowError
java.lang.StackOverflowError
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5] TaskSetManager: Starting task 946.0:4 as TID 147 on executor 6: r15.mc10.site.net (PROCESS_LOCAL)
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5] TaskSetManager: Serialized task 946.0:4 as 6414 bytes in 0 ms
14-05-09 17:55:51 WARN [Result resolver thread-1] TaskSetManager: Lost TID 139 (task 946.0:10)
14-05-09 17:55:51 INFO [Result resolver thread-1] TaskSetManager: Loss was due to java.lang.StackOverflowError [duplicate 1]
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5] CoarseGrainedSchedulerBackend: Executor 4 disconnected, so removing it
14-05-09 17:55:51 ERROR
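
A note on the trace above: the repeating readObject0 / readOrdinaryObject / readSerialData / defaultReadFields frames are what Java deserialization looks like when it recurses through a deeply nested object graph. The following is a minimal sketch in plain Scala, not Spark code, that reproduces that pattern. `Link`, `DeepChainDemo`, the chain depth, and the stack sizes are all made up for illustration. Whether a long RDD lineage is what produced such a chain in this job is only an assumption; the thread does not confirm it.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for a long dependency chain: each link references
// its parent, so Java serialization recurses once per link.
class Link(val id: Int, val parent: Link) extends Serializable

object DeepChainDemo {
  // Build a chain of `depth` links iteratively (no recursion here).
  def buildChain(depth: Int): Link = {
    var head: Link = null
    for (i <- 0 until depth) head = new Link(i, head)
    head
  }

  // Run `body` on a new thread with the requested stack size and return its
  // result, or the StackOverflowError it hit. The JVM treats the stack size
  // as a hint, so results can differ on platforms that ignore it.
  def onThread[A](stackBytes: Long)(body: => A): Either[StackOverflowError, A] = {
    var outcome: Either[StackOverflowError, A] = null
    val worker = new Thread(null, new Runnable {
      def run(): Unit =
        outcome = try Right(body) catch { case e: StackOverflowError => Left(e) }
    }, "deep-chain-demo", stackBytes)
    worker.start()
    worker.join()
    outcome
  }

  def serialize(obj: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    out.writeObject(obj)
    out.close()
    buffer.toByteArray
  }

  def deserialize(bytes: Array[Byte]): AnyRef =
    new ObjectInputStream(new ByteArrayInputStream(bytes)).readObject()

  def main(args: Array[String]): Unit = {
    val bigStack = 256L * 1024 * 1024 // generous stack (illustrative value)
    val smallStack = 256L * 1024      // tight stack (illustrative value)

    // Serialize on the big stack so that only deserialization is under test.
    val bytes = onThread(bigStack)(serialize(buildChain(50000))) match {
      case Right(b) => b
      case Left(e)  => throw e
    }

    println("small stack overflowed: " + onThread(smallStack)(deserialize(bytes)).isLeft)
    println("big stack succeeded: " + onThread(bigStack)(deserialize(bytes)).isRight)
  }
}
```

If the chain in a real job does come from RDD lineage, the usual things to try would be shortening the lineage (for example with checkpointing) or giving threads a larger stack through the JVM's `-Xss` option. Neither was tested in this thread.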