Re: java serialization errors with spark.files.userClassPathFirst=true
After removing all parameters of type org.apache.hadoop.fs.Path from my code, I tried again. With spark.files.userClassPathFirst=true I now get a different but related error, and I don't even use FileInputFormat directly anymore; HadoopRDD does...

14/05/16 12:17:17 ERROR Executor: Exception in task ID 45
java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/FileInputFormat
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at org.apache.spark.executor.ChildExecutorURLClassLoader$userClassLoader$.findClass(ExecutorURLClassLoader.scala:42)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.spark.executor.ChildExecutorURLClassLoader.findClass(ExecutorURLClassLoader.scala:51)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:270)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:57)
        at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1610)
        at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515)
        at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1481)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1331)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1891)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)

On Thu, May 15, 2014 at 3:03 PM, Koert Kuipers ko...@tresata.com wrote:

When I set spark.files.userClassPathFirst=true, I get Java serialization errors in my tasks; see below. When I set userClassPathFirst back to its default of false, the serialization errors are gone. My spark.serializer is KryoSerializer. The class org.apache.hadoop.fs.Path is in the Spark assembly jar, but not in my task jars (the ones I added to the SparkConf). So it looks like the ClosureSerializer has trouble with this class once the ChildExecutorURLClassLoader is used? That's just me guessing.
Exception in thread main org.apache.spark.SparkException: Job aborted due to stage failure: Task 1.0:5 failed 4 times, most recent failure: Exception failure in TID 31 on host node05.tresata.com:
java.lang.NoClassDefFoundError: org/apache/hadoop/fs/Path
        java.lang.Class.getDeclaredConstructors0(Native Method)
        java.lang.Class.privateGetDeclaredConstructors(Class.java:2398)
        java.lang.Class.getDeclaredConstructors(Class.java:1838)
        java.io.ObjectStreamClass.computeDefaultSUID(ObjectStreamClass.java:1697)
        java.io.ObjectStreamClass.access$100(ObjectStreamClass.java:50)
        java.io.ObjectStreamClass$1.run(ObjectStreamClass.java:203)
        java.security.AccessController.doPrivileged(Native Method)
        java.io.ObjectStreamClass.getSerialVersionUID(ObjectStreamClass.java:200)
        java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:556)
        java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1580)
        java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1493)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1729)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
        scala.collection.immutable.$colon$colon.readObject(List.scala:362)
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        java.lang.reflect.Method.invoke(Method.java:597)
        java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1852)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:60)
        org.apache.spark.scheduler.ShuffleMapTask$.deserializeInfo(ShuffleMapTask.scala:66)
        org.apache.spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:139)
        java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1795)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1754)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java
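Since the question hinges on which jar actually provides org.apache.hadoop.fs.Path on the executor, a quick diagnostic like the following can confirm where a class resolves from and which classloader loaded it. This is a plain-Java sketch, nothing Spark-specific; the class name WhichJar is made up for illustration.

```java
// Diagnostic: print the code source (jar or directory) a class was loaded
// from, plus the classloader that defined it. Running this on a machine with
// the Spark assembly on the classpath shows whether e.g.
// org.apache.hadoop.fs.Path comes from the assembly or from a user jar.
public class WhichJar {
    public static String codeSourceOf(Class<?> cls) {
        java.security.CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Bootstrap classes (java.lang.*, etc.) typically have no code source.
        return src == null ? "<bootstrap or unknown>" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Pass e.g. org.apache.hadoop.fs.Path as an argument when the Spark
        // assembly is on the classpath; defaults to a class that always exists.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> cls = Class.forName(name);
        System.out.println(cls.getName() + " <- " + codeSourceOf(cls)
                + " (loader: " + cls.getClassLoader() + ")");
    }
}
```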
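For what it's worth, the delegation order userClassPathFirst is meant to change can be sketched in a few lines. This is only an illustration of the "child-first, parent as fallback" pattern, not Spark's actual ChildExecutorURLClassLoader; the class name UserFirstClassLoader is hypothetical. It shows where a missing fallback would turn an assembly-only class into a NoClassDefFoundError like the ones above.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Minimal child-first classloader sketch: consult the user's jars before
// delegating to the parent (e.g. the Spark assembly) classloader.
public class UserFirstClassLoader extends URLClassLoader {
    public UserFirstClassLoader(URL[] userJars, ClassLoader parent) {
        super(userJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // Child-first: look in the user jars before the parent.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Fall back to the parent. If this fallback is skipped (or
                    // broken) while resolving references of a user-jar class,
                    // a class that lives only in the assembly surfaces as a
                    // NoClassDefFoundError during deserialization.
                    c = super.loadClass(name, resolve);
                }
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }
}
```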