[ https://issues.apache.org/jira/browse/SPARK-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027808#comment-14027808 ]
Mridul Muralidharan commented on SPARK-2018:
--------------------------------------------

Ah! This is an interesting bug. Spark uses Java serialization by default, so endianness should not be an issue, and yet you are hitting it (I am assuming you have not customized serialization). Is it possible for you to dump the data written and read at both ends, along with the env vars and JVM details? Spark does nothing fancy for default serialization, so a simple test without Spark in the picture could also be tried: write an object to a file on the master node, read it back from the file on the slave node, and see if it works.

> Big-Endian (IBM Power7) Spark Serialization issue
> --------------------------------------------------
>
>                  Key: SPARK-2018
>                  URL: https://issues.apache.org/jira/browse/SPARK-2018
>              Project: Spark
>           Issue Type: Bug
>     Affects Versions: 1.0.0
>          Environment: hardware: IBM Power7
> OS: Linux version 2.6.32-358.el6.ppc64 (mockbu...@ppc-017.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)) #1 SMP Tue Jan 29 11:43:27 EST 2013
> JDK: Java(TM) SE Runtime Environment (build pxp6470sr5-20130619_01(SR5)), IBM J9 VM (build 2.6, JRE 1.7.0 Linux ppc64-64 Compressed References 20130617_152572 (JIT enabled, AOT enabled))
> Hadoop: Hadoop-0.2.3-CDH5.0
> Spark: Spark-1.0.0 or Spark-0.9.1
> spark-env.sh:
> export JAVA_HOME=/opt/ibm/java-ppc64-70/
> export SPARK_MASTER_IP=9.114.34.69
> export SPARK_WORKER_MEMORY=10000m
> export SPARK_CLASSPATH=/home/test1/spark-1.0.0-bin-hadoop2/lib
> export STANDALONE_SPARK_MASTER_HOST=9.114.34.69
> #export SPARK_JAVA_OPTS=' -Xdebug -Xrunjdwp:transport=dt_socket,address=99999,server=y,suspend=n '
>             Reporter: Yanjie Gao
>
> We have an application running on Spark on a Power7 system, but we have hit a serious serialization issue.
> The example HdfsWordCount reproduces the problem:
> ./bin/run-example org.apache.spark.examples.streaming.HdfsWordCount localdir
> We used Power7 (a Big-Endian arch) and Redhat 6.4.
> Big-Endian is the main cause, since the example ran successfully in another Power-based Little-Endian setup.
> Here is the exception stack and log:
>
> Spark Executor Command: "/opt/ibm/java-ppc64-70//bin/java" "-cp" "/home/test1/spark-1.0.0-bin-hadoop2/lib::/home/test1/src/spark-1.0.0-bin-hadoop2/conf:/home/test1/src/spark-1.0.0-bin-hadoop2/lib/spark-assembly-1.0.0-hadoop2.2.0.jar:/home/test1/src/spark-1.0.0-bin-hadoop2/lib/datanucleus-rdbms-3.2.1.jar:/home/test1/src/spark-1.0.0-bin-hadoop2/lib/datanucleus-api-jdo-3.2.1.jar:/home/test1/src/spark-1.0.0-bin-hadoop2/lib/datanucleus-core-3.2.2.jar:/home/test1/src/hadoop-2.3.0-cdh5.0.0/etc/hadoop/:/home/test1/src/hadoop-2.3.0-cdh5.0.0/etc/hadoop/" "-XX:MaxPermSize=128m" "-Xdebug" "-Xrunjdwp:transport=dt_socket,address=99999,server=y,suspend=n" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://spark@9.186.105.141:60253/user/CoarseGrainedScheduler" "2" "p7hvs7br16" "4" "akka.tcp://sparkWorker@p7hvs7br16:59240/user/Worker" "app-20140604023054-0000"
> ========================================
> 14/06/04 02:31:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 14/06/04 02:31:21 INFO spark.SecurityManager: Changing view acls to: test1,yifeng
> 14/06/04 02:31:21 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(test1, yifeng)
> 14/06/04 02:31:22 INFO slf4j.Slf4jLogger: Slf4jLogger started
> 14/06/04 02:31:22 INFO Remoting: Starting remoting
> 14/06/04 02:31:22 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@p7hvs7br16:39658]
> 14/06/04 02:31:22 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkExecutor@p7hvs7br16:39658]
> 14/06/04 02:31:22 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://spark@9.186.105.141:60253/user/CoarseGrainedScheduler
> 14/06/04 02:31:22 INFO worker.WorkerWatcher: Connecting to worker akka.tcp://sparkWorker@p7hvs7br16:59240/user/Worker
> 14/06/04 02:31:23 INFO worker.WorkerWatcher: Successfully connected to akka.tcp://sparkWorker@p7hvs7br16:59240/user/Worker
> 14/06/04 02:31:24 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
> 14/06/04 02:31:24 INFO spark.SecurityManager: Changing view acls to: test1,yifeng
> 14/06/04 02:31:24 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(test1, yifeng)
> 14/06/04 02:31:24 INFO slf4j.Slf4jLogger: Slf4jLogger started
> 14/06/04 02:31:24 INFO Remoting: Starting remoting
> 14/06/04 02:31:24 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@p7hvs7br16:58990]
> 14/06/04 02:31:24 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@p7hvs7br16:58990]
> 14/06/04 02:31:24 INFO spark.SparkEnv: Connecting to MapOutputTracker: akka.tcp://spark@9.186.105.141:60253/user/MapOutputTracker
> 14/06/04 02:31:25 INFO spark.SparkEnv: Connecting to BlockManagerMaster: akka.tcp://spark@9.186.105.141:60253/user/BlockManagerMaster
> 14/06/04 02:31:25 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20140604023125-3f61
> 14/06/04 02:31:25 INFO storage.MemoryStore: MemoryStore started with capacity 307.2 MB.
> 14/06/04 02:31:25 INFO network.ConnectionManager: Bound socket to port 39041 with id = ConnectionManagerId(p7hvs7br16,39041)
> 14/06/04 02:31:25 INFO storage.BlockManagerMaster: Trying to register BlockManager
> 14/06/04 02:31:25 INFO storage.BlockManagerMaster: Registered BlockManager
> 14/06/04 02:31:25 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-7bce4e43-2833-4666-93af-bd97c327497b
> 14/06/04 02:31:25 INFO spark.HttpServer: Starting HTTP Server
> 14/06/04 02:31:25 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 14/06/04 02:31:26 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:39958
> 14/06/04 02:31:26 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 2
> 14/06/04 02:31:26 INFO executor.Executor: Running task ID 2
> 14/06/04 02:31:26 ERROR executor.Executor: Exception in task ID 2
> java.io.InvalidClassException: scala.reflect.ClassTag$$anon$1; local class incompatible: stream classdesc serialVersionUID = -8102093212602380348, local class serialVersionUID = -4937928798201944954
>     at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:678)
>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1678)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1573)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1827)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:409)
>     at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:76)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:607)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1078)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1949)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:409)
>     at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:76)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:607)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1078)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1949)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:409)
>     at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:76)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:607)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1078)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1949)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:409)
>     at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:76)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:607)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1078)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1949)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:409)
>     at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:76)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:607)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1078)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1949)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2047)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1971)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1854)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:409)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
>     at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61)
>     at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141)
>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1893)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1852)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1406)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:409)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:781)
> 14/06/04 02:31:26 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@p7hvs7br16:39658] -> [akka.tcp://spark@9.186.105.141:60253] disassociated! Shutting down.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
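The standalone check suggested in the comment (plain Java serialization to a file, no Spark involved) could be sketched as below. This is only an illustration, not code from the report: the `SerDeCheck` and `Probe` names are invented here. It also prints the serialVersionUID the local JVM computes for a class that declares none, since the InvalidClassException in the log is exactly a mismatch of such computed UIDs between two JVMs; note that the Java serialization stream format itself is defined as big-endian, so raw byte order alone should not differ between the two machines.

```java
import java.io.*;
import java.util.Arrays;
import java.util.List;

public class SerDeCheck {

    // A serializable class with no explicit serialVersionUID: each JVM
    // derives one from the class structure, and different JVMs/compilers
    // are not guaranteed to derive the same value.
    static class Probe implements Serializable {
        int n;
        List<String> tags;
    }

    // Write an object graph with default Java serialization (run on the master).
    static void write(File f, Object payload) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(payload);
        }
    }

    // Read it back (copy the file to the slave and run there).
    static Object read(File f) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Compare this value between the two JVMs: if they disagree, that
        // alone reproduces the InvalidClassException from the executor log.
        ObjectStreamClass desc = ObjectStreamClass.lookup(Probe.class);
        System.out.println(desc.getName() + " serialVersionUID = " + desc.getSerialVersionUID());

        File f = new File(args.length > 0 ? args[0] : "serde-test.bin");
        if (f.exists()) {
            System.out.println("read back: " + read(f));
        } else {
            write(f, Arrays.asList("written", "on", "master"));
            System.out.println("wrote " + f);
        }
    }
}
```

Run it once on the master (which writes serde-test.bin), copy the file to the slave, and run it again there: a failure or a corrupted payload on the read side would isolate the problem to the JVM pair rather than to Spark.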