Hi Akhil,

Thank you for your help. I found that the original problem was on my side: I was running the application from IntelliJ and had not reloaded the project after recompiling the jar with Maven. Without the reload, IntelliJ keeps running against a stale cached build, so two different Spark versions ended up in play at once. After reloading the project and rerunning, everything works fine on v1.1.1 and I no longer see the class-incompatibility errors.
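In case anyone else hits the same thing, this is the small sanity check I've now put at the top of my driver so a stale jar shows up immediately. It's only a sketch (the object name is mine, and BlockManagerId is an arbitrary pick, chosen because it's the class from the earlier exception); sc.version reports whatever Spark build is actually on the classpath, and the reflection call shows which jar it came from:

import org.apache.spark.{SparkConf, SparkContext}

object SparkVersionCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("version-check").setMaster("local[*]"))
    // Version string compiled into the Spark jar the IDE actually loaded.
    println("Spark version on classpath: " + sc.version)
    // Physical jar BlockManagerId was loaded from; a stale path here means
    // the project is still running against an old cached artifact.
    val source = Class.forName("org.apache.spark.storage.BlockManagerId")
      .getProtectionDomain.getCodeSource
    println("BlockManagerId loaded from: " +
      (if (source != null) source.getLocation else "<bootstrap classloader>"))
    sc.stop()
  }
}

If the printed version or jar path isn't what Maven just built, the IDE is still on the cached copy.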
However, I now encounter a new issue on v1.2.0 and above:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/03/19 01:10:17 INFO CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/03/19 01:10:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/19 01:10:17 INFO SecurityManager: Changing view acls to: hduser,eason.hu
15/03/19 01:10:17 INFO SecurityManager: Changing modify acls to: hduser,eason.hu
15/03/19 01:10:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hduser, eason.hu); users with modify permissions: Set(hduser, eason.hu)
15/03/19 01:10:18 INFO Slf4jLogger: Slf4jLogger started
15/03/19 01:10:18 INFO Remoting: Starting remoting
15/03/19 01:10:18 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@hduser-07:59122]
15/03/19 01:10:18 INFO Utils: Successfully started service 'driverPropsFetcher' on port 59122.
15/03/19 01:10:21 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@192.168.1.53:65001] has failed, address is now gated for [5000] ms. Reason is: [Association failed with [akka.tcp://sparkDriver@192.168.1.53:65001]].
15/03/19 01:10:48 ERROR UserGroupInformation: PriviledgedActionException as:eason.hu (auth:SIMPLE) cause:java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1421)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:128)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:224)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.security.PrivilegedActionException: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:144)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
        ... 7 more

Do you have any clues as to why this happens only on v1.2.0 and above? Nothing else changed.
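For what it's worth, the gated association to akka.tcp://sparkDriver@192.168.1.53:65001 makes me suspect the executors cannot connect back to the driver JVM running inside IntelliJ, and the doAs timeout is just the fallout. Below is a minimal sketch of the driver conf I plan to try next, assuming that is the cause; the master URL is a placeholder for my cluster, and 192.168.1.53 is my workstation's address:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("remote-pi")
  .setMaster("spark://<master-host>:7077")  // placeholder for the real master URL
  // Advertise an address the worker nodes can route back to; otherwise the
  // driver may bind to a hostname the cluster cannot resolve.
  .set("spark.driver.host", "192.168.1.53")
  // Pin the driver port so it can be opened in the firewall explicitly
  // (by default Spark picks a random port on every run).
  .set("spark.driver.port", "65001")
val sc = new SparkContext(conf)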
Thanks,
Eason

On Tue, Mar 17, 2015 at 8:39 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> It's clearly saying:
>
> java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId;
> local class incompatible: stream classdesc serialVersionUID =
> 2439208141545036836, local class serialVersionUID = -7366074099953117729
>
> That is a version incompatibility; can you double-check your version?
>
> On 18 Mar 2015 06:08, "Eason Hu" <eas...@gmail.com> wrote:
>
>> Hi Akhil,
>>
>> sc.parallelize(1 to 10000).collect() in the Spark shell on Spark v1.2.0
>> runs fine. However, if I run the following remotely, it throws an
>> exception:
>>
>> val sc: SparkContext = new SparkContext(conf)
>>
>> val NUM_SAMPLES = 10
>> val count = sc.parallelize(1 to NUM_SAMPLES).map { i =>
>>   val x = Math.random()
>>   val y = Math.random()
>>   if (x * x + y * y < 1) 1 else 0
>> }.reduce(_ + _)
>> println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)
>>
>> Exception:
>> 15/03/17 17:33:52 ERROR scheduler.TaskSchedulerImpl: Lost executor 1 on
>> hcompute32228.sjc9.service-now.com: remote Akka client disassociated
>> 15/03/17 17:33:52 INFO scheduler.TaskSetManager: Re-queueing tasks for 1
>> from TaskSet 0.0
>> 15/03/17 17:33:52 WARN scheduler.TaskSetManager: Lost task 1.1 in stage
>> 0.0 (TID 3, hcompute32228): ExecutorLostFailure (executor lost)
>> 15/03/17 17:33:52 INFO scheduler.DAGScheduler: Executor lost: 1 (epoch 3)
>> 15/03/17 17:33:52 INFO storage.BlockManagerMasterActor: Trying to remove
>> executor 1 from BlockManagerMaster.
>> 15/03/17 17:33:52 INFO storage.BlockManagerMaster: Removed 1 successfully
>> in removeExecutor
>> 15/03/17 17:34:39 ERROR Remoting:
>> org.apache.spark.storage.BlockManagerId; local class incompatible: stream
>> classdesc serialVersionUID = 2439208141545036836, local class
>> serialVersionUID = -7366074099953117729
>> java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId;
>> local class incompatible: stream classdesc serialVersionUID =
>> 2439208141545036836, local class serialVersionUID = -7366074099953117729
>>         at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604)
>>         at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620)
>>         at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515)
>>
>> v1.1.0 is totally fine, but v1.1.1 and v1.2.0+ are not. Are there any
>> special instructions for setting up a Spark cluster on the later
>> versions? Do you know if there is anything I'm missing?
>>
>> Thank you for your help,
>> Eason
>>
>> On Mon, Mar 16, 2015 at 11:51 PM, Akhil Das <ak...@sigmoidanalytics.com>
>> wrote:
>>
>>> Could you tell me what all you did to change the version of Spark?
>>>
>>> Can you fire up a spark-shell, run this line, and see what happens:
>>>
>>> sc.parallelize(1 to 10000).collect()
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Mon, Mar 16, 2015 at 11:13 PM, Eason Hu <eas...@gmail.com> wrote:
>>>
>>>> Hi Akhil,
>>>>
>>>> Yes, I did change both versions, on the project and on the cluster.
>>>> Any clues?
>>>>
>>>> Even the sample code from the Spark website failed to work.
>>>>
>>>> Thanks,
>>>> Eason
>>>>
>>>> On Sun, Mar 15, 2015 at 11:56 PM, Akhil Das
>>>> <ak...@sigmoidanalytics.com> wrote:
>>>>
>>>>> Did you change both the versions? The one in the build file of your
>>>>> project and the Spark version of your cluster?
>>>>>
>>>>> Thanks
>>>>> Best Regards
>>>>>
>>>>> On Sat, Mar 14, 2015 at 6:47 AM, EH <eas...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've been using Spark 1.1.0 for a while, and now would like to
>>>>>> upgrade to Spark 1.1.1 or above. However, it throws the following
>>>>>> errors:
>>>>>>
>>>>>> 18:05:31.522 [sparkDriver-akka.actor.default-dispatcher-3hread] ERROR
>>>>>> TaskSchedulerImpl - Lost executor 37 on hcompute001: remote Akka
>>>>>> client disassociated
>>>>>> 18:05:31.530 [sparkDriver-akka.actor.default-dispatcher-3hread] WARN
>>>>>> TaskSetManager - Lost task 0.0 in stage 1.0 (TID 0, hcompute001):
>>>>>> ExecutorLostFailure (executor lost)
>>>>>> 18:05:31.567 [sparkDriver-akka.actor.default-dispatcher-2hread] ERROR
>>>>>> TaskSchedulerImpl - Lost executor 3 on hcompute001: remote Akka
>>>>>> client disassociated
>>>>>> 18:05:31.568 [sparkDriver-akka.actor.default-dispatcher-2hread] WARN
>>>>>> TaskSetManager - Lost task 1.0 in stage 1.0 (TID 1, hcompute001):
>>>>>> ExecutorLostFailure (executor lost)
>>>>>> 18:05:31.988 [sparkDriver-akka.actor.default-dispatcher-23hread] ERROR
>>>>>> TaskSchedulerImpl - Lost executor 24 on hcompute001: remote Akka
>>>>>> client disassociated
>>>>>>
>>>>>> Do you know what may have gone wrong? I didn't change any code, just
>>>>>> changed the version of Spark.
>>>>>>
>>>>>> Thank you all,
>>>>>> Eason
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Upgrade-from-Spark-1-1-0-to-1-1-1-Issues-tp22045.html
>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.