Are you submitting your application from a local machine to a remote host? If you want to run the Spark application from a remote machine, you have to at least set the following configurations properly:
- *spark.driver.host* - points to the IP/host from which you are submitting
the job (make sure the cluster is able to ping this address)
- *spark.driver.port* - set it to a port number that is accessible from the
Spark cluster.

You can look at more configuration options over here:
<http://spark.apache.org/docs/latest/configuration.html#networking>

Thanks
Best Regards

On Fri, Mar 20, 2015 at 4:02 AM, Eason Hu <eas...@gmail.com> wrote:

> Hi Akhil,
>
> Thank you for your help. I just found that the problem was related to my
> local Spark application: I ran it in IntelliJ and didn't reload the
> project after recompiling the jar via Maven. Without the reload, IntelliJ
> used locally cached build output to run the application, which led to two
> different versions being in play. After I reloaded the project and reran,
> it ran fine on v1.1.1 and I no longer saw the class-incompatibility
> issues.
>
> However, I now encounter a new issue starting from v1.2.0 and above:
>
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 15/03/19 01:10:17 INFO CoarseGrainedExecutorBackend: Registered signal
> handlers for [TERM, HUP, INT]
> 15/03/19 01:10:17 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform...
> using builtin-java classes where applicable
> 15/03/19 01:10:17 INFO SecurityManager: Changing view acls to:
> hduser,eason.hu
> 15/03/19 01:10:17 INFO SecurityManager: Changing modify acls to:
> hduser,eason.hu
> 15/03/19 01:10:17 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users with view permissions: Set(hduser,
> eason.hu); users with modify permissions: Set(hduser, eason.hu)
> 15/03/19 01:10:18 INFO Slf4jLogger: Slf4jLogger started
> 15/03/19 01:10:18 INFO Remoting: Starting remoting
> 15/03/19 01:10:18 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://driverPropsFetcher@hduser-07:59122]
> 15/03/19 01:10:18 INFO Utils: Successfully started service
> 'driverPropsFetcher' on port 59122.
> 15/03/19 01:10:21 WARN ReliableDeliverySupervisor: Association with remote
> system [akka.tcp://sparkDriver@192.168.1.53:65001] has failed, address is
> now gated for [5000] ms. Reason is: [Association failed with
> [akka.tcp://sparkDriver@192.168.1.53:65001]].
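The gated-association warning above, where the executor cannot reach akka.tcp://sparkDriver@192.168.1.53:65001, is the symptom the two driver settings address. A minimal sketch of setting them before creating the context; the app name and master URL are placeholders, while the host and port mirror the driver address in the log line above:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// App name and master URL are placeholders; the host/port mirror the
// sparkDriver address the executor failed to reach in the log above.
val conf = new SparkConf()
  .setAppName("remote-submit-test")
  .setMaster("spark://your-master:7077")
  // IP/host of the submitting machine, reachable (pingable) from the cluster
  .set("spark.driver.host", "192.168.1.53")
  // a fixed port that the cluster nodes can connect back to
  .set("spark.driver.port", "65001")

val sc = new SparkContext(conf)
```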
> 15/03/19 01:10:48 ERROR UserGroupInformation: PriviledgedActionException
> as:eason.hu (auth:SIMPLE) cause:java.util.concurrent.TimeoutException:
> Futures timed out after [30 seconds]
> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException:
> Unknown exception in doAs
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1421)
>   at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
>   at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:128)
>   at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:224)
>   at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> Caused by: java.security.PrivilegedActionException:
> java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>   ... 4 more
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
> [30 seconds]
>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>   at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>   at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>   at scala.concurrent.Await$.result(package.scala:107)
>   at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:144)
>   at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
>   at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
>   ... 7 more
>
> Do you have any clues why it happens only after v1.2.0 and above?
> Nothing else changes.
>
> Thanks,
> Eason
>
> On Tue, Mar 17, 2015 at 8:39 PM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> It's clearly saying:
>>
>> java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId;
>> local class incompatible: stream classdesc serialVersionUID =
>> 2439208141545036836, local class serialVersionUID = -7366074099953117729
>>
>> That is a version incompatibility; can you double-check your versions?
>>
>> On 18 Mar 2015 06:08, "Eason Hu" <eas...@gmail.com> wrote:
>>
>>> Hi Akhil,
>>>
>>> sc.parallelize(1 to 10000).collect() in the Spark shell on Spark v1.2.0
>>> runs fine. However, if I run the following remotely, it throws an
>>> exception:
>>>
>>> val sc: SparkContext = new SparkContext(conf)
>>>
>>> val NUM_SAMPLES = 10
>>> val count = sc.parallelize(1 to NUM_SAMPLES).map { i =>
>>>   val x = Math.random()
>>>   val y = Math.random()
>>>   if (x * x + y * y < 1) 1 else 0
>>> }.reduce(_ + _)
>>> println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)
>>>
>>> Exception:
>>> 15/03/17 17:33:52 ERROR scheduler.TaskSchedulerImpl: Lost executor 1 on
>>> hcompute32228.sjc9.service-now.com: remote Akka client disassociated
>>> 15/03/17 17:33:52 INFO scheduler.TaskSetManager: Re-queueing tasks for 1
>>> from TaskSet 0.0
>>> 15/03/17 17:33:52 WARN scheduler.TaskSetManager: Lost task 1.1 in stage
>>> 0.0 (TID 3, hcompute32228): ExecutorLostFailure (executor lost)
>>> 15/03/17 17:33:52 INFO scheduler.DAGScheduler: Executor lost: 1 (epoch 3)
>>> 15/03/17 17:33:52 INFO storage.BlockManagerMasterActor: Trying to remove
>>> executor 1 from BlockManagerMaster.
>>> 15/03/17 17:33:52 INFO storage.BlockManagerMaster: Removed 1
>>> successfully in removeExecutor
>>> 15/03/17 17:34:39 ERROR Remoting:
>>> org.apache.spark.storage.BlockManagerId; local class incompatible:
>>> stream classdesc serialVersionUID = 2439208141545036836, local class
>>> serialVersionUID = -7366074099953117729
>>> java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId;
>>> local class incompatible: stream classdesc serialVersionUID =
>>> 2439208141545036836, local class serialVersionUID = -7366074099953117729
>>>   at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604)
>>>   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620)
>>>   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515)
>>>
>>> v1.1.0 is totally fine, but v1.1.1 and v1.2.0+ are not. Are there any
>>> special instructions for setting up a Spark cluster on the later
>>> versions? Do you know if there is anything I'm missing?
>>>
>>> Thank you for your help,
>>> Eason
>>>
>>> On Mon, Mar 16, 2015 at 11:51 PM, Akhil Das <ak...@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> Could you tell me everything you did to change the version of Spark?
>>>>
>>>> Can you fire up a spark-shell, run this line, and see what happens:
>>>>
>>>> sc.parallelize(1 to 10000).collect()
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Mon, Mar 16, 2015 at 11:13 PM, Eason Hu <eas...@gmail.com> wrote:
>>>>
>>>>> Hi Akhil,
>>>>>
>>>>> Yes, I did change both versions, on the project and on the cluster.
>>>>> Any clues?
>>>>>
>>>>> Even the sample code from the Spark website failed to work.
>>>>>
>>>>> Thanks,
>>>>> Eason
>>>>>
>>>>> On Sun, Mar 15, 2015 at 11:56 PM, Akhil Das
>>>>> <ak...@sigmoidanalytics.com> wrote:
>>>>>
>>>>>> Did you change both the versions? The one in the build file of your
>>>>>> project and the Spark version of your cluster?
>>>>>>
>>>>>> Thanks
>>>>>> Best Regards
>>>>>>
>>>>>> On Sat, Mar 14, 2015 at 6:47 AM, EH <eas...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I've been using Spark 1.1.0 for a while, and now would like to
>>>>>>> upgrade to Spark 1.1.1 or above. However, it throws the following
>>>>>>> errors:
>>>>>>>
>>>>>>> 18:05:31.522 [sparkDriver-akka.actor.default-dispatcher-3hread] ERROR
>>>>>>> TaskSchedulerImpl - Lost executor 37 on hcompute001: remote Akka
>>>>>>> client disassociated
>>>>>>> 18:05:31.530 [sparkDriver-akka.actor.default-dispatcher-3hread] WARN
>>>>>>> TaskSetManager - Lost task 0.0 in stage 1.0 (TID 0, hcompute001):
>>>>>>> ExecutorLostFailure (executor lost)
>>>>>>> 18:05:31.567 [sparkDriver-akka.actor.default-dispatcher-2hread] ERROR
>>>>>>> TaskSchedulerImpl - Lost executor 3 on hcompute001: remote Akka
>>>>>>> client disassociated
>>>>>>> 18:05:31.568 [sparkDriver-akka.actor.default-dispatcher-2hread] WARN
>>>>>>> TaskSetManager - Lost task 1.0 in stage 1.0 (TID 1, hcompute001):
>>>>>>> ExecutorLostFailure (executor lost)
>>>>>>> 18:05:31.988 [sparkDriver-akka.actor.default-dispatcher-23hread] ERROR
>>>>>>> TaskSchedulerImpl - Lost executor 24 on hcompute001: remote Akka
>>>>>>> client disassociated
>>>>>>>
>>>>>>> Do you know what may have gone wrong? I didn't change any code, just
>>>>>>> changed the version of Spark.
>>>>>>>
>>>>>>> Thank you all,
>>>>>>> Eason
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Upgrade-from-Spark-1-1-0-to-1-1-1-Issues-tp22045.html
>>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>>>>> For additional commands, e-mail: user-h...@spark.apache.org
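A quick way to confirm the serialVersionUID mismatch quoted earlier in the thread is to ask each side which UID its copy of the class carries. A sketch, assuming only that a Spark jar is on the classpath; run it once with the driver's dependencies and once with the jars deployed on the cluster:

```scala
import java.io.ObjectStreamClass
import org.apache.spark.storage.BlockManagerId

// Java serialization stamps each class with a serialVersionUID; when the
// UID in the stream and the UID of the local class differ, deserialization
// fails with the "local class incompatible" InvalidClassException seen in
// the thread above.
val desc = ObjectStreamClass.lookup(classOf[BlockManagerId])
println("BlockManagerId serialVersionUID = " + desc.getSerialVersionUID)
```

If the two runs print different numbers, the driver and the cluster are not running the same Spark build.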