Are you submitting your application from a local machine to a remote host? If you want to run the Spark application from a remote machine, you have to at least set the following configurations properly:
- *spark.driver.host* - points to the IP/host from which you are submitting
the job (make sure the cluster is able to ping this address)
- *spark.driver.port* - set it to a port number that is accessible from the
Spark cluster.

You can look at more configuration options over here:
<http://spark.apache.org/docs/latest/configuration.html#networking>

Thanks
Best Regards

On Fri, Mar 20, 2015 at 4:02 AM, Eason Hu <eas...@gmail.com> wrote:

> Hi Akhil,
>
> Thank you for your help. I just found that the problem was related to my
> local Spark application: I ran it in IntelliJ and didn't reload the
> project after recompiling the jar via Maven. Without the reload, IntelliJ
> used locally cached build output to run the application, which led to two
> different versions being in play. After I reloaded the project and reran,
> it ran fine on v1.1.1 and I no longer saw the class-incompatibility
> issues.
>
> However, I now encounter a new issue starting from v1.2.0 and above:
>
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 15/03/19 01:10:17 INFO CoarseGrainedExecutorBackend: Registered signal
> handlers for [TERM, HUP, INT]
> 15/03/19 01:10:17 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform...
> using builtin-java classes where applicable
> 15/03/19 01:10:17 INFO SecurityManager: Changing view acls to:
> hduser,eason.hu
> 15/03/19 01:10:17 INFO SecurityManager: Changing modify acls to:
> hduser,eason.hu
> 15/03/19 01:10:17 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users with view permissions: Set(hduser,
> eason.hu); users with modify permissions: Set(hduser, eason.hu)
> 15/03/19 01:10:18 INFO Slf4jLogger: Slf4jLogger started
> 15/03/19 01:10:18 INFO Remoting: Starting remoting
> 15/03/19 01:10:18 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://driverPropsFetcher@hduser-07:59122]
> 15/03/19 01:10:18 INFO Utils: Successfully started service
> 'driverPropsFetcher' on port 59122.
> 15/03/19 01:10:21 WARN ReliableDeliverySupervisor: Association with remote
> system [akka.tcp://sparkDriver@192.168.1.53:65001] has failed, address is
> now gated for [5000] ms. Reason is: [Association failed with
> [akka.tcp://sparkDriver@192.168.1.53:65001]].
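The gated-association warning above, where the executor cannot reach akka.tcp://sparkDriver@192.168.1.53:65001, is the symptom the two driver settings address. A minimal sketch of setting them before creating the context; the app name and master URL are placeholders, while the host and port mirror the driver address in the log line above:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// App name and master URL are placeholders; the host/port mirror the
// sparkDriver address the executor failed to reach in the log above.
val conf = new SparkConf()
  .setAppName("remote-submit-test")
  .setMaster("spark://your-master:7077")
  // IP/host of the submitting machine, reachable (pingable) from the cluster
  .set("spark.driver.host", "192.168.1.53")
  // a fixed port that the cluster nodes can connect back to
  .set("spark.driver.port", "65001")

val sc = new SparkContext(conf)
```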
> 15/03/19 01:10:48 ERROR UserGroupInformation: PriviledgedActionException
> as:eason.hu (auth:SIMPLE) cause:java.util.concurrent.TimeoutException:
> Futures timed out after [30 seconds]
> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException:
> Unknown exception in doAs
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1421)
>   at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
>   at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:128)
>   at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:224)
>   at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> Caused by: java.security.PrivilegedActionException:
> java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>   ... 4 more
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
> [30 seconds]
>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>   at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>   at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>   at scala.concurrent.Await$.result(package.scala:107)
>   at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:144)
>   at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
>   at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
>   ... 7 more
>
> Do you have any clues why it happens only after v1.2.0 and above?
> Nothing else changes.
>
> Thanks,
> Eason
>
> On Tue, Mar 17, 2015 at 8:39 PM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> It's clearly saying:
>>
>> java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId;
>> local class incompatible: stream classdesc serialVersionUID =
>> 2439208141545036836, local class serialVersionUID = -7366074099953117729
>>
>> That is a version incompatibility; can you double-check your versions?
>>
>> On 18 Mar 2015 06:08, "Eason Hu" <eas...@gmail.com> wrote:
>>
>>> Hi Akhil,
>>>
>>> sc.parallelize(1 to 10000).collect() in the Spark shell on Spark v1.2.0
>>> runs fine. However, if I run the following remotely, it throws an
>>> exception:
>>>
>>> val sc: SparkContext = new SparkContext(conf)
>>>
>>> val NUM_SAMPLES = 10
>>> val count = sc.parallelize(1 to NUM_SAMPLES).map { i =>
>>>   val x = Math.random()
>>>   val y = Math.random()
>>>   if (x * x + y * y < 1) 1 else 0
>>> }.reduce(_ + _)
>>> println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)
>>>
>>> Exception:
>>> 15/03/17 17:33:52 ERROR scheduler.TaskSchedulerImpl: Lost executor 1 on
>>> hcompute32228.sjc9.service-now.com: remote Akka client disassociated
>>> 15/03/17 17:33:52 INFO scheduler.TaskSetManager: Re-queueing tasks for 1
>>> from TaskSet 0.0
>>> 15/03/17 17:33:52 WARN scheduler.TaskSetManager: Lost task 1.1 in stage
>>> 0.0 (TID 3, hcompute32228): ExecutorLostFailure (executor lost)
>>> 15/03/17 17:33:52 INFO scheduler.DAGScheduler: Executor lost: 1 (epoch 3)
>>> 15/03/17 17:33:52 INFO storage.BlockManagerMasterActor: Trying to remove
>>> executor 1 from BlockManagerMaster.
>>> 15/03/17 17:33:52 INFO storage.BlockManagerMaster: Removed 1
>>> successfully in removeExecutor
>>> 15/03/17 17:34:39 ERROR Remoting:
>>> org.apache.spark.storage.BlockManagerId; local class incompatible:
>>> stream classdesc serialVersionUID = 2439208141545036836, local class
>>> serialVersionUID = -7366074099953117729
>>> java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId;
>>> local class incompatible: stream classdesc serialVersionUID =
>>> 2439208141545036836, local class serialVersionUID = -7366074099953117729
>>>   at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604)
>>>   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620)
>>>   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515)
>>>
>>> v1.1.0 is totally fine, but v1.1.1 and v1.2.0+ are not. Are there any
>>> special instructions for setting up a Spark cluster on the later
>>> versions? Do you know if there is anything I'm missing?
>>>
>>> Thank you for your help,
>>> Eason
>>>
>>> On Mon, Mar 16, 2015 at 11:51 PM, Akhil Das <ak...@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> Could you tell me everything you did to change the version of Spark?
>>>>
>>>> Can you fire up a spark-shell, run this line, and see what happens:
>>>>
>>>> sc.parallelize(1 to 10000).collect()
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Mon, Mar 16, 2015 at 11:13 PM, Eason Hu <eas...@gmail.com> wrote:
>>>>
>>>>> Hi Akhil,
>>>>>
>>>>> Yes, I did change both versions, on the project and on the cluster.
>>>>> Any clues?
>>>>>
>>>>> Even the sample code from the Spark website failed to work.
>>>>>
>>>>> Thanks,
>>>>> Eason
>>>>>
>>>>> On Sun, Mar 15, 2015 at 11:56 PM, Akhil Das
>>>>> <ak...@sigmoidanalytics.com> wrote:
>>>>>
>>>>>> Did you change both the versions? The one in the build file of your
>>>>>> project and the Spark version of your cluster?
>>>>>>
>>>>>> Thanks
>>>>>> Best Regards
>>>>>>
>>>>>> On Sat, Mar 14, 2015 at 6:47 AM, EH <eas...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I've been using Spark 1.1.0 for a while, and now would like to
>>>>>>> upgrade to Spark 1.1.1 or above. However, it throws the following
>>>>>>> errors:
>>>>>>>
>>>>>>> 18:05:31.522 [sparkDriver-akka.actor.default-dispatcher-3hread] ERROR
>>>>>>> TaskSchedulerImpl - Lost executor 37 on hcompute001: remote Akka
>>>>>>> client disassociated
>>>>>>> 18:05:31.530 [sparkDriver-akka.actor.default-dispatcher-3hread] WARN
>>>>>>> TaskSetManager - Lost task 0.0 in stage 1.0 (TID 0, hcompute001):
>>>>>>> ExecutorLostFailure (executor lost)
>>>>>>> 18:05:31.567 [sparkDriver-akka.actor.default-dispatcher-2hread] ERROR
>>>>>>> TaskSchedulerImpl - Lost executor 3 on hcompute001: remote Akka
>>>>>>> client disassociated
>>>>>>> 18:05:31.568 [sparkDriver-akka.actor.default-dispatcher-2hread] WARN
>>>>>>> TaskSetManager - Lost task 1.0 in stage 1.0 (TID 1, hcompute001):
>>>>>>> ExecutorLostFailure (executor lost)
>>>>>>> 18:05:31.988 [sparkDriver-akka.actor.default-dispatcher-23hread] ERROR
>>>>>>> TaskSchedulerImpl - Lost executor 24 on hcompute001: remote Akka
>>>>>>> client disassociated
>>>>>>>
>>>>>>> Do you know what may have gone wrong? I didn't change any code, just
>>>>>>> changed the version of Spark.
>>>>>>>
>>>>>>> Thank you all,
>>>>>>> Eason
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Upgrade-from-Spark-1-1-0-to-1-1-1-Issues-tp22045.html
>>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>>>>> For additional commands, e-mail: user-h...@spark.apache.org
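A quick way to confirm the serialVersionUID mismatch quoted earlier in the thread is to ask each side which UID its copy of the class carries. A sketch, assuming only that a Spark jar is on the classpath; run it once with the driver's dependencies and once with the jars deployed on the cluster:

```scala
import java.io.ObjectStreamClass
import org.apache.spark.storage.BlockManagerId

// Java serialization stamps each class with a serialVersionUID; when the
// UID in the stream and the UID of the local class differ, deserialization
// fails with the "local class incompatible" InvalidClassException seen in
// the thread above.
val desc = ObjectStreamClass.lookup(classOf[BlockManagerId])
println("BlockManagerId serialVersionUID = " + desc.getSerialVersionUID)
```

If the two runs print different numbers, the driver and the cluster are not running the same Spark build.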