Hi Akhil,

Thank you for your help. I found that the original problem was on my side: I was running the application from IntelliJ and had not reloaded the project after recompiling the jar with Maven. Without the reload, IntelliJ keeps running against a stale cached build, so two different Spark versions ended up in play at once. After reloading the project and rerunning, everything works fine on v1.1.1 and I no longer see the class-incompatibility errors.
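In case anyone else hits the same thing, this is the small sanity check I've now put at the top of my driver so a stale jar shows up immediately. It's only a sketch (the object name is mine, and BlockManagerId is an arbitrary pick, chosen because it's the class from the earlier exception); sc.version reports whatever Spark build is actually on the classpath, and the reflection call shows which jar it came from:

import org.apache.spark.{SparkConf, SparkContext}

object SparkVersionCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("version-check").setMaster("local[*]"))
    // Version string compiled into the Spark jar the IDE actually loaded.
    println("Spark version on classpath: " + sc.version)
    // Physical jar BlockManagerId was loaded from; a stale path here means
    // the project is still running against an old cached artifact.
    val source = Class.forName("org.apache.spark.storage.BlockManagerId")
      .getProtectionDomain.getCodeSource
    println("BlockManagerId loaded from: " +
      (if (source != null) source.getLocation else "<bootstrap classloader>"))
    sc.stop()
  }
}

If the printed version or jar path isn't what Maven just built, the IDE is still on the cached copy.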
However, I now encounter a new issue on v1.2.0 and above:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/03/19 01:10:17 INFO CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/03/19 01:10:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/19 01:10:17 INFO SecurityManager: Changing view acls to: hduser,eason.hu
15/03/19 01:10:17 INFO SecurityManager: Changing modify acls to: hduser,eason.hu
15/03/19 01:10:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hduser, eason.hu); users with modify permissions: Set(hduser, eason.hu)
15/03/19 01:10:18 INFO Slf4jLogger: Slf4jLogger started
15/03/19 01:10:18 INFO Remoting: Starting remoting
15/03/19 01:10:18 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@hduser-07:59122]
15/03/19 01:10:18 INFO Utils: Successfully started service 'driverPropsFetcher' on port 59122.
15/03/19 01:10:21 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@192.168.1.53:65001] has failed, address is now gated for [5000] ms. Reason is: [Association failed with [akka.tcp://sparkDriver@192.168.1.53:65001]].
15/03/19 01:10:48 ERROR UserGroupInformation: PriviledgedActionException as:eason.hu (auth:SIMPLE) cause:java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1421)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:128)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:224)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.security.PrivilegedActionException: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:144)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
        ... 7 more

Do you have any clues as to why this happens only on v1.2.0 and above? Nothing else changed.
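For what it's worth, the gated association to akka.tcp://sparkDriver@192.168.1.53:65001 makes me suspect the executors cannot connect back to the driver JVM running inside IntelliJ, and the doAs timeout is just the fallout. Below is a minimal sketch of the driver conf I plan to try next, assuming that is the cause; the master URL is a placeholder for my cluster, and 192.168.1.53 is my workstation's address:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("remote-pi")
  .setMaster("spark://<master-host>:7077")  // placeholder for the real master URL
  // Advertise an address the worker nodes can route back to; otherwise the
  // driver may bind to a hostname the cluster cannot resolve.
  .set("spark.driver.host", "192.168.1.53")
  // Pin the driver port so it can be opened in the firewall explicitly
  // (by default Spark picks a random port on every run).
  .set("spark.driver.port", "65001")
val sc = new SparkContext(conf)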
Thanks,
Eason

On Tue, Mar 17, 2015 at 8:39 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> It's clearly saying:
>
> java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId;
> local class incompatible: stream classdesc serialVersionUID =
> 2439208141545036836, local class serialVersionUID = -7366074099953117729
>
> That is a version incompatibility; can you double-check your version?
>
> On 18 Mar 2015 06:08, "Eason Hu" <eas...@gmail.com> wrote:
>
>> Hi Akhil,
>>
>> sc.parallelize(1 to 10000).collect() in the Spark shell on Spark v1.2.0
>> runs fine. However, if I run the following remotely, it throws an
>> exception:
>>
>> val sc: SparkContext = new SparkContext(conf)
>>
>> val NUM_SAMPLES = 10
>> val count = sc.parallelize(1 to NUM_SAMPLES).map { i =>
>>   val x = Math.random()
>>   val y = Math.random()
>>   if (x * x + y * y < 1) 1 else 0
>> }.reduce(_ + _)
>> println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)
>>
>> Exception:
>> 15/03/17 17:33:52 ERROR scheduler.TaskSchedulerImpl: Lost executor 1 on
>> hcompute32228.sjc9.service-now.com: remote Akka client disassociated
>> 15/03/17 17:33:52 INFO scheduler.TaskSetManager: Re-queueing tasks for 1
>> from TaskSet 0.0
>> 15/03/17 17:33:52 WARN scheduler.TaskSetManager: Lost task 1.1 in stage
>> 0.0 (TID 3, hcompute32228): ExecutorLostFailure (executor lost)
>> 15/03/17 17:33:52 INFO scheduler.DAGScheduler: Executor lost: 1 (epoch 3)
>> 15/03/17 17:33:52 INFO storage.BlockManagerMasterActor: Trying to remove
>> executor 1 from BlockManagerMaster.
>> 15/03/17 17:33:52 INFO storage.BlockManagerMaster: Removed 1 successfully
>> in removeExecutor
>> 15/03/17 17:34:39 ERROR Remoting:
>> org.apache.spark.storage.BlockManagerId; local class incompatible: stream
>> classdesc serialVersionUID = 2439208141545036836, local class
>> serialVersionUID = -7366074099953117729
>> java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId;
>> local class incompatible: stream classdesc serialVersionUID =
>> 2439208141545036836, local class serialVersionUID = -7366074099953117729
>>         at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604)
>>         at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620)
>>         at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515)
>>
>> v1.1.0 is totally fine, but v1.1.1 and v1.2.0+ are not. Are there any
>> special instructions for setting up a Spark cluster on the later
>> versions? Do you know if there is anything I'm missing?
>>
>> Thank you for your help,
>> Eason
>>
>> On Mon, Mar 16, 2015 at 11:51 PM, Akhil Das <ak...@sigmoidanalytics.com>
>> wrote:
>>
>>> Could you tell me what all you did to change the version of Spark?
>>>
>>> Can you fire up a spark-shell, run this line, and see what happens:
>>>
>>> sc.parallelize(1 to 10000).collect()
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Mon, Mar 16, 2015 at 11:13 PM, Eason Hu <eas...@gmail.com> wrote:
>>>
>>>> Hi Akhil,
>>>>
>>>> Yes, I did change both versions, on the project and on the cluster.
>>>> Any clues?
>>>>
>>>> Even the sample code from the Spark website failed to work.
>>>>
>>>> Thanks,
>>>> Eason
>>>>
>>>> On Sun, Mar 15, 2015 at 11:56 PM, Akhil Das
>>>> <ak...@sigmoidanalytics.com> wrote:
>>>>
>>>>> Did you change both the versions? The one in the build file of your
>>>>> project and the Spark version of your cluster?
>>>>>
>>>>> Thanks
>>>>> Best Regards
>>>>>
>>>>> On Sat, Mar 14, 2015 at 6:47 AM, EH <eas...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've been using Spark 1.1.0 for a while, and now would like to
>>>>>> upgrade to Spark 1.1.1 or above. However, it throws the following
>>>>>> errors:
>>>>>>
>>>>>> 18:05:31.522 [sparkDriver-akka.actor.default-dispatcher-3hread] ERROR
>>>>>> TaskSchedulerImpl - Lost executor 37 on hcompute001: remote Akka
>>>>>> client disassociated
>>>>>> 18:05:31.530 [sparkDriver-akka.actor.default-dispatcher-3hread] WARN
>>>>>> TaskSetManager - Lost task 0.0 in stage 1.0 (TID 0, hcompute001):
>>>>>> ExecutorLostFailure (executor lost)
>>>>>> 18:05:31.567 [sparkDriver-akka.actor.default-dispatcher-2hread] ERROR
>>>>>> TaskSchedulerImpl - Lost executor 3 on hcompute001: remote Akka
>>>>>> client disassociated
>>>>>> 18:05:31.568 [sparkDriver-akka.actor.default-dispatcher-2hread] WARN
>>>>>> TaskSetManager - Lost task 1.0 in stage 1.0 (TID 1, hcompute001):
>>>>>> ExecutorLostFailure (executor lost)
>>>>>> 18:05:31.988 [sparkDriver-akka.actor.default-dispatcher-23hread] ERROR
>>>>>> TaskSchedulerImpl - Lost executor 24 on hcompute001: remote Akka
>>>>>> client disassociated
>>>>>>
>>>>>> Do you know what may have gone wrong? I didn't change any code, just
>>>>>> changed the version of Spark.
>>>>>>
>>>>>> Thank you all,
>>>>>> Eason
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Upgrade-from-Spark-1-1-0-to-1-1-1-Issues-tp22045.html
>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.