Is the class that cannot be found included in the wikipediapagerank jar?

TD


On Wed, Jul 16, 2014 at 12:32 AM, Hao Wang <wh.s...@gmail.com> wrote:

> Thanks for your reply. The SparkContext is configured as below:
>
>     sparkConf.setAppName("WikipediaPageRank")
>     sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>     sparkConf.set("spark.kryo.registrator", classOf[PRKryoRegistrator].getName)
>
>     val inputFile = args(0)
>     val threshold = args(1).toDouble
>     val numPartitions = args(2).toInt
>     val usePartitioner = args(3).toBoolean
>
>     sparkConf.setAppName("WikipediaPageRank")
>     sparkConf.set("spark.executor.memory", "60g")
>     sparkConf.set("spark.cores.max", "48")
>     sparkConf.set("spark.kryoserializer.buffer.mb", "24")
>
>     val sc = new SparkContext(sparkConf)
>     sc.addJar("~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar")
>
> And I use spark-submit to run the application:
>
>
> ./bin/spark-submit --master spark://sing12:7077 --total-executor-cores 40 \
>   --executor-memory 40g \
>   --class org.apache.spark.examples.bagel.WikipediaPageRank \
>   ~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar \
>   hdfs://192.168.1.12:9000/freebase-26G 1 200 True
>
>
>
> Regards,
> Wang Hao(王灏)
>
> CloudTeam | School of Software Engineering
> Shanghai Jiao Tong University
> Address:800 Dongchuan Road, Minhang District, Shanghai, 200240
> Email:wh.s...@gmail.com
>
>
> On Wed, Jul 16, 2014 at 1:41 PM, Tathagata Das <
> tathagata.das1...@gmail.com> wrote:
>
>> Are you using classes from external libraries that have not been added to
>> the SparkContext with SparkContext.addJar()?
>>
>> TD
>>
>>
>> On Tue, Jul 15, 2014 at 8:36 PM, Hao Wang <wh.s...@gmail.com> wrote:
>>
>>> I am running the WikipediaPageRank example in Spark and have the same
>>> problem as you:
>>>
>>> 14/07/16 11:31:06 DEBUG DAGScheduler: submitStage(Stage 6)
>>> 14/07/16 11:31:06 ERROR TaskSetManager: Task 6.0:450 failed 4 times; aborting job
>>> 14/07/16 11:31:06 INFO DAGScheduler: Failed to run foreach at Bagel.scala:251
>>> Exception in thread "main" 14/07/16 11:31:06 INFO TaskSchedulerImpl: Cancelling stage 6
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 6.0:450 failed 4 times, most recent failure: Exception failure in TID 1330 on host sing11: com.esotericsoftware.kryo.KryoException: Unable to find class: arl Fridtjof Rode
>>>         com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>>>         com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>>>         com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
>>>         com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
>>>         com.twitter.chill.TraversableSerializer.read(Traversable.scala:44)
>>>         com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
>>>         com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>         org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:115)
>>>         org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
>>>         org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
>>>         org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>>         scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>>         org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
>>>         org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96)
>>>         org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95)
>>>         org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
>>>
>>> Could anyone help?
>>>
>>> Regards,
>>> Wang Hao(王灏)
>>>
>>>
>>> On Tue, Jun 3, 2014 at 8:02 PM, Denes <te...@outlook.com> wrote:
>>>
>>>> I tried to use Kryo as the serialiser in Spark Streaming and did everything
>>>> according to the guide posted on the Spark website, i.e. added the
>>>> following lines:
>>>>
>>>> conf.set("spark.serializer",
>>>> "org.apache.spark.serializer.KryoSerializer");
>>>> conf.set("spark.kryo.registrator", "MyKryoRegistrator");
>>>>
>>>> I also added the necessary classes to the MyKryoRegistrator.
>>>>
>>>> However, I get the following strange error. Can someone point me to where
>>>> to look for a solution?
>>>>
>>>> 14/06/03 09:00:49 ERROR scheduler.JobScheduler: Error running job streaming job 1401778800000 ms.0
>>>> org.apache.spark.SparkException: Job aborted due to stage failure: Exception while deserializing and fetching task: com.esotericsoftware.kryo.KryoException: Unable to find class: J
>>>> Serialization trace:
>>>> id (org.apache.spark.storage.GetBlock)
>>>>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
>>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
>>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
>>>>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
>>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>>>>         at scala.Option.foreach(Option.scala:236)
>>>>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
>>>>         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
>>>>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>>>         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>>>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>>>         at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>>>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>>>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Kyro-deserialisation-error-tp6798.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>
>>>
>>>
>>
>
