Is the class that is not found present in the wikipediapagerank jar?

TD
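[Editor's note: one way to check TD's question is to list the .class entries of the application jar and look for the type Kryo reports as missing. A rough sketch follows; the default jar path is only a placeholder, not taken from the thread:]

    // Sketch: print the .class entries of an application jar so the missing
    // type can be searched for. Pass the jar path as the first argument.
    import java.util.jar.JarFile
    import scala.collection.JavaConverters._

    object ListJarClasses {
      def main(args: Array[String]): Unit = {
        val jarPath = args.headOption.getOrElse(
          "target/scala-2.10/wikipagerank_2.10-1.0.jar") // placeholder path
        new JarFile(jarPath).entries().asScala
          .map(_.getName)
          .filter(_.endsWith(".class"))
          .foreach(println)
      }
    }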
On Wed, Jul 16, 2014 at 12:32 AM, Hao Wang <wh.s...@gmail.com> wrote:

> Thanks for your reply. The SparkContext is configured as below:
>
>     sparkConf.setAppName("WikipediaPageRank")
>     sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>     sparkConf.set("spark.kryo.registrator", classOf[PRKryoRegistrator].getName)
>
>     val inputFile = args(0)
>     val threshold = args(1).toDouble
>     val numPartitions = args(2).toInt
>     val usePartitioner = args(3).toBoolean
>
>     sparkConf.setAppName("WikipediaPageRank")
>     sparkConf.set("spark.executor.memory", "60g")
>     sparkConf.set("spark.cores.max", "48")
>     sparkConf.set("spark.kryoserializer.buffer.mb", "24")
>     val sc = new SparkContext(sparkConf)
>
>     sc.addJar("~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar")
>
> And I use spark-submit to run the application:
>
>     ./bin/spark-submit --master spark://sing12:7077 --total-executor-cores 40 \
>       --executor-memory 40g --class org.apache.spark.examples.bagel.WikipediaPageRank \
>       ~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar \
>       hdfs://192.168.1.12:9000/freebase-26G 1 200 True
>
> Regards,
> Wang Hao(王灏)
>
> CloudTeam | School of Software Engineering
> Shanghai Jiao Tong University
> Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
> Email: wh.s...@gmail.com
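[Editor's note: a registrator such as the PRKryoRegistrator configured above implements org.apache.spark.serializer.KryoRegistrator. A minimal sketch of the general shape is below; the classes registered here are illustrative placeholders, not necessarily the ones the real PRKryoRegistrator registers:]

    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.serializer.KryoRegistrator

    // Sketch of a Kryo registrator: every custom type that Spark ships between
    // nodes should be registered here. The types below are placeholders.
    class PageRankRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo) {
        kryo.register(classOf[Array[String]])    // e.g. a vertex's outgoing links
        kryo.register(classOf[(String, Double)]) // e.g. (title, rank) pairs
      }
    }

[Whatever the registrator registers, both the registrator itself and those classes have to be in a jar that reaches the executors, which is what TD's question above is probing.]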
> On Wed, Jul 16, 2014 at 1:41 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
>
>> Are you using classes from external libraries that have not been added to
>> the SparkContext, using sparkContext.addJar()?
>>
>> TD
>>
>>
>> On Tue, Jul 15, 2014 at 8:36 PM, Hao Wang <wh.s...@gmail.com> wrote:
>>
>>> I am running the WikipediaPageRank example in Spark and share the same
>>> problem with you:
>>>
>>> 14/07/16 11:31:06 DEBUG DAGScheduler: submitStage(Stage 6)
>>> 14/07/16 11:31:06 ERROR TaskSetManager: Task 6.0:450 failed 4 times; aborting job
>>> 14/07/16 11:31:06 INFO DAGScheduler: Failed to run foreach at Bagel.scala:251
>>> Exception in thread "main" 14/07/16 11:31:06 INFO TaskSchedulerImpl: Cancelling stage 6
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>>> 6.0:450 failed 4 times, most recent failure: Exception failure in TID 1330
>>> on host sing11: com.esotericsoftware.kryo.KryoException: Unable to find
>>> class: arl Fridtjof Rode
>>>     com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>>>     com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>>>     com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
>>>     com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
>>>     com.twitter.chill.TraversableSerializer.read(Traversable.scala:44)
>>>     com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
>>>     com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>     org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:115)
>>>     org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
>>>     org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
>>>     org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>>     scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>>     org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
>>>     org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96)
>>>     org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95)
>>>     org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
>>>
>>> Could anyone help?
>>>
>>> Regards,
>>> Wang Hao(王灏)
>>>
>>> CloudTeam | School of Software Engineering
>>> Shanghai Jiao Tong University
>>> Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
>>> Email: wh.s...@gmail.com
>>>
>>>
>>> On Tue, Jun 3, 2014 at 8:02 PM, Denes <te...@outlook.com> wrote:
>>>
>>>> I tried to use Kryo as a serialiser in Spark Streaming and did everything
>>>> according to the guide posted on the Spark website, i.e. added the
>>>> following lines:
>>>>
>>>>     conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
>>>>     conf.set("spark.kryo.registrator", "MyKryoRegistrator");
>>>>
>>>> I also added the necessary classes to the MyKryoRegistrator.
>>>>
>>>> However, I get the following strange error. Can someone help me out with
>>>> where to look for a solution?
>>>>
>>>> 14/06/03 09:00:49 ERROR scheduler.JobScheduler: Error running job streaming
>>>> job 1401778800000 ms.0
>>>> org.apache.spark.SparkException: Job aborted due to stage failure: Exception
>>>> while deserializing and fetching task:
>>>> com.esotericsoftware.kryo.KryoException: Unable to find class: J
>>>> Serialization trace:
>>>> id (org.apache.spark.storage.GetBlock)
>>>>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
>>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
>>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
>>>>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
>>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>>>>     at scala.Option.foreach(Option.scala:236)
>>>>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
>>>>     at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
>>>>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>>>     at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>>>     at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>>>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>>>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Kyro-deserialisation-error-tp6798.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
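[Editor's note: for comparison with the Java-style snippet in Denes's message, here is a minimal Scala sketch of the same kind of setup for a streaming job. The registrator name, the registered class, and the socket source are assumptions made for illustration, not taken from his code:]

    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.SparkConf
    import org.apache.spark.serializer.KryoRegistrator
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Hypothetical registrator: register each custom class the job serialises.
    class MyKryoRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo) {
        kryo.register(classOf[Array[Byte]]) // placeholder for application types
      }
    }

    object StreamingWithKryo {
      def main(args: Array[String]) {
        val conf = new SparkConf()
          .setAppName("StreamingWithKryo")
          .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
          // The registrator class name must be resolvable on the executors.
          .set("spark.kryo.registrator", classOf[MyKryoRegistrator].getName)
        val ssc = new StreamingContext(conf, Seconds(1))
        // Placeholder input source and output operation to make the job complete.
        ssc.socketTextStream("localhost", 9999).count().print()
        ssc.start()
        ssc.awaitTermination()
      }
    }

[As with the batch example earlier in the thread, the jar containing the registrator and the registered classes must reach the executors (e.g. via spark-submit or addJar) for Kryo to resolve them.]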