Yeah, CDH distribution (1.1).

On Wed Nov 19 2014 at 5:29:39 PM Marcelo Vanzin <van...@cloudera.com> wrote:
> On Wed, Nov 19, 2014 at 2:13 PM, Anson Abraham <anson.abra...@gmail.com> wrote:
> > yeah but in this case i'm not building any files. just deployed out config
> > files in CDH5.2 and initiated a spark-shell to just read and output a file.
>
> In that case it is a little bit weird. Just to be sure, you are using
> CDH's version of Spark, not trying to run an Apache Spark release on
> top of CDH, right? (If that's the case, then we could probably move
> this conversation to cdh-us...@cloudera.org, since it would be
> CDH-specific.)
>
> > On Wed Nov 19 2014 at 4:52:51 PM Marcelo Vanzin <van...@cloudera.com> wrote:
> >>
> >> Hi Anson,
> >>
> >> We've seen this error when incompatible classes are used in the driver
> >> and executors (e.g., same class name, but the classes are different
> >> and thus the serialized data is different). This can happen, for
> >> example, if you're including some 3rd-party libraries in your app's
> >> jar, or changing the driver/executor class paths to include these
> >> conflicting libraries.
> >>
> >> Can you clarify whether any of the above apply to your case?
> >>
> >> (For example, one easy way to trigger this is to add the
> >> spark-examples jar shipped with CDH5.2 to the classpath of your
> >> driver. That's one of the reasons I filed SPARK-4048, but I digress.)
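[Editor's note: Marcelo's diagnosis — same class name, different bytecode on the driver and executor sides — can be checked directly by asking each JVM where it loaded a suspect class from. A minimal sketch (the class name `WhichJar` and the idea of running it on both sides are illustrative, not part of the thread; bootstrap classes such as `java.lang.String` have no `CodeSource` and report null):]

```java
import java.security.CodeSource;

// Sketch: report where a class was loaded from. Running this for a suspect
// class in both the driver and an executor shows whether the two JVMs
// resolve it to the same jar on disk.
public class WhichJar {
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Bootstrap-classpath classes (e.g. java.lang.String) have no CodeSource.
        return src == null ? "bootstrap classloader" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("java.lang.String -> " + locationOf(String.class));
        System.out.println("WhichJar         -> " + locationOf(WhichJar.class));
    }
}
```

If the driver and an executor print different locations (or different jars) for the same Spark class, that points at the kind of classpath conflict described above.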
> >>
> >> On Tue, Nov 18, 2014 at 1:59 PM, Anson Abraham <anson.abra...@gmail.com> wrote:
> >> > I'm essentially loading a file and saving output to another location:
> >> >
> >> >   val source = sc.textFile("/tmp/testfile.txt")
> >> >   source.saveAsTextFile("/tmp/testsparkoutput")
> >> >
> >> > when i do so, i'm hitting this error:
> >> >
> >> > 14/11/18 21:15:08 INFO DAGScheduler: Failed to run saveAsTextFile at <console>:15
> >> > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
> >> > stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0
> >> > (TID 6, cloudera-1.testdomain.net): java.lang.IllegalStateException: unread block data
> >> >         java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
> >> >         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
> >> >         java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> >> >         java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> >> >         java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> >> >         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> >> >         java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> >> >         org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
> >> >         org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
> >> >         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:162)
> >> >         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> >         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> >         java.lang.Thread.run(Thread.java:744)
> >> > Driver stacktrace:
> >> >         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
> >> >         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
> >> >         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
> >> >         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> >> >         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> >> >         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
> >> >         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
> >> >         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
> >> >         at scala.Option.foreach(Option.scala:236)
> >> >         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
> >> >         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
> >> >         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> >> >         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> >> >         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> >> >         at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> >> >         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> >> >         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >> >         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >> >         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >> >         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> >> >
> >> > Can't figure out what the issue is. I'm running CDH5.2 with Spark 1.1.
> >> > The file I'm loading is literally just 7 MB. I thought it was a jar
> >> > file mismatch, but I did a compare and see they're all identical. But
> >> > seeing as how they were all installed through CDH parcels, not sure how
> >> > there would be a version mismatch between the nodes and the master. Oh
> >> > yeah: 1 master node w/ 2 worker nodes, running standalone, not through
> >> > YARN. So just in case, I copied the jars from the master to the 2
> >> > worker nodes, and still the same issue.
> >> > Weird thing is, the first time I installed and tested it out, it
> >> > worked, but now it doesn't.
> >> >
> >> > Any help here would be greatly appreciated.
> >>
> >>
> >> --
> >> Marcelo
>
>
> --
> Marcelo
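[Editor's note: Anson's jar comparison across master and workers can be made systematic: print a sorted "digest  filename" list on each node and diff the outputs. A hedged sketch — the class name `JarDigests` and the CDH parcel path are assumptions, not something from the thread; pass the real jar directory as an argument:]

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.util.stream.Stream;

// Sketch: print "md5  filename" for every jar in a directory, sorted, so the
// output from the master and each worker can be compared with diff. Any
// differing digest means the nodes are not running identical bytecode.
public class JarDigests {
    static String md5Hex(Path p) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(p));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Default path is an assumed CDH parcel layout; override via args[0].
        Path dir = Paths.get(args.length > 0 ? args[0]
                : "/opt/cloudera/parcels/CDH/lib/spark/lib");
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(p -> p.toString().endsWith(".jar"))
                 .sorted()
                 .forEach(p -> {
                     try {
                         System.out.println(md5Hex(p) + "  " + p.getFileName());
                     } catch (Exception e) {
                         throw new RuntimeException(e);
                     }
                 });
        }
    }
}
```

Run it on each node, collect the outputs on one machine, and `diff` them; identical lists rule out on-disk jar drift (though not classpath-ordering conflicts, which the thread also discusses).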