To me this looks like an internal error in the REPL; I am not sure what is causing it. Personally I never use the REPL. Can you try typing up your program and running it from an IDE or via spark-submit, and see if you still get the same error?

Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
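For the spark-submit route, a minimal invocation might look like the sketch below. The class name, master URL, and jar path are illustrative placeholders, not anything from this thread:

    spark-submit \
      --class com.example.AvroReadJob \
      --master local[2] \
      avro-read-job.jar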
On Mon, Dec 15, 2014 at 4:54 PM, Cristovao Jose Domingues Cordeiro <cristovao.corde...@cern.ch> wrote:
>
> Sure, thanks:
> warning: there were 1 deprecation warning(s); re-run with -deprecation for details
> java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
>   at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:283)
>   at org.apache.hadoop.mapreduce.Job.toString(Job.java:462)
>   at scala.runtime.ScalaRunTime$.scala$runtime$ScalaRunTime$$inner$1(ScalaRunTime.scala:324)
>   at scala.runtime.ScalaRunTime$.stringOf(ScalaRunTime.scala:329)
>   at scala.runtime.ScalaRunTime$.replStringOf(ScalaRunTime.scala:337)
>   at .<init>(<console>:10)
>   at .<clinit>(<console>)
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:846)
>   at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1119)
>   at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:672)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:703)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:667)
>   at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:819)
>   at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:864)
>   at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:776)
>   at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:619)
>   at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:627)
>   at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:632)
>   at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:959)
>   at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:907)
>   at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:907)
>   at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>   at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:907)
>   at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1002)
>   at org.apache.spark.repl.Main$.main(Main.scala:31)
>   at org.apache.spark.repl.Main.main(Main.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:331)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> Could something you omitted from your snippet be causing this exception?
>
> Cumprimentos / Best regards,
> Cristóvão José Domingues Cordeiro
> IT Department - 28/R-018
> CERN
> ------------------------------
> *From:* Simone Franzini [captainfr...@gmail.com]
> *Sent:* 15 December 2014 16:52
> *To:* Cristovao Jose Domingues Cordeiro
> *Subject:* Re: NullPointerException When Reading Avro Sequence Files
>
> Ok, I have no idea what that is. That appears to be an internal Spark exception. Maybe if you can post the entire stack trace it would give some more details to understand what is going on.
>
> Simone Franzini, PhD
> http://www.linkedin.com/in/simonefranzini
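The stack trace above is actually informative: the exception is thrown from Job.toString (via ScalaRunTime.stringOf and replStringOf), i.e. by the shell's automatic result printer, not by the Job constructor itself. Hadoop's Job.toString only works once a job is RUNNING, so printing a freshly created Job (still in state DEFINE) always throws, and the failed print also aborts the binding of jobread. A compiled program never hits this because nothing calls toString on the Job, which is consistent with the spark-submit advice above. In the shell, one workaround sketch is to suppress result printing around the definition (and, assuming a Hadoop 2 client, Job.getInstance() also avoids the deprecation warning from the Job() constructor):

    scala> :silent    // toggle off automatic printing of results
    scala> val jobread = org.apache.hadoop.mapreduce.Job.getInstance()
    scala> :silent    // toggle result printing back on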
> On Mon, Dec 15, 2014 at 4:50 PM, Cristovao Jose Domingues Cordeiro <cristovao.corde...@cern.ch> wrote:
>>
>> Hi,
>>
>> thanks for that. But yeah, the 2nd line is an exception; jobread is not created.
>>
>> Cumprimentos / Best regards,
>> Cristóvão José Domingues Cordeiro
>> IT Department - 28/R-018
>> CERN
>> ------------------------------
>> *From:* Simone Franzini [captainfr...@gmail.com]
>> *Sent:* 15 December 2014 16:39
>> *To:* Cristovao Jose Domingues Cordeiro
>> *Subject:* Re: NullPointerException When Reading Avro Sequence Files
>>
>> I did not mention the imports needed in my code. I think these are all of them:
>>
>> import org.apache.hadoop.mapreduce.Job
>> import org.apache.hadoop.io.NullWritable
>> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
>> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
>> import org.apache.hadoop.fs.{ FileSystem, Path }
>> import org.apache.avro.{ Schema, SchemaBuilder }
>> import org.apache.avro.SchemaBuilder._
>> import org.apache.avro.mapreduce.{ AvroJob, AvroKeyInputFormat, AvroKeyOutputFormat }
>> import org.apache.avro.mapred.AvroKey
>>
>> However, what you mentioned is a warning that I think can be ignored. I don't see any exception.
>>
>> Simone Franzini, PhD
>> http://www.linkedin.com/in/simonefranzini
>>
>> On Mon, Dec 15, 2014 at 3:10 PM, Cristovao Jose Domingues Cordeiro <cristovao.corde...@cern.ch> wrote:
>>>
>>> Hi Simone,
>>>
>>> I was finally able to get the chill package, but still, something unrelated which I cannot run from your snippet is:
>>> val jobread = new Job()
>>>
>>> I get:
>>> warning: there were 1 deprecation warning(s); re-run with -deprecation for details
>>> java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
>>>
>>> Cumprimentos / Best regards,
>>> Cristóvão José Domingues Cordeiro
>>> IT Department - 28/R-018
>>> CERN
>>> ------------------------------
>>> *From:* Simone Franzini [captainfr...@gmail.com]
>>> *Sent:* 09 December 2014 17:06
>>> *To:* Cristovao Jose Domingues Cordeiro; user
>>> *Subject:* Re: NullPointerException When Reading Avro Sequence Files
>>>
>>> You can use this Maven dependency:
>>>
>>> <dependency>
>>>   <groupId>com.twitter</groupId>
>>>   <artifactId>chill-avro</artifactId>
>>>   <version>0.4.0</version>
>>> </dependency>
>>>
>>> Simone Franzini, PhD
>>> http://www.linkedin.com/in/simonefranzini
>>>
>>> On Tue, Dec 9, 2014 at 9:53 AM, Cristovao Jose Domingues Cordeiro <cristovao.corde...@cern.ch> wrote:
>>>>
>>>> Thanks for the reply!
>>>>
>>>> I have in fact tried your code, but I lack the Twitter chill package and I cannot find it online. So I am now trying this:
>>>> http://spark.apache.org/docs/latest/tuning.html#data-serialization
>>>> But in case I can't do it, could you tell me where to get that Twitter package you used?
>>>>
>>>> Thanks
>>>>
>>>> Cumprimentos / Best regards,
>>>> Cristóvão José Domingues Cordeiro
>>>> IT Department - 28/R-018
>>>> CERN
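Pieced together from the imports above, a read of an Avro file through Spark's new Hadoop API might look like the sketch below. This is only an outline: the input path is hypothetical, sc is assumed to be an existing SparkContext, and it assumes generic (GenericRecord) data rather than a specific compiled schema:

    import org.apache.avro.generic.GenericRecord
    import org.apache.avro.mapred.AvroKey
    import org.apache.avro.mapreduce.AvroKeyInputFormat
    import org.apache.hadoop.io.NullWritable

    // Elements come back as (AvroKey[GenericRecord], NullWritable) pairs.
    val avroRdd = sc.newAPIHadoopFile(
      "hdfs:///path/to/input.avro",                 // hypothetical input path
      classOf[AvroKeyInputFormat[GenericRecord]],
      classOf[AvroKey[GenericRecord]],
      classOf[NullWritable])

    // Unwrap the AvroKey to get at the records themselves. Shipping these
    // records across the cluster is what triggers the serialization issues
    // discussed below unless Kryo is configured.
    val records = avroRdd.map(_._1.datum)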
>>>> ------------------------------
>>>> *From:* Simone Franzini [captainfr...@gmail.com]
>>>> *Sent:* 09 December 2014 16:42
>>>> *To:* Cristovao Jose Domingues Cordeiro; user
>>>> *Subject:* Re: NullPointerException When Reading Avro Sequence Files
>>>>
>>>> Hi Cristovao,
>>>>
>>>> I have seen a very similar issue that I have posted about in this thread:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Kryo-NPE-with-Array-td19797.html
>>>> I think your main issue here is somewhat similar, in that the MapWrapper Scala class is not registered. This gets registered by the Twitter chill-scala AllScalaRegistrar class that you are currently not using.
>>>>
>>>> As far as I understand, in order to use Avro with Spark, you also have to use Kryo. This means you have to use the Spark KryoSerializer, which in turn uses Twitter chill. I posted the basic code that I am using here:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-read-this-avro-file-using-spark-amp-scala-td19400.html#a19491
>>>>
>>>> Maybe there is a simpler solution to your problem, but I am not that much of an expert yet. I hope this helps.
>>>>
>>>> Simone Franzini, PhD
>>>> http://www.linkedin.com/in/simonefranzini
>>>>
>>>> On Tue, Dec 9, 2014 at 8:50 AM, Cristovao Jose Domingues Cordeiro <cristovao.corde...@cern.ch> wrote:
>>>>>
>>>>> Hi Simone,
>>>>>
>>>>> thanks, but I don't think that's it. I've tried several libraries with the --jars argument. Some do give what you said, but other times (when I put the right version, I guess) I get the following:
>>>>> 14/12/09 15:45:54 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
>>>>> java.io.NotSerializableException: scala.collection.convert.Wrappers$MapWrapper
>>>>>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
>>>>>   at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1377)
>>>>>
>>>>> Which is odd, since I am reading an Avro file I wrote... with the same piece of code:
>>>>> https://gist.github.com/MLnick/5864741781b9340cb211
>>>>>
>>>>> Cumprimentos / Best regards,
>>>>> Cristóvão José Domingues Cordeiro
>>>>> IT Department - 28/R-018
>>>>> CERN
>>>>> ------------------------------
>>>>> *From:* Simone Franzini [captainfr...@gmail.com]
>>>>> *Sent:* 06 December 2014 15:48
>>>>> *To:* Cristovao Jose Domingues Cordeiro
>>>>> *Subject:* Re: NullPointerException When Reading Avro Sequence Files
>>>>>
>>>>> java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>>>>>
>>>>> That is a sign that you are mixing up versions of Hadoop. This is particularly an issue when dealing with Avro. If you are using Hadoop 2, you will need to get the Hadoop 2 version of avro-mapred. In Maven you would do this with the <classifier>hadoop2</classifier> tag.
>>>>>
>>>>> Simone Franzini, PhD
>>>>> http://www.linkedin.com/in/simonefranzini
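A sketch of the Kryo setup Simone is describing, assuming chill-avro 0.4.0 (and its chill-scala dependency) on the classpath; the registrator class name is made up for the example, and in a real application you would reference it by its fully qualified name:

    import com.esotericsoftware.kryo.Kryo
    import com.twitter.chill.AllScalaRegistrar
    import org.apache.spark.SparkConf
    import org.apache.spark.serializer.KryoRegistrator

    // Registers the Scala collection wrappers (including
    // scala.collection.convert.Wrappers$MapWrapper, which caused the
    // NotSerializableException above) with Kryo.
    class MyKryoRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo): Unit = {
        new AllScalaRegistrar()(kryo)
      }
    }

    // Switch Spark from Java serialization to Kryo and install the registrator.
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", "MyKryoRegistrator")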
>>>>> On Fri, Dec 5, 2014 at 3:52 AM, cjdc <cristovao.corde...@cern.ch> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've tried the above example on Gist, but it doesn't work (at least for me).
>>>>>> Did anyone get this:
>>>>>> 14/12/05 10:44:40 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
>>>>>> java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>>>>>>   at org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
>>>>>>   at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:115)
>>>>>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:103)
>>>>>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
>>>>>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>>>>   at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>>>>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>>>>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>>>>>>   at org.apache.spark.scheduler.Task.run(Task.scala:54)
>>>>>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>   at java.lang.Thread.run(Thread.java:745)
>>>>>> 14/12/05 10:44:40 ERROR ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
>>>>>> java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>>>>>>   at org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
>>>>>>   at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:115)
>>>>>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:103)
>>>>>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
>>>>>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>>>>   at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>>>>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>>>>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>>>>>>   at org.apache.spark.scheduler.Task.run(Task.scala:54)
>>>>>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>   at java.lang.Thread.run(Thread.java:745)
>>>>>> 14/12/05 10:44:40 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> --
>>>>>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/NullPointerException-when-reading-Avro-Sequence-files-tp10201p20456.html
>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
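To make the avro-mapred advice above concrete, the dependency would look roughly like this in a POM. The version is only an example; the key part is the hadoop2 classifier, which selects the build of avro-mapred compiled against the Hadoop 2 (new-API) interfaces and so avoids the TaskAttemptContext class-versus-interface mismatch in the trace above:

    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro-mapred</artifactId>
      <version>1.7.7</version>
      <classifier>hadoop2</classifier>
    </dependency>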