Have you checked the contents of __app__.jar?
> On Sep 9, 2015, at 3:28 AM, Tom Seddon <mr.tom.sed...@gmail.com> wrote:
>
> Thanks for your reply, Aniket.
>
> OK, I've done this and I'm still confused. Output from running locally shows:
>
> file:/home/tom/spark-avro/target/scala-2.10/simpleapp.jar
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/conf/
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunjce_provider.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/zipfs.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/localedata.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/dnsns.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunec.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunpkcs11.jar
> saving text file...
> done!
>
> In yarn-client mode:
>
> file:/home/hadoop/simpleapp.jar
> file:/usr/lib/hadoop/hadoop-auth-2.6.0-amzn-0.jar
> ...
> file:/usr/lib/hadoop-mapreduce/avro-1.7.4.jar
> ...
>
> And in yarn-cluster mode:
>
> file:/mnt/yarn/usercache/hadoop/appcache/application_1441787021820_0004/container_1441787021820_0004_01_000001/__app__.jar
> ...
> file:/usr/lib/hadoop/lib/avro-1.7.4.jar
> ...
> saving text file...
> done!
>
> In yarn-cluster mode it doesn't appear to have sight of the fat jar
> (simpleapp), and it can see avro-1.7.4, yet it runs fine!
>
> Thanks,
>
> Tom
>
>> On Wed, Sep 9, 2015 at 9:49 AM Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:
>>
>> Hi Tom,
>>
>> There has to be a difference in classpaths between yarn-client and
>> yarn-cluster mode. Perhaps a good starting point would be to print the
>> classpath as the first thing in SimpleApp.main. It should give clues as to
>> why it works in yarn-cluster mode.
>>
>> Thanks,
>> Aniket
>>
>>> On Wed, Sep 9, 2015, 2:11 PM Tom Seddon <mr.tom.sed...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I have a problem trying to get a fairly simple app working which makes use
>>> of native Avro libraries. The app runs fine on my local machine and in
>>> yarn-cluster mode, but when I try to run it on EMR in yarn-client mode I
>>> get the error below. I'm aware this is a version problem, as EMR runs an
>>> earlier version of Avro and I am trying to use avro-1.7.7.
>>>
>>> What's confusing me a great deal is that this runs fine in yarn-cluster
>>> mode.
>>>
>>> What is it about yarn-cluster mode that means the application has access
>>> to the correct version of the Avro library? I need to run in yarn-client
>>> mode, as I will be caching data on the driver machine between batches. I
>>> think in yarn-cluster mode the driver can run on any machine in the
>>> cluster, so this would not work.
>>>
>>> I'd be grateful for any advice, as I'm really stuck on this. AWS support
>>> are trying, but they don't seem to know why this is happening either!
>>>
>>> Just to note, I'm aware of the Databricks spark-avro project and have used
>>> it. This is an investigation to see if I can use RDDs instead of
>>> DataFrames.
>>>
>>> java.lang.NoSuchMethodError: org.apache.avro.Schema$Parser.parse(Ljava/lang/String;[Ljava/lang/String;)Lorg/apache/avro/Schema;
>>>         at ophan.thrift.event.Event.<clinit>(Event.java:10)
>>>         at SimpleApp$.main(SimpleApp.scala:25)
>>>         at SimpleApp.main(SimpleApp.scala)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
>>>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
>>>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
>>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>> Thanks,
>>>
>>> Tom
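[Editor's note: Aniket's suggestion in the thread — print the classpath as the first thing in SimpleApp.main — can be sketched as below. This is a minimal sketch, not the code from the thread; the object name is hypothetical, and it assumes a Java 7/8 JVM (as in the Spark 1.4 setup above), where the application's context classloader is a URLClassLoader.]

```scala
import java.net.URLClassLoader

// Hypothetical helper: list the jars visible to the current classloader.
// On Java 7/8 the context classloader is a URLClassLoader whose URLs can be
// inspected directly; otherwise fall back to the java.class.path property.
object PrintClasspath {
  def classpathEntries: Seq[String] =
    Thread.currentThread.getContextClassLoader match {
      case u: URLClassLoader => u.getURLs.map(_.toString).toSeq
      case _ =>
        sys.props("java.class.path")
          .split(java.io.File.pathSeparator).toSeq
    }

  def main(args: Array[String]): Unit =
    // Prints one entry per line, like the file:/... listings in the thread.
    classpathEntries.foreach(println)
}
```

Calling `PrintClasspath.classpathEntries.foreach(println)` at the top of the real SimpleApp.main would produce listings like the ones quoted above for local, yarn-client, and yarn-cluster modes.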
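[Editor's note: since the NoSuchMethodError above comes down to which Avro jar wins on the classpath, a complementary check is to ask the JVM which jar a class was actually loaded from. The helper below is hypothetical (not from the thread); the `org.apache.avro.Schema$Parser` name is taken from the stack trace.]

```scala
// Hypothetical helper: report the jar (code source) a class was loaded from,
// useful for conflicts like avro-1.7.4 (EMR) vs avro-1.7.7 (fat jar) above.
object WhereLoaded {
  def jarOf(className: String): Option[String] =
    try {
      val cls = Class.forName(className)
      // getCodeSource is null for bootstrap classes, hence the Option(...)
      Option(cls.getProtectionDomain.getCodeSource)
        .map(_.getLocation.toString)
    } catch {
      case _: ClassNotFoundException => None
    }

  def main(args: Array[String]): Unit =
    // The class name comes from the NoSuchMethodError in the thread.
    println(jarOf("org.apache.avro.Schema$Parser")
      .getOrElse("Schema$Parser not on classpath"))
}
```

Run inside the app in each deploy mode, this would show whether `Schema$Parser` is resolved from the fat jar's avro-1.7.7 or from the cluster's avro-1.7.4, which is exactly the difference between the working and failing runs.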