Thanks for your reply, Aniket.

OK, I've done this and I'm still confused.
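For reference, this is roughly what I put at the top of SimpleApp.main to dump
the classpath (just a sketch of the idea; it assumes the context classloader is
a URLClassLoader, which is the case on Java 7):

    // sketch: print every URL visible to the current context classloader
    import java.net.URLClassLoader

    Thread.currentThread().getContextClassLoader match {
      case ucl: URLClassLoader => ucl.getURLs.foreach(println)
      case other               => println("not a URLClassLoader: " + other)
    }
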
Output from running locally shows:

file:/home/tom/spark-avro/target/scala-2.10/simpleapp.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/conf/
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunjce_provider.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/zipfs.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/localedata.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/dnsns.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunec.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunpkcs11.jar
saving text file...
done!

In yarn-client mode:

file:/home/hadoop/simpleapp.jar
file:/usr/lib/hadoop/hadoop-auth-2.6.0-amzn-0.jar
...
*file:/usr/lib/hadoop-mapreduce/avro-1.7.4.jar*
...

And in yarn-cluster mode:
file:/mnt/yarn/usercache/hadoop/appcache/application_1441787021820_0004/container_1441787021820_0004_01_000001/__app__.jar
...
*file:/usr/lib/hadoop/lib/avro-1.7.4.jar*
...
saving text file...
done!

In yarn-cluster mode the classpath doesn't appear to include the fat jar
(simpleapp), but it does include avro-1.7.4, and yet the job runs fine!
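
One extra check I might add (again, only a sketch) is to log which jar the
class behind the NoSuchMethodError is actually served from, e.g.:

    // sketch: report which jar provided Schema.Parser
    // (getCodeSource can return null for classes loaded from the bootstrap classpath)
    val src = classOf[org.apache.avro.Schema.Parser].getProtectionDomain.getCodeSource
    println("Schema.Parser loaded from: " + Option(src).map(_.getLocation).getOrElse("bootstrap/unknown"))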

Thanks,

Tom


On Wed, Sep 9, 2015 at 9:49 AM Aniket Bhatnagar <aniket.bhatna...@gmail.com>
wrote:

> Hi Tom
>
> There has to be a difference in classpaths in yarn-client and yarn-cluster
> mode. Perhaps a good starting point would be to print classpath as a first
> thing in SimpleApp.main. It should give clues around why it works in
> yarn-cluster mode.
>
> Thanks,
> Aniket
>
> On Wed, Sep 9, 2015, 2:11 PM Tom Seddon <mr.tom.sed...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a problem trying to get a fairly simple app working which makes
>> use of native avro libraries.  The app runs fine on my local machine and in
>> yarn-cluster mode, but when I try to run it on EMR in yarn-client mode I get
>> the error below.  I'm aware this is a version problem, as EMR runs an
>> earlier version of avro, and I am trying to use avro-1.7.7.
>>
>> What's confusing me a great deal is the fact that this runs fine in
>> yarn-cluster mode.
>>
>> What is it about yarn-cluster mode that means the application has access
>> to the correct version of the avro library?  I need to run in yarn-client
>> mode as I will be caching data to the driver machine in between batches.  I
>> think in yarn-cluster mode the driver can run on any machine in the cluster
>> so this would not work.
>>
>> Grateful for any advice as I'm really stuck on this.  AWS support are
>> trying but they don't seem to know why this is happening either!
>>
>> Just to note, I'm aware of the Databricks spark-avro project and have used
>> it.  This is an investigation to see if I can use RDDs instead of
>> dataframes.
>>
>> java.lang.NoSuchMethodError: org.apache.avro.Schema$Parser.parse(Ljava/lang/String;[Ljava/lang/String;)Lorg/apache/avro/Schema;
>>   at ophan.thrift.event.Event.<clinit>(Event.java:10)
>>   at SimpleApp$.main(SimpleApp.scala:25)
>>   at SimpleApp.main(SimpleApp.scala)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   at java.lang.reflect.Method.invoke(Method.java:606)
>>   at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
>>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
>>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
>>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
>>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> Thanks,
>>
>> Tom
>>
