Have you checked the contents of __app__.jar?
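For anyone following along: a quick way to check is to list the jar's entries and filter for the Avro classes. Here's a minimal Java sketch using java.util.jar.JarFile; the default path and the "avro" filter string are just placeholders for whatever you want to look for.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class JarContents {
    // Return the entry names in the jar at jarPath whose name contains needle.
    static List<String> entriesMatching(String jarPath, String needle) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                      .map(e -> e.getName())
                      .filter(n -> n.contains(needle))
                      .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Point this at the __app__.jar path shown in the YARN container log.
        String path = args.length > 0 ? args[0] : "__app__.jar";
        entriesMatching(path, "avro").forEach(System.out::println);
    }
}
```

If the fat jar was assembled correctly, org/apache/avro/Schema.class (and the rest of the Avro classes you bundled) should show up in the listing.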


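For reference, the classpath dump Aniket suggests below can be sketched in plain Java (the same idea works as the first lines of SimpleApp.main in Scala); it assumes the application class loader is a URLClassLoader, which holds on Java 7/8:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class PrintClasspath {
    public static void main(String[] args) {
        ClassLoader cl = PrintClasspath.class.getClassLoader();
        if (cl instanceof URLClassLoader) {
            // Print every jar/directory URL the class loader will search, in order.
            for (URL url : ((URLClassLoader) cl).getURLs()) {
                System.out.println(url);
            }
        } else {
            // On Java 9+ the app class loader is no longer a URLClassLoader;
            // fall back to the java.class.path system property.
            System.out.println(System.getProperty("java.class.path"));
        }
    }
}
```

The ordering matters here: the first jar on the list that contains org.apache.avro.Schema is the one that wins.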
> On Sep 9, 2015, at 3:28 AM, Tom Seddon <mr.tom.sed...@gmail.com> wrote:
> 
> Thanks for your reply Aniket.
> 
> OK, I've done this and I'm still confused. The output from running locally shows:
> 
> file:/home/tom/spark-avro/target/scala-2.10/simpleapp.jar
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/conf/
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar
> file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunjce_provider.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/zipfs.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/localedata.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/dnsns.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunec.jar
> file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunpkcs11.jar
> saving text file...
> done!
> 
> In yarn-client mode:
> 
> file:/home/hadoop/simpleapp.jar
> file:/usr/lib/hadoop/hadoop-auth-2.6.0-amzn-0.jar
> ...
> file:/usr/lib/hadoop-mapreduce/avro-1.7.4.jar
> ...
> 
> And in yarn-cluster mode:
> file:/mnt/yarn/usercache/hadoop/appcache/application_1441787021820_0004/container_1441787021820_0004_01_000001/__app__.jar
> ...
> file:/usr/lib/hadoop/lib/avro-1.7.4.jar
> ...
> saving text file...
> done!
> 
> In yarn-cluster mode it doesn't appear to have sight of the fat jar 
> (simpleapp.jar), though it can see avro-1.7.4, and yet it runs fine!
> 
> Thanks,
> 
> Tom
> 
> 
>> On Wed, Sep 9, 2015 at 9:49 AM Aniket Bhatnagar <aniket.bhatna...@gmail.com> 
>> wrote:
>> Hi Tom
>> 
>> There has to be a difference between the classpaths in yarn-client and 
>> yarn-cluster mode. Perhaps a good starting point would be to print the 
>> classpath as the first thing in SimpleApp.main. It should give clues as to 
>> why it works in yarn-cluster mode.
>> 
>> Thanks,
>> Aniket
>> 
>> 
>>> On Wed, Sep 9, 2015, 2:11 PM Tom Seddon <mr.tom.sed...@gmail.com> wrote:
>>> Hi,
>>> 
>>> I have a problem trying to get a fairly simple app working which makes use 
>>> of the native Avro libraries. The app runs fine on my local machine and in 
>>> yarn-cluster mode, but when I try to run it on EMR in yarn-client mode I 
>>> get the error below. I'm aware this is a version problem: EMR ships an 
>>> earlier version of Avro, and I am trying to use avro-1.7.7.
>>> 
>>> What's confusing me a great deal is the fact that this runs fine in 
>>> yarn-cluster mode.
>>> 
>>> What is it about yarn-cluster mode that means the application has access to 
>>> the correct version of the Avro library? I need to run in yarn-client mode, 
>>> as I will be caching data on the driver machine between batches; in 
>>> yarn-cluster mode the driver can run on any machine in the cluster, so that 
>>> would not work.
>>> 
>>> Grateful for any advice, as I'm really stuck on this. AWS support are 
>>> trying, but they don't seem to know why this is happening either!
>>> 
>>> Just to note, I'm aware of the Databricks spark-avro project and have used 
>>> it. This is an investigation to see if I can use RDDs instead of DataFrames.
>>> 
>>> java.lang.NoSuchMethodError: org.apache.avro.Schema$Parser.parse(Ljava/lang/String;[Ljava/lang/String;)Lorg/apache/avro/Schema;
>>>     at ophan.thrift.event.Event.<clinit>(Event.java:10)
>>>     at SimpleApp$.main(SimpleApp.scala:25)
>>>     at SimpleApp.main(SimpleApp.scala)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
>>>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
>>>     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>> 
>>> Thanks,
>>> 
>>> Tom
