With help from the Accumulo folks, I think I know why: I'm using the binary distro of Spark, so Base64 is loaded from spark-assembly.jar, which probably bundles an older version of commons-codec.
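Before rebuilding, the diagnosis can be confirmed from inside a job (or a spark-shell) on the cluster. A minimal sketch, assuming you just want to see which jar actually supplies `Base64` and whether it has the method Accumulo calls; `CodecCheck`, `locationOf`, and `hasMethod` are hypothetical helper names, not Spark or Accumulo API:

```scala
// Hypothetical diagnostic: report where a class was loaded from, and
// whether it exposes a given public method. Run on the cluster to see
// which commons-codec version actually won on the classpath.
object CodecCheck {
  /** Jar or directory a class was loaded from; None for bootstrap or missing classes. */
  def locationOf(className: String): Option[String] =
    try {
      val cs = Class.forName(className).getProtectionDomain.getCodeSource
      Option(cs).map(_.getLocation.toString)
    } catch {
      case _: ClassNotFoundException => None
    }

  /** True if the named class exposes a public method with this name. */
  def hasMethod(className: String, method: String): Boolean =
    try Class.forName(className).getMethods.exists(_.getName == method)
    catch { case _: ClassNotFoundException => false }

  def main(args: Array[String]): Unit = {
    val cls = "org.apache.commons.codec.binary.Base64"
    println(locationOf(cls))                      // which jar provides Base64
    println(hasMethod(cls, "encodeBase64String")) // false => codec is too old
  }
}
```

If `hasMethod` prints `false`, the `Base64` on the classpath predates the `encodeBase64String(byte[])` signature in the stack trace, which would match the NoSuchMethodError.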
I'll need to rebuild Spark from source.

Jianshi

On Mon, Jun 16, 2014 at 9:18 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Hi
>
> Check your driver program's Environment page (e.g.
> http://192.168.1.39:4040/environment/). If you don't see
> commons-codec-1.7.jar there, then that's the issue.
>
> Thanks
> Best Regards
>
> On Mon, Jun 16, 2014 at 5:07 PM, Jianshi Huang <jianshi.hu...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm trying to use Accumulo with Spark by writing to AccumuloOutputFormat.
>> It all went well on my laptop (Accumulo MockInstance + Spark local mode).
>>
>> But when I submit the job to the YARN cluster, the YARN logs show the
>> following error:
>>
>> 14/06/16 02:01:44 INFO cluster.YarnClientClusterScheduler: YarnClientClusterScheduler.postStartHook done
>> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String([B)Ljava/lang/String;
>>     at org.apache.accumulo.core.client.mapreduce.lib.impl.ConfiguratorBase.setConnectorInfo(ConfiguratorBase.java:127)
>>     at org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat.setConnectorInfo(AccumuloOutputFormat.java:92)
>>     at com.paypal.rtgraph.demo.MapReduceWriter$.main(MapReduceWriter.scala:44)
>>     at com.paypal.rtgraph.demo.MapReduceWriter.main(MapReduceWriter.scala)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> It looks like Accumulo's dependency tree has a conflict.
>>
>> Does anyone know what's wrong with my code or settings?
>> I've added all the needed jars to Spark's classpath, and I confirmed that
>> commons-codec-1.7.jar has been uploaded to HDFS:
>>
>> 14/06/16 04:36:02 INFO yarn.Client: Uploading file:/x/home/jianshuang/tmp/lib/commons-codec-1.7.jar to hdfs://manny-lvs/user/jianshuang/.sparkStaging/application_1401752249873_12662/commons-codec-1.7.jar
>>
>> And here's my spark-submit command (all the needed JARs are concatenated
>> after --jars):
>>
>> ~/spark/spark-1.0.0-bin-hadoop2/bin/spark-submit --name 'rtgraph' \
>>   --class com.paypal.rtgraph.demo.Tables --master yarn --deploy-mode cluster \
>>   --jars `find lib -type f | tr '\n' ','` \
>>   --driver-memory 4G --driver-cores 4 \
>>   --executor-memory 20G --executor-cores 8 --num-executors 2 \
>>   rtgraph.jar
>>
>> I've tried both cluster mode and client mode; neither worked.
>>
>> BTW, I tried to use sbt-assembly to create a bundled jar, but I always
>> get the following error:
>>
>> [error] (*:assembly) deduplicate: different file contents found in the following:
>> [error] /Users/jianshuang/.ivy2/cache/org.eclipse.jetty.orbit/javax.transaction/orbits/javax.transaction-1.1.1.v201105210645.jar:META-INF/ECLIPSEF.RSA
>> [error] /Users/jianshuang/.ivy2/cache/org.eclipse.jetty.orbit/javax.servlet/orbits/javax.servlet-3.0.0.v201112011016.jar:META-INF/ECLIPSEF.RSA
>> [error] /Users/jianshuang/.ivy2/cache/org.eclipse.jetty.orbit/javax.mail.glassfish/orbits/javax.mail.glassfish-1.4.1.v201005082020.jar:META-INF/ECLIPSEF.RSA
>> [error] /Users/jianshuang/.ivy2/cache/org.eclipse.jetty.orbit/javax.activation/orbits/javax.activation-1.1.0.v201105071233.jar:META-INF/ECLIPSEF.RSA
>>
>> I googled it, and it looks like I need to exclude some JARs. Has anyone
>> done that? Your help is really appreciated.
>>
>> Cheers,
>>
>> --
>> Jianshi Huang
>>
>> LinkedIn: jianshi
>> Twitter: @jshuang
>> Github & Blog: http://huangjs.github.com/
>

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
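On the sbt-assembly deduplicate errors quoted above: those ECLIPSEF.RSA entries are jar signature files, which cannot survive repackaging into a fat jar, so the usual fix is to discard them in the merge strategy rather than exclude whole JARs. A hedged build.sbt sketch against the sbt-assembly 0.x DSL of that era (key and import names vary by plugin version, so this is untested against this exact build):

```scala
// build.sbt fragment (sbt-assembly 0.x-era syntax; adjust for your plugin version).
import sbtassembly.Plugin._
import AssemblyKeys._

assemblySettings

// Drop jar signature files (.RSA/.DSA/.SF); the assembly jar is unsigned anyway.
mergeStrategy in assembly <<= (mergeStrategy in assembly) { old =>
  {
    case PathList("META-INF", xs @ _*)
        if xs.lastOption.exists { f =>
          val n = f.toLowerCase
          n.endsWith(".rsa") || n.endsWith(".dsa") || n.endsWith(".sf")
        } =>
      MergeStrategy.discard
    case x => old(x)
  }
}
```

Discarding only the signature files keeps every class on the classpath, whereas excluding the jetty-orbit JARs outright could remove classes Spark needs at runtime.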