Not quite sure, but moving the Guava 11 jar to the first position in the classpath may solve this issue.
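
For example, with spark-shell, something like the following may get the older Guava picked up ahead of the Guava 14 bundled in the Spark assembly (the jar path is only an illustration; point it at wherever your Hadoop distribution keeps guava-11.0.2.jar):

    # Put Hadoop/Hive's Guava 11 on the driver classpath so it can be
    # resolved ahead of the Guava 14 inside the Spark assembly
    # (jar location is illustrative)
    ./bin/spark-shell --driver-class-path /usr/lib/hadoop/lib/guava-11.0.2.jar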
Thanks
Best Regards

On Tue, Nov 4, 2014 at 1:47 AM, Pala M Muthaia <mchett...@rocketfuelinc.com> wrote:

> Thanks Akhil.
>
> I realized that earlier, and I thought mvn -Phive should have captured and
> included all these dependencies.
>
> In any case, I proceeded with that, included the other such dependencies
> that were missing, and finally hit the Guava version mismatch issue (Spark
> with Guava 14 vs. Hadoop/Hive with Guava 11). There are two parts:
>
> 1. Spark includes the Guava library within its jars, and that may conflict
> with Hadoop/Hive components that depend on an older version of the library.
>
> It seems this has been solved with the SPARK-2848
> <https://issues.apache.org/jira/browse/SPARK-2848> patch to shade the
> Guava libraries.
>
> 2. Spark actually uses interfaces from the newer version of the Guava
> library, which need to be rewritten to use the older version (i.e.
> downgrade Spark's dependency on Guava).
>
> I wasn't able to find the related patches (I need them since I am on Spark
> 1.0.1). After applying the patch for #1 above, I still hit the following
> error:
>
> 14/11/03 15:01:32 WARN storage.BlockManager: Putting block broadcast_0 failed
> java.lang.NoSuchMethodError:
> com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
>     at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
>     at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
>     at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
>     .... <stack continues>
>
> I haven't been able to find the other patches that actually downgrade the
> dependency.
>
> Please point me to those patches, or to any other ideas about fixing these
> dependency issues.
>
> Thanks.
>
> On Sun, Nov 2, 2014 at 8:41 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
>> Adding the libthrift jar
>> <http://mvnrepository.com/artifact/org.apache.thrift/libthrift/0.9.0>
>> to the classpath would resolve this issue.
>>
>> Thanks
>> Best Regards
>>
>> On Sat, Nov 1, 2014 at 12:34 AM, Pala M Muthaia <mchett...@rocketfuelinc.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to load Hive datasets using HiveContext in spark-shell, with
>>> Spark version 1.0.1 and Hive version 0.12.
>>>
>>> We are trying to get Spark working with Hive datasets. I already have an
>>> existing Spark deployment. The following is what I did on top of that:
>>> 1. Built Spark using 'mvn -Pyarn,hive -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package'
>>> 2. Copied spark-assembly-1.0.1-hadoop2.4.0.jar into the Spark deployment
>>> directory.
>>> 3. Launched spark-shell with the Spark Hive jar included in the list.
>>>
>>> When I execute
>>>
>>> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>>
>>> I get the following error stack:
>>>
>>> java.lang.NoClassDefFoundError: org/apache/thrift/TBase
>>>     at java.lang.ClassLoader.defineClass1(Native Method)
>>>     at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
>>>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>     ....
>>>     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>> Caused by: java.lang.ClassNotFoundException: org.apache.thrift.TBase
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>     ... 55 more
>>>
>>> I thought that building with the -Phive option should include all the
>>> necessary Hive packages in the assembly jar (according to here
>>> <https://spark.apache.org/docs/1.0.1/sql-programming-guide.html#hive-tables>).
>>> I tried searching online and in this mailing list archive, but haven't
>>> found any instructions on how to get this working.
>>>
>>> I know that there is an additional step of updating the assembly jar
>>> across the whole cluster, not just on the client side, but right now even
>>> the client is not working.
>>>
>>> I would appreciate instructions (or a link to them) on how to get this
>>> working end-to-end.
>>>
>>> Thanks,
>>> pala
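
For anyone following this thread later: the shading approach mentioned above (SPARK-2848) amounts to relocating Guava's packages at build time so that Spark's Guava 14 classes no longer collide with Hadoop/Hive's Guava 11 on the classpath. A minimal maven-shade-plugin sketch of that idea (the plugin version and the shaded package name here are assumptions for illustration, not copied from the Spark build):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.2</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals><goal>shade</goal></goals>
          <configuration>
            <relocations>
              <relocation>
                <!-- Rewrite references to com.google.common into a private
                     package so they cannot collide with the Guava 11 on the
                     Hadoop/Hive classpath (shaded name is illustrative) -->
                <pattern>com.google.common</pattern>
                <shadedPattern>org.spark-project.guava</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>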