Oh, are you actually bundling Hadoop in your app? That may be the problem. If you're using standalone mode, why include Hadoop? In any event, Spark and Hadoop are intended to be 'provided' dependencies in the app you send to spark-submit.
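
Concretely, assuming a Maven build (the artifact coordinates below are illustrative; use whichever Spark/Hadoop artifacts you actually depend on), 'provided' scope looks roughly like this:

<!-- Sketch: compile against Spark and Hadoop but don't bundle them in
     your jar; the cluster supplies these at runtime via spark-submit. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.2.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>1.2.1</version>
  <scope>provided</scope>
</dependency>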
On Tue, Jan 6, 2015 at 10:15 AM, Niranda Perera <niranda.per...@gmail.com> wrote:

> Hi Sean,
>
> My mistake: the Guava 11 dependency indeed came from hadoop-commons.
>
> I'm running the following simple app in a Spark 1.2.0 standalone local
> cluster (2 workers) with Hadoop 1.2.1:
>
> import java.util.List;
>
> import org.apache.spark.SparkConf;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.sql.api.java.JavaSQLContext;
> import org.apache.spark.sql.api.java.JavaSchemaRDD;
> import org.apache.spark.sql.api.java.Row;
> import com.databricks.spark.avro.AvroUtils; // from the spark-avro library
>
> public class AvroSparkTest {
>     public static void main(String[] args) throws Exception {
>         SparkConf sparkConf = new SparkConf()
>                 .setMaster("spark://niranda-ThinkPad-T540p:7077") // ("local[2]")
>                 .setAppName("avro-spark-test");
>
>         JavaSparkContext sparkContext = new JavaSparkContext(sparkConf);
>         JavaSQLContext sqlContext = new JavaSQLContext(sparkContext);
>         JavaSchemaRDD episodes = AvroUtils.avroFile(sqlContext,
>                 "/home/niranda/projects/avro-spark-test/src/test/resources/episodes.avro");
>         episodes.printSchema();
>         episodes.registerTempTable("avroTable");
>         List<Row> result = sqlContext.sql("SELECT * FROM avroTable").collect();
>
>         for (Row row : result) {
>             System.out.println(row.toString());
>         }
>     }
> }
>
> As you pointed out, this error occurs when the Hadoop dependency is added;
> the app runs without a problem when the Hadoop dependency is removed and
> the master is set to local[].
>
> Cheers
>
> On Tue, Jan 6, 2015 at 3:23 PM, Sean Owen <so...@cloudera.com> wrote:
>
>> -dev
>>
>> Guava was not downgraded to 11. That PR was not merged. It was part of a
>> discussion about, indeed, what to do about potential Guava version
>> conflicts. Spark uses Guava, but so does Hadoop, and so do user programs.
>>
>> Spark in fact uses 14.0.1:
>> https://github.com/apache/spark/blob/master/pom.xml#L330
>>
>> This is a symptom of a conflict between Spark's Guava 14 and Hadoop's
>> Guava 11. See for example https://issues.apache.org/jira/browse/HIVE-7387
>> as well.
>>
>> Guava is now shaded in Spark as of 1.2.0 (and 1.1.x?), so I would think a
>> lot of these problems are solved. As we've seen, though, this one is
>> tricky.
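>>
>> To illustrate what shading does here: the build rewrites Guava's packages
>> into a private namespace inside the Spark assembly, so Spark's Guava 14
>> cannot collide with whatever Guava version Hadoop brings in. Roughly, as
>> a maven-shade-plugin sketch (the shaded package name below is
>> illustrative):
>>
>> <!-- Relocates com.google.common.* classes inside the shaded jar and
>>      rewrites Spark's own references to point at the relocated copies. -->
>> <relocation>
>>   <pattern>com.google.common</pattern>
>>   <shadedPattern>org.spark-project.guava</shadedPattern>
>> </relocation>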
>>
>> What's your Spark version? And what are you executing? What mode --
>> standalone, YARN? What Hadoop version?
>>
>> On Tue, Jan 6, 2015 at 8:38 AM, Niranda Perera <niranda.per...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have been running a simple Spark app on a local spark cluster and I
>>> came across this error:
>>>
>>> Exception in thread "main" java.lang.NoSuchMethodError:
>>> com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
>>> at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
>>> at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
>>> at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
>>> at org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
>>> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>>> at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:210)
>>> at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:169)
>>> at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:161)
>>> at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:155)
>>> at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
>>> at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
>>> at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
>>> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:249)
>>> at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:136)
>>> at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:114)
>>> at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:787)
>>> at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:638)
>>> at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:992)
>>> at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:98)
>>> at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:84)
>>> at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
>>> at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
>>> at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
>>> at org.apache.spark.SparkContext.broadcast(SparkContext.scala:945)
>>> at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:695)
>>> at com.databricks.spark.avro.AvroRelation.buildScan$lzycompute(AvroRelation.scala:45)
>>> at com.databricks.spark.avro.AvroRelation.buildScan(AvroRelation.scala:44)
>>> at org.apache.spark.sql.sources.DataSourceStrategy$.apply(DataSourceStrategy.scala:56)
>>> at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
>>> at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
>>> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>> at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
>>> at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)
>>> at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)
>>> at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)
>>> at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)
>>> at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:444)
>>> at org.apache.spark.sql.api.java.JavaSchemaRDD.collect(JavaSchemaRDD.scala:114)
>>>
>>> While looking into this, I found that Guava was downgraded to version 11
>>> in this PR: https://github.com/apache/spark/pull/1610
>>>
>>> In this PR, the hashInt call at OpenHashSet.scala:261 was changed to
>>> hashLong. But when I actually run my app, the
>>> "java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt"
>>> error occurs, which is understandable because hashInt is not available
>>> before Guava 12.
>>>
>>> So I'm wondering why this occurs.
>>>
>>> Cheers
>>> --
>>> Niranda Perera
>>>
>>
>
>
> --
> Niranda
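
For anyone hitting this later: the failing call can be reproduced in isolation, without Spark at all. A minimal sketch (the class name is mine, not from this thread): run it with Guava 14.0.1 on the classpath and it prints a hash; run it with Guava 11 and it fails with the same NoSuchMethodError, because HashFunction.hashInt(int) only appeared in Guava 12.

import com.google.common.hash.Hashing;

public class GuavaHashCheck {
    public static void main(String[] args) {
        // HashFunction.hashInt(int) was added in Guava 12. Against Guava 11
        // this line compiles (when built against a newer jar) but throws
        // NoSuchMethodError at runtime -- the same failure mode as Spark's
        // OpenHashSet in the stack trace above.
        System.out.println(Hashing.murmur3_32().hashInt(42));
    }
}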