Thanks Ted,

The issue is that I'm using packages (see the spark-submit command quoted below) and I do not know how to add com.yammer.metrics:metrics-core to my classpath so that Spark can see it.
Should metrics-core not be part of the org.apache.spark:spark-streaming-kafka_2.10:1.3.1 package so it can work correctly?

If not, any clues as to how I can add metrics-core to my project (bearing in mind that I'm using Python, not a JVM language) would be much appreciated.

Thanks, and apologies for my newbness with Java/Scala.

On Mon, May 11, 2015 at 1:42 PM Ted Yu <yuzhih...@gmail.com> wrote:

> com.yammer.metrics.core.Gauge is in metrics-core jar
> e.g., in master branch:
> [INFO] |  \- org.apache.kafka:kafka_2.10:jar:0.8.1.1:compile
> [INFO] |     +- com.yammer.metrics:metrics-core:jar:2.2.0:compile
>
> Please make sure metrics-core jar is on the classpath.
>
> On Mon, May 11, 2015 at 1:32 PM, Lee McFadden <splee...@gmail.com> wrote:
>
>> Hi,
>>
>> We've been having some issues getting spark streaming running correctly
>> using a Kafka stream, and we've been going around in circles trying to
>> resolve this dependency.
>>
>> Details of our environment and the error below, if anyone can help
>> resolve this it would be much appreciated.
>>
>> Submit command line:
>>
>> /home/ubuntu/spark/spark-1.3.1/bin/spark-submit \
>>     --packages TargetHolding/pyspark-cassandra:0.1.4,org.apache.spark:spark-streaming-kafka_2.10:1.3.1 \
>>     --conf spark.cassandra.connection.host=10.10.103.172,10.10.102.160,10.10.101.79 \
>>     --master spark://127.0.0.1:7077 \
>>     affected_hosts.py
>>
>> When we run the streaming job everything starts just fine, then we see
>> the following in the logs:
>>
>> 15/05/11 19:50:46 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 70, ip-10-10-102-53.us-west-2.compute.internal): java.lang.NoClassDefFoundError: com/yammer/metrics/core/Gauge
>>     at kafka.consumer.ZookeeperConsumerConnector.createFetcher(ZookeeperConsumerConnector.scala:151)
>>     at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:115)
>>     at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:128)
>>     at kafka.consumer.Consumer$.create(ConsumerConnector.scala:89)
>>     at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
>>     at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
>>     at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
>>     at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:298)
>>     at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:290)
>>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)
>>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)
>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:64)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.ClassNotFoundException: com.yammer.metrics.core.Gauge
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>     ... 17 more
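
One way the advice quoted above (getting metrics-core onto the classpath) might be tried from a Python project is to list the jar's Maven coordinate explicitly in --packages, which takes a comma-separated list. The sketch below only builds and prints that command line. The coordinate com.yammer.metrics:metrics-core:2.2.0 is taken from the dependency listing quoted in this thread; whether Spark 1.3.1 then makes the jar visible to the executors is an assumption that has not been verified here, and build_submit_args is a hypothetical helper name, not part of Spark.

```python
# Sketch only: rebuilds the spark-submit command from this thread with
# metrics-core added to --packages. Not confirmed to fix the error.
import shlex

# Path as given in the thread.
SPARK_SUBMIT = "/home/ubuntu/spark/spark-1.3.1/bin/spark-submit"

# The two coordinates from the original command, plus metrics-core 2.2.0 as
# shown in the quoted dependency listing (assumption: listing it explicitly
# is enough for Spark to pick it up).
PACKAGES = (
    "TargetHolding/pyspark-cassandra:0.1.4",
    "org.apache.spark:spark-streaming-kafka_2.10:1.3.1",
    "com.yammer.metrics:metrics-core:2.2.0",
)


def build_submit_args(script, packages=PACKAGES,
                      master="spark://127.0.0.1:7077", conf=None):
    """Return the spark-submit argument list (hypothetical helper)."""
    args = [SPARK_SUBMIT, "--packages", ",".join(packages)]
    # Emit one --conf key=value pair per entry, in a stable order.
    for key, value in sorted((conf or {}).items()):
        args += ["--conf", "%s=%s" % (key, value)]
    args += ["--master", master, script]
    return args


if __name__ == "__main__":
    argv = build_submit_args(
        "affected_hosts.py",
        conf={"spark.cassandra.connection.host":
              "10.10.103.172,10.10.102.160,10.10.101.79"},
    )
    # Print a shell-safe command line; it could be passed to subprocess instead.
    print(" ".join(shlex.quote(a) for a in argv))
```

If the explicit coordinate does not help, the same list could be handed to subprocess.call to experiment with other options without retyping the command.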