Hi,

We've been having some issues getting Spark Streaming running correctly
using a Kafka stream, and we've been going around in circles trying to
resolve what looks like a missing dependency.

Details of our environment and the error are below; if anyone can help us
resolve this, it would be much appreciated.

Submit command line:

/home/ubuntu/spark/spark-1.3.1/bin/spark-submit \
    --packages TargetHolding/pyspark-cassandra:0.1.4,org.apache.spark:spark-streaming-kafka_2.10:1.3.1 \
    --conf spark.cassandra.connection.host=10.10.103.172,10.10.102.160,10.10.101.79 \
    --master spark://127.0.0.1:7077 \
    affected_hosts.py
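
For context, affected_hosts.py consumes from Kafka with the receiver-based
KafkaUtils API, which is the code path that ends up in
kafka.consumer.ZookeeperConsumerConnector in the trace below. A minimal
sketch of that setup follows; the Zookeeper quorum, group id and topic names
here are placeholders, not our real values:

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="affected_hosts")
ssc = StreamingContext(sc, 10)  # 10-second batches

# Receiver-based consumer; this is what instantiates KafkaReceiver /
# ZookeeperConsumerConnector on the executors
stream = KafkaUtils.createStream(
    ssc,
    "zk-host:2181",          # Zookeeper quorum (placeholder)
    "affected-hosts-group",  # consumer group id (placeholder)
    {"events": 1})           # topic -> receiver thread count (placeholder)

stream.pprint()
ssc.start()
ssc.awaitTermination()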

When we run the streaming job, everything starts up just fine, but then we
see the following in the logs:

15/05/11 19:50:46 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 70, ip-10-10-102-53.us-west-2.compute.internal): java.lang.NoClassDefFoundError: com/yammer/metrics/core/Gauge
        at kafka.consumer.ZookeeperConsumerConnector.createFetcher(ZookeeperConsumerConnector.scala:151)
        at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:115)
        at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:128)
        at kafka.consumer.Consumer$.create(ConsumerConnector.scala:89)
        at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
        at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
        at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
        at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:298)
        at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:290)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.yammer.metrics.core.Gauge
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 17 more
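
From what we can tell, com.yammer.metrics.core.Gauge ships in the
metrics-core artifact that Kafka 0.8 depends on
(com.yammer.metrics:metrics-core:2.2.0), so our guess is that this jar
isn't making it onto the executor classpath, but we haven't been able to
confirm that.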
