It throws an exception for WriteAheadLogUtils after excluding the core and streaming jars:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/util/WriteAheadLogUtils$
        at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:84)
        at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:65)
        at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:103)
        at org.apache.spark.streaming.kafka.KafkaUtils.createStream(KafkaUtils.scala)
        at com.adobe.hadoop.saprk.sample.SampleSparkStreamApp.main(SampleSparkStreamApp.java:25)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

pom.xml is:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>xxxx</groupId>
  <artifactId>SampleSparkStreamApp</artifactId>
  <version>1.0</version>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.2.0</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.4.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <scope>provided</scope>
      <version>1.4.0</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <!-- any other plugins -->
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

And when I pass the streaming jar using the --jars option, it throws the same error: java.lang.NoClassDefFoundError: org/apache/spark/util/ThreadUtils$.

Thanks

On Wed, Jun 24, 2015 at 1:17 AM, Tathagata Das <t...@databricks.com> wrote:
> You must not include spark-core and spark-streaming in the assembly. They
> are already present in the installation, and the presence of multiple
> versions of Spark may throw off the classloaders in weird ways. So build
> the assembly with those dependencies marked as scope=provided.
>
> On Tue, Jun 23, 2015 at 11:56 AM, Shushant Arora <shushantaror...@gmail.com> wrote:
>> Hi,
>>
>> While using Spark Streaming (1.2) with Kafka, I am getting the below
>> error; the receivers are getting killed, but jobs still get scheduled at
>> each stream interval.
>>
>> 15/06/23 18:42:35 WARN TaskSetManager: Lost task 0.1 in stage 18.0 (TID 82, ip(XXXXXX)): java.io.IOException: Failed to connect to ip(XXXXXXXX)
>>         at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
>>         at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
>>         at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
>>         at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>>         at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
>>         at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:744)
>>
>> 15/06/23 18:42:36 ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - java.lang.NoClassDefFoundError: org/apache/spark/util/ThreadUtils$
>>         at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:115)
>>         at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
>>         at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
>>         at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:277)
>>         at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:269)
>>         at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1319)
>>         at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1319)
>>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:744)
>>
>> I created the jar including all dependencies. Which jar is missing here?
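
[Editor's note] The pom above mixes Spark versions: spark-core is pinned at 1.2.0 while spark-streaming and spark-streaming-kafka are 1.4.0, and both missing classes (org.apache.spark.streaming.util.WriteAheadLogUtils and org.apache.spark.util.ThreadUtils) first appeared in Spark 1.4, so the 1.4.0 kafka artifact references classes a 1.2 installation does not ship. A minimal sketch of the dependency section with the versions aligned, assuming the cluster runs Spark 1.2.0 as the original question suggests (adjust spark.version to whatever is actually installed):

```xml
<!-- Hypothetical sketch: keep every Spark artifact on one version property
     that matches the Spark installed on the cluster. spark-core and
     spark-streaming are supplied by the installation (scope=provided, per
     TD's advice), so only spark-streaming-kafka lands in the assembly. -->
<properties>
  <spark.version>1.2.0</spark.version>
</properties>
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.10</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
</dependencies>
```

With all three artifacts on the same version, the jar-with-dependencies assembly bundles only the kafka integration (and its transitive Kafka dependencies), and the classes it references resolve against the cluster's own Spark jars at runtime.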