You must not include spark-core and spark-streaming in the assembly: they are already present in the installation, and having multiple versions of Spark on the classpath may throw off the classloaders in weird ways. So build the assembly with those dependencies marked as scope=provided.
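For example, with an sbt build and the sbt-assembly plugin (a minimal
sketch; the version number below is an assumption, so match it to whatever
your cluster is actually running):

    // build.sbt: mark the Spark artifacts the installation already provides
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"      % "1.2.1" % "provided",
      "org.apache.spark" %% "spark-streaming" % "1.2.1" % "provided",
      // the Kafka connector is NOT part of the installation, so it keeps
      // the default compile scope and gets bundled into the assembly
      "org.apache.spark" %% "spark-streaming-kafka" % "1.2.1"
    )

You can sanity-check the result with something like
"jar tf your-assembly.jar | grep org/apache/spark/SparkContext"; if core
Spark classes show up in the listing, the provided scoping did not take
effect.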
On Tue, Jun 23, 2015 at 11:56 AM, Shushant Arora <[email protected]> wrote:

> hi
>
> While using Spark streaming (1.2) with Kafka, I am getting the below error
> and receivers are getting killed, but jobs get scheduled at each stream
> interval.
>
> 15/06/23 18:42:35 WARN TaskSetManager: Lost task 0.1 in stage 18.0 (TID
> 82, ip(XXXXXX)): java.io.IOException: Failed to connect to ip(XXXXXXXX)
>         at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
>         at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
>         at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
>         at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>         at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
>         at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
>
> 15/06/23 18:42:36 ERROR ReceiverTracker: Deregistered receiver for stream
> 0: Error starting receiver 0 - java.lang.NoClassDefFoundError:
> org/apache/spark/util/ThreadUtils$
>         at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:115)
>         at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
>         at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
>         at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:277)
>         at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:269)
>         at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1319)
>         at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1319)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
>
> I created the jar including all dependencies. Which jar is missing here?
