In case you still have issues with duplicate files in the uber jar, here is a reference sbt file that uses the assembly plugin and handles duplicates:
https://github.com/databricks/training/blob/sparkSummit2014/streaming/scala/build.sbt

On Fri, Jul 11, 2014 at 10:06 AM, Bill Jay <bill.jaypeter...@gmail.com> wrote:

> You may try to use this one:
>
> https://github.com/sbt/sbt-assembly
>
> I had an issue of duplicate files in the uber jar file. But I think this
> library will assemble dependencies into a single jar file.
>
> Bill
>
>
> On Fri, Jul 11, 2014 at 1:34 AM, Dilip <dilip_ram...@hotmail.com> wrote:
>
>> A simple
>> sbt assembly
>> is not working. Is there any other way to include particular jars with
>> assembly command?
>>
>> Regards,
>> Dilip
>>
>> On Friday 11 July 2014 12:45 PM, Bill Jay wrote:
>>
>> I have met similar issues. The reason is probably because in Spark
>> assembly, spark-streaming-kafka is not included. Currently, I am using
>> Maven to generate a shaded package with all the dependencies. You may try
>> to use sbt assembly to include the dependencies in your jar file.
>>
>> Bill
>>
>>
>> On Thu, Jul 10, 2014 at 11:48 PM, Dilip <dilip_ram...@hotmail.com> wrote:
>>
>>> Hi Akhil,
>>>
>>> Can you please guide me through this? Because the code I am running
>>> already has this in it:
>>> [java]
>>>
>>> SparkContext sc = new SparkContext();
>>>
>>> sc.addJar("/usr/local/spark/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.1.0-SNAPSHOT.jar");
>>>
>>>
>>> Is there something I am missing?
>>>
>>> Thanks,
>>> Dilip
>>>
>>>
>>> On Friday 11 July 2014 12:02 PM, Akhil Das wrote:
>>>
>>> Easiest fix would be adding the kafka jars to the SparkContext while
>>> creating it.
>>>
>>> Thanks
>>> Best Regards
>>>
>>>
>>> On Fri, Jul 11, 2014 at 4:39 AM, Dilip <dilip_ram...@hotmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to run a program with spark streaming using Kafka on a
>>>> stand alone system. These are my details:
>>>>
>>>> Spark 1.0.0 hadoop2
>>>> Scala 2.10.3
>>>>
>>>> I am trying a simple program using my custom sbt project but this is
>>>> the error I am getting:
>>>>
>>>> Exception in thread "main" java.lang.NoClassDefFoundError: kafka/serializer/StringDecoder
>>>>     at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:55)
>>>>     at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:94)
>>>>     at org.apache.spark.streaming.kafka.KafkaUtils.createStream(KafkaUtils.scala)
>>>>     at SimpleJavaApp.main(SimpleJavaApp.java:40)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
>>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
>>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>> Caused by: java.lang.ClassNotFoundException: kafka.serializer.StringDecoder
>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>     ... 11 more
>>>>
>>>>
>>>> here is my .sbt file:
>>>>
>>>> name := "Simple Project"
>>>>
>>>> version := "1.0"
>>>>
>>>> scalaVersion := "2.10.3"
>>>>
>>>> libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
>>>>
>>>> libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.0.0"
>>>>
>>>> libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.0.0"
>>>>
>>>> libraryDependencies += "org.apache.spark" %% "spark-examples" % "1.0.0"
>>>>
>>>> libraryDependencies += "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.0.0"
>>>>
>>>> libraryDependencies += "org.apache.kafka" %% "kafka" % "0.8.0"
>>>>
>>>> resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
>>>>
>>>> resolvers += "Maven Repository" at "http://central.maven.org/maven2/"
>>>>
>>>>
>>>> sbt package was successful. I also tried sbt "++2.10.3 package" to
>>>> build it for my scala version. Problem remains the same.
>>>> Can anyone help me out here? Ive been stuck on this for quite some time
>>>> now.
>>>>
>>>> Thank You,
>>>> Dilip
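
For anyone following the thread, the duplicate-file handling discussed above usually comes down to a merge strategy in build.sbt. The following is only a minimal sketch, not a copy of the referenced Databricks file. It assumes sbt 0.13.x with sbt-assembly 0.11.x (the plugin version and the choice to mark Spark as "provided" are assumptions, not something stated in the thread), and it relies on spark-streaming-kafka pulling in the Kafka client transitively.

```scala
// build.sbt: illustrative sketch only.
// Assumes project/plugins.sbt contains (version is an assumption):
//   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
import AssemblyKeys._

name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.3"

// Spark core and streaming are already on the classpath when the job is
// launched through spark-submit, so "provided" keeps them out of the uber jar.
// spark-streaming-kafka is not part of the Spark assembly, so it is bundled.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"            % "1.0.0" % "provided",
  "org.apache.spark" %% "spark-streaming"       % "1.0.0" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.0.0"
)

assemblySettings

// Resolve duplicate entries: drop META-INF files (a blunt choice that can
// remove service descriptors some libraries need) and keep the first copy
// of anything else. `old` is the plugin's default strategy, unused here.
mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case _                             => MergeStrategy.first
  }
}
```

With a file like this, `sbt assembly` (rather than `sbt package`) should produce a single jar that contains the Kafka classes and can be passed to spark-submit. Later sbt-assembly releases rename `mergeStrategy` to `assemblyMergeStrategy`, so the key name depends on the plugin version in use.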