Hi Cody, I wanted to share my updated build.sbt, which now works with Kafka without any errors; it may help other users if they face a similar issue.
name := "NetworkStreaming"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming-kafka" % "1.6.0", // kafka
  "org.apache.spark" %% "spark-mllib" % "1.6.0",
  "org.codehaus.groovy" % "groovy-all" % "1.8.6",
  "org.apache.hbase" % "hbase-server" % "1.1.2",
  "org.apache.spark" %% "spark-sql" % "1.6.0",
  "org.apache.hbase" % "hbase-common" % "1.1.2" excludeAll(
    ExclusionRule(organization = "javax.servlet", name = "javax.servlet-api"),
    ExclusionRule(organization = "org.mortbay.jetty", name = "jetty"),
    ExclusionRule(organization = "org.mortbay.jetty", name = "servlet-api-2.5")),
  "org.apache.hbase" % "hbase-client" % "1.1.2" excludeAll(
    ExclusionRule(organization = "javax.servlet", name = "javax.servlet-api"),
    ExclusionRule(organization = "org.mortbay.jetty", name = "jetty"),
    ExclusionRule(organization = "org.mortbay.jetty", name = "servlet-api-2.5"))
)

assemblyMergeStrategy in assembly := {
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
  case "log4j.properties" => MergeStrategy.discard
  case m if m.toLowerCase.startsWith("meta-inf/services/") => MergeStrategy.filterDistinctLines
  case "reference.conf" => MergeStrategy.concat
  case _ => MergeStrategy.first
}

Thanks & Regards,
Vinti

On Wed, Feb 24, 2016 at 1:34 PM, Cody Koeninger <c...@koeninger.org> wrote:

> Looks like conflicting versions of the same dependency.
> If you look at the mergeStrategy section of the build file I posted, you
> can add additional lines for whatever dependencies are causing issues, e.g.
>
> case PathList("org", "jboss", "netty", _*) => MergeStrategy.first
>
> On Wed, Feb 24, 2016 at 2:55 PM, Vinti Maheshwari <vinti.u...@gmail.com>
> wrote:
>
>> Thanks much Cody, I added assembly.sbt and modified build.sbt with the
>> ivy-bug-related content.
>>
>> It's giving lots of errors related to ivy:
>>
>> *[error]
>> /Users/vintim/.ivy2/cache/javax.activation/activation/jars/activation-1.1.jar:javax/activation/ActivationDataFlavor.class*
>>
>> Here is the complete error log:
>> https://gist.github.com/Vibhuti/07c24d2893fa6e520d4c
>>
>> Regards,
>> ~Vinti
>>
>> On Wed, Feb 24, 2016 at 12:16 PM, Cody Koeninger <c...@koeninger.org>
>> wrote:
>>
>>> Ok, that build file I linked earlier has a minimal example of use. Just
>>> running 'sbt assembly' given a similar build file should build a jar with
>>> all the dependencies.
>>>
>>> On Wed, Feb 24, 2016 at 1:50 PM, Vinti Maheshwari <vinti.u...@gmail.com>
>>> wrote:
>>>
>>>> I am not using sbt assembly currently. I need to check how to use sbt
>>>> assembly.
>>>>
>>>> Regards,
>>>> ~Vinti
>>>>
>>>> On Wed, Feb 24, 2016 at 11:10 AM, Cody Koeninger <c...@koeninger.org>
>>>> wrote:
>>>>
>>>>> Are you using sbt assembly? That's what will include all of the
>>>>> non-provided dependencies in a single jar along with your code. Otherwise
>>>>> you'd have to specify each separate jar in your spark-submit line, which
>>>>> is a pain.
>>>>>
>>>>> On Wed, Feb 24, 2016 at 12:49 PM, Vinti Maheshwari <
>>>>> vinti.u...@gmail.com> wrote:
>>>>>
>>>>>> Hi Cody,
>>>>>>
>>>>>> I tried with the build file you provided, but it's not working for
>>>>>> me; I'm getting the same error:
>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>> org/apache/spark/streaming/kafka/KafkaUtils$
>>>>>>
>>>>>> I am not getting this error while building (sbt package). I am
>>>>>> getting this error when I am running my spark-streaming program.
>>>>>> Do I need to specify the kafka jar path manually with the
>>>>>> spark-submit --jars flag?
>>>>>>
>>>>>> My build.sbt:
>>>>>>
>>>>>> name := "NetworkStreaming"
>>>>>>
>>>>>> libraryDependencies += "org.apache.hbase" % "hbase" % "0.92.1"
>>>>>>
>>>>>> libraryDependencies += "org.apache.hadoop" % "hadoop-core" % "1.0.2"
>>>>>>
>>>>>> libraryDependencies += "org.apache.spark" % "spark-mllib_2.10" % "1.0.0"
>>>>>>
>>>>>> libraryDependencies ++= Seq(
>>>>>>   "org.apache.spark" % "spark-streaming_2.10" % "1.5.2",
>>>>>>   "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.5.2"
>>>>>> )
>>>>>>
>>>>>> Regards,
>>>>>> ~Vinti
>>>>>>
>>>>>> On Wed, Feb 24, 2016 at 9:33 AM, Cody Koeninger <c...@koeninger.org>
>>>>>> wrote:
>>>>>>
>>>>>>> spark streaming is provided, kafka is not.
>>>>>>>
>>>>>>> This build file
>>>>>>>
>>>>>>> https://github.com/koeninger/kafka-exactly-once/blob/master/build.sbt
>>>>>>>
>>>>>>> includes some hacks for ivy issues that may no longer be strictly
>>>>>>> necessary, but try that build and see if it works for you.
>>>>>>>
>>>>>>> On Wed, Feb 24, 2016 at 11:14 AM, Vinti Maheshwari <
>>>>>>> vinti.u...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I have tried multiple different settings in build.sbt, but it seems
>>>>>>>> like nothing is working.
>>>>>>>> Can anyone suggest the right syntax/way to include kafka with spark?
>>>>>>>>
>>>>>>>> Error:
>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>> org/apache/spark/streaming/kafka/KafkaUtils$
>>>>>>>>
>>>>>>>> build.sbt:
>>>>>>>> libraryDependencies += "org.apache.hbase" % "hbase" % "0.92.1"
>>>>>>>> libraryDependencies += "org.apache.hadoop" % "hadoop-core" % "1.0.2"
>>>>>>>> libraryDependencies += "org.apache.spark" % "spark-mllib_2.10" % "1.0.0"
>>>>>>>> libraryDependencies ++= Seq(
>>>>>>>>   "org.apache.spark" % "spark-streaming_2.10" % "1.5.2",
>>>>>>>>   "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.5.2",
>>>>>>>>   "org.apache.spark" %% "spark-streaming" % "1.5.2" % "provided",
>>>>>>>>   "org.apache.spark" %% "spark-streaming-kafka" % "1.5.2" % "provided"
>>>>>>>> )
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vinti
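One piece the thread mentions ("I added assembly.sbt") but never shows is how sbt-assembly gets enabled in the first place. A minimal sketch of that plugin file, assuming sbt 0.13.x; the plugin version shown is an assumption, so check the sbt-assembly releases for the one matching your sbt:

```scala
// project/assembly.sbt — registers the sbt-assembly plugin for this build.
// The version number here is an assumption for sbt 0.13.x-era builds;
// pick the release that matches your sbt version.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
```

With that in place, `sbt assembly` builds a single fat jar (under `target/scala-2.10/` for a 2.10 build) containing your code plus all non-provided dependencies, which is the jar you then hand to `spark-submit` instead of listing each dependency with `--jars`.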