Hi Cody, I wanted to share my updated build.sbt, which now works with Kafka without any errors; it may help other users if they face a similar issue.
name := "NetworkStreaming"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming-kafka" % "1.6.0", // kafka
  "org.apache.spark" %% "spark-mllib" % "1.6.0",
  "org.codehaus.groovy" % "groovy-all" % "1.8.6",
  "org.apache.hbase" % "hbase-server" % "1.1.2",
  "org.apache.spark" %% "spark-sql" % "1.6.0",
  "org.apache.hbase" % "hbase-common" % "1.1.2" excludeAll(
    ExclusionRule(organization = "javax.servlet", name = "javax.servlet-api"),
    ExclusionRule(organization = "org.mortbay.jetty", name = "jetty"),
    ExclusionRule(organization = "org.mortbay.jetty", name = "servlet-api-2.5")),
  "org.apache.hbase" % "hbase-client" % "1.1.2" excludeAll(
    ExclusionRule(organization = "javax.servlet", name = "javax.servlet-api"),
    ExclusionRule(organization = "org.mortbay.jetty", name = "jetty"),
    ExclusionRule(organization = "org.mortbay.jetty", name = "servlet-api-2.5"))
)

assemblyMergeStrategy in assembly := {
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
  case "log4j.properties" => MergeStrategy.discard
  case m if m.toLowerCase.startsWith("meta-inf/services/") => MergeStrategy.filterDistinctLines
  case "reference.conf" => MergeStrategy.concat
  case _ => MergeStrategy.first
}

Thanks & Regards,
Vinti

On Wed, Feb 24, 2016 at 1:34 PM, Cody Koeninger <c...@koeninger.org> wrote:

> Looks like conflicting versions of the same dependency.
> If you look at the mergeStrategy section of the build file I posted, you
> can add additional lines for whatever dependencies are causing issues, e.g.
>
> case PathList("org", "jboss", "netty", _*) => MergeStrategy.first
>
> On Wed, Feb 24, 2016 at 2:55 PM, Vinti Maheshwari <vinti.u...@gmail.com>
> wrote:
>
>> Thanks much Cody, I added assembly.sbt and modified build.sbt with the
>> ivy-bug-related content.
>>
>> It's giving lots of errors related to ivy:
>>
>> *[error]
>> /Users/vintim/.ivy2/cache/javax.activation/activation/jars/activation-1.1.jar:javax/activation/ActivationDataFlavor.class*
>>
>> Here is the complete error log:
>> https://gist.github.com/Vibhuti/07c24d2893fa6e520d4c
>>
>> Regards,
>> ~Vinti
>>
>> On Wed, Feb 24, 2016 at 12:16 PM, Cody Koeninger <c...@koeninger.org>
>> wrote:
>>
>>> Ok, that build file I linked earlier has a minimal example of use. Just
>>> running 'sbt assembly' given a similar build file should build a jar with
>>> all the dependencies.
>>>
>>> On Wed, Feb 24, 2016 at 1:50 PM, Vinti Maheshwari <vinti.u...@gmail.com>
>>> wrote:
>>>
>>>> I am not using sbt assembly currently. I need to check how to use sbt
>>>> assembly.
>>>>
>>>> Regards,
>>>> ~Vinti
>>>>
>>>> On Wed, Feb 24, 2016 at 11:10 AM, Cody Koeninger <c...@koeninger.org>
>>>> wrote:
>>>>
>>>>> Are you using sbt assembly? That's what will include all of the
>>>>> non-provided dependencies in a single jar along with your code. Otherwise
>>>>> you'd have to specify each separate jar in your spark-submit line, which
>>>>> is a pain.
>>>>>
>>>>> On Wed, Feb 24, 2016 at 12:49 PM, Vinti Maheshwari <
>>>>> vinti.u...@gmail.com> wrote:
>>>>>
>>>>>> Hi Cody,
>>>>>>
>>>>>> I tried with the build file you provided, but it's not working for
>>>>>> me; I'm getting the same error:
>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>> org/apache/spark/streaming/kafka/KafkaUtils$
>>>>>>
>>>>>> I am not getting this error while building (sbt package). I am
>>>>>> getting this error when I am running my spark-streaming program.
>>>>>> Do I need to specify the kafka jar path manually with the
>>>>>> spark-submit --jars flag?
>>>>>>
>>>>>> My build.sbt:
>>>>>>
>>>>>> name := "NetworkStreaming"
>>>>>>
>>>>>> libraryDependencies += "org.apache.hbase" % "hbase" % "0.92.1"
>>>>>>
>>>>>> libraryDependencies += "org.apache.hadoop" % "hadoop-core" % "1.0.2"
>>>>>>
>>>>>> libraryDependencies += "org.apache.spark" % "spark-mllib_2.10" % "1.0.0"
>>>>>>
>>>>>> libraryDependencies ++= Seq(
>>>>>>   "org.apache.spark" % "spark-streaming_2.10" % "1.5.2",
>>>>>>   "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.5.2"
>>>>>> )
>>>>>>
>>>>>> Regards,
>>>>>> ~Vinti
>>>>>>
>>>>>> On Wed, Feb 24, 2016 at 9:33 AM, Cody Koeninger <c...@koeninger.org>
>>>>>> wrote:
>>>>>>
>>>>>>> spark streaming is provided, kafka is not.
>>>>>>>
>>>>>>> This build file
>>>>>>>
>>>>>>> https://github.com/koeninger/kafka-exactly-once/blob/master/build.sbt
>>>>>>>
>>>>>>> includes some hacks for ivy issues that may no longer be strictly
>>>>>>> necessary, but try that build and see if it works for you.
>>>>>>>
>>>>>>> On Wed, Feb 24, 2016 at 11:14 AM, Vinti Maheshwari <
>>>>>>> vinti.u...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I have tried multiple different settings in build.sbt, but it seems
>>>>>>>> like nothing is working.
>>>>>>>> Can anyone suggest the right syntax/way to include kafka with spark?
>>>>>>>>
>>>>>>>> Error:
>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>> org/apache/spark/streaming/kafka/KafkaUtils$
>>>>>>>>
>>>>>>>> build.sbt:
>>>>>>>> libraryDependencies += "org.apache.hbase" % "hbase" % "0.92.1"
>>>>>>>> libraryDependencies += "org.apache.hadoop" % "hadoop-core" % "1.0.2"
>>>>>>>> libraryDependencies += "org.apache.spark" % "spark-mllib_2.10" % "1.0.0"
>>>>>>>> libraryDependencies ++= Seq(
>>>>>>>>   "org.apache.spark" % "spark-streaming_2.10" % "1.5.2",
>>>>>>>>   "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.5.2",
>>>>>>>>   "org.apache.spark" %% "spark-streaming" % "1.5.2" % "provided",
>>>>>>>>   "org.apache.spark" %% "spark-streaming-kafka" % "1.5.2" % "provided"
>>>>>>>> )
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vinti
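One piece the thread mentions ("I added assembly.sbt") but never shows is how sbt-assembly gets enabled in the first place. A minimal sketch of that plugin file, assuming sbt 0.13.x; the plugin version shown is an assumption, so check the sbt-assembly releases for the one matching your sbt:

```scala
// project/assembly.sbt — registers the sbt-assembly plugin for this build.
// The version number here is an assumption for sbt 0.13.x-era builds;
// pick the release that matches your sbt version.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
```

With that in place, `sbt assembly` builds a single fat jar (under `target/scala-2.10/` for a 2.10 build) containing your code plus all non-provided dependencies, which is the jar you then hand to `spark-submit` instead of listing each dependency with `--jars`.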