In case you still have issues with duplicate files in the uber jar, here is a
reference sbt file that uses the assembly plugin and deals with duplicates:
https://github.com/databricks/training/blob/sparkSummit2014/streaming/scala/build.sbt
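
For readers who cannot open the link, here is a rough sketch of how duplicate files are usually resolved with sbt-assembly. It is not a copy of the file above. The key name follows recent sbt-assembly releases (older releases spell this setting differently), and the particular strategies chosen are assumptions that may need tuning for a given project.

```scala
// build.sbt (sketch): decide what to do when two jars ship the same path.
assembly / assemblyMergeStrategy := {
  // Manifests and signature files from dependencies are not needed in the uber jar.
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  // Typesafe-config defaults from several libraries should be concatenated.
  case "reference.conf"              => MergeStrategy.concat
  // Blunt fallback (assumption): keep the first copy of any other duplicate.
  case _                             => MergeStrategy.first
}
```

Discarding all of META-INF is a common shortcut, but it also drops service-loader files, so it may need to be narrowed if a dependency relies on them.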
On Fri, Jul 11, 2014 at 10:06 AM, Bill Jay
Also, the reason spark-streaming-kafka is not included in the Spark
assembly is that we do not want the dependencies of external systems like Kafka
(which itself probably has a complex dependency tree) to conflict
with core Spark's functionality and stability.
TD
On Sun, Jul 13, 2014
The easiest fix would be to add the Kafka jars to the SparkContext when
creating it.
Thanks
Best Regards
On Fri, Jul 11, 2014 at 4:39 AM, Dilip dilip_ram...@hotmail.com wrote:
Hi,
I am trying to run a program with Spark Streaming using Kafka on a
standalone system. These are my details:
Hi Akhil,
Can you please guide me through this? Because the code I am running
already has this in it:
[java]
SparkContext sc = new SparkContext();
sc.addJar("/usr/local/spark/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.1.0-SNAPSHOT.jar");
Is there something I am
I have run into similar issues. The likely reason is that
spark-streaming-kafka is not included in the Spark assembly. Currently, I am using
Maven to generate a shaded package with all the dependencies. You may try
to use sbt assembly to include the dependencies in your jar file.
Bill
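
A minimal sketch of what the sbt assembly suggestion above could look like in a build file, assuming the versions mentioned in this thread (Spark 1.0.0, Scala 2.10.3). The project name is hypothetical, and the sbt-assembly plugin still has to be enabled in project/plugins.sbt as described in its README.

```scala
// build.sbt (sketch). Versions come from the details in this thread;
// the project name is a placeholder.
name := "kafka-streaming-example"

scalaVersion := "2.10.3"

libraryDependencies ++= Seq(
  // Provided by the Spark installation at runtime, so kept out of the uber jar.
  "org.apache.spark" %% "spark-core"      % "1.0.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "1.0.0" % "provided",
  // Not part of the Spark assembly, so it has to be bundled with the application.
  "org.apache.spark" %% "spark-streaming-kafka" % "1.0.0"
)
```

Marking the core artifacts as provided keeps the assembled jar small and avoids shipping a second copy of Spark alongside the one already installed.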
On Thu, Jul
A simple
sbt assembly
is not working. Is there any other way to include particular jars with
the assembly command?
Regards,
Dilip
On Friday 11 July 2014 12:45 PM, Bill Jay wrote:
I have met similar issues. The reason is probably because in Spark
assembly, spark-streaming-kafka is not
You may try to use this one:
https://github.com/sbt/sbt-assembly
I had an issue with duplicate files in the uber jar, but I think this
plugin will assemble the dependencies into a single jar file.
Bill
On Fri, Jul 11, 2014 at 1:34 AM, Dilip dilip_ram...@hotmail.com wrote:
A simple
sbt
Hi,
I am trying to run a program with Spark Streaming using Kafka on a
standalone system. These are my details:
Spark 1.0.0 hadoop2
Scala 2.10.3
I am trying a simple program using my custom sbt project but this is the
error I am getting:
Exception in thread main