Thanks again for all the help, folks.

I can confirm that simply switching to `--packages
org.apache.spark:spark-streaming-kafka-assembly_2.10:1.3.1` makes
everything work as intended.

Honestly, I'm not sure what the difference is between the two packages, or
why one should be used over the other, but the documentation is currently
not intuitive on this point.  If you follow the instructions as written, it
will initially seem broken.  Is there any reason why the docs for Python
users (or, in fact, all users: Java/Scala users will run into this too,
except that they can build their own jar with the dependencies included)
should not be changed to use the assembly package by default?
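
For anyone who wants to compare the two jars, one rough way might be to check
whether each one bundles a given class, much like the `jar tvf ... | grep`
check quoted below. This is only a minimal sketch: `jar_contains` is a helper
name I made up, and the jar paths in the usage comment are placeholders for
wherever the files live locally.

```python
import zipfile


def jar_contains(jar_path, class_entry):
    """Return True if the jar (a zip archive) has an entry with this exact name."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry in jar.namelist()


# Hypothetical usage; adjust the paths to your local copies of the jars:
# jar_contains("spark-streaming-kafka-assembly_2.10-1.3.1.jar",
#              "com/yammer/metrics/core/Gauge.class")
```

Running it against both the assembly jar and the non-assembly jar should show
whether the yammer metrics classes are bundled in each.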

Additionally, after a few Google searches yesterday, combined with your help,
I'm wondering whether the core issue is upstream in Kafka's dependency chain.

On Tue, May 12, 2015 at 8:53 AM Ted Yu <yuzhih...@gmail.com> wrote:

> bq. it is already in the assembly
>
> Yes. Verified:
>
> $ jar tvf ~/Downloads/spark-streaming-kafka-assembly_2.10-1.3.1.jar | grep yammer | grep Gauge
>   1329 Sat Apr 11 04:25:50 PDT 2015 com/yammer/metrics/core/Gauge.class
>
>
> On Tue, May 12, 2015 at 8:05 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> It doesn't depend directly on yammer metrics; Kafka does. It wouldn't
>> be correct to declare that it does; it is already in the assembly
>> anyway.
>>
>> On Tue, May 12, 2015 at 3:50 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> > Currently external/kafka/pom.xml doesn't cite yammer metrics as a
>> > dependency.
>> >
>> > $ ls -l ~/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar
>> > -rw-r--r--  1 tyu  staff  82123 Dec 17  2013 /Users/tyu/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar
>> >
>> > Including the metrics-core jar would not increase the size of the final
>> > release artifact much.
>> >
>> > My two cents.
>>
>
>
