It would be good if you could tell me what I should add to the documentation to
make it easier to understand. I can update the docs for the 1.4.0 release.

On Tue, May 12, 2015 at 9:52 AM, Lee McFadden <splee...@gmail.com> wrote:

> Thanks for explaining, Sean and Cody; this makes sense now.  I'd like to
> help improve this documentation so other Python users don't run into the
> same thing, so I'll look into that today.
>
> On Tue, May 12, 2015 at 9:44 AM Cody Koeninger <c...@koeninger.org> wrote:
>
>> One of the packages just contains the streaming-kafka code.  The other
>> contains that code, plus everything it depends on.  That's what "assembly"
>> typically means in JVM land.
>>
>> Java/Scala users are accustomed to using their own build tool to include
>> necessary dependencies.  JVM dependency management is (thankfully)
>> different from Python dependency management.
>>
>> As far as I can tell, there is no core issue, upstream or otherwise.
>>
>> On Tue, May 12, 2015 at 11:39 AM, Lee McFadden <splee...@gmail.com>
>> wrote:
>>
>>> Thanks again for all the help folks.
>>>
>>> I can confirm that simply switching to `--packages
>>> org.apache.spark:spark-streaming-kafka-assembly_2.10:1.3.1` makes
>>> everything work as intended.
>>>
>>> Honestly, I'm not sure what the difference is between the two packages,
>>> or why one should be used over the other, but the documentation is
>>> currently not intuitive on this point.  If you follow the instructions,
>>> it will initially seem broken.  Is there any reason why the docs for Python
>>> users (or, in fact, all users: Java/Scala users will run into this too,
>>> except that they are able to build their own jar with the
>>> dependencies included) should not be changed to use the assembly package
>>> by default?
>>>
>>> Additionally, after a few Google searches yesterday combined with your
>>> help, I'm wondering if the core issue is upstream in Kafka's dependency
>>> chain?
>>>
>>> On Tue, May 12, 2015 at 8:53 AM Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> bq. it is already in the assembly
>>>>
>>>> Yes. Verified:
>>>>
>>>> $ jar tvf ~/Downloads/spark-streaming-kafka-assembly_2.10-1.3.1.jar | grep yammer | grep Gauge
>>>>   1329 Sat Apr 11 04:25:50 PDT 2015 com/yammer/metrics/core/Gauge.class
>>>>
>>>>
>>>> On Tue, May 12, 2015 at 8:05 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>
>>>>> It doesn't depend directly on yammer metrics; Kafka does. It wouldn't
>>>>> be correct to declare that it does; it is already in the assembly
>>>>> anyway.
>>>>>
>>>>> On Tue, May 12, 2015 at 3:50 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>> > Currently external/kafka/pom.xml doesn't cite yammer metrics as a
>>>>> > dependency.
>>>>> >
>>>>> > $ ls -l ~/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar
>>>>> > -rw-r--r--  1 tyu  staff  82123 Dec 17  2013 /Users/tyu/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar
>>>>> >
>>>>> > Including the metrics-core jar would not increase the size of the
>>>>> > final release artifact much.
>>>>> >
>>>>> > My two cents.
>>>>>
>>>>
>>>>
>>
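
For readers skimming the thread, the fix Lee confirmed (passing the assembly artifact to `--packages`) can be sketched as a small Python helper that builds the `spark-submit` argument list. The coordinate and the `--packages` flag come from the thread; the function names and the script name `my_streaming_app.py` are hypothetical, and the default versions simply mirror the ones quoted above.

```python
# Sketch only: function names and the script name are illustrative assumptions.

def kafka_assembly_coordinate(scala_version="2.10", spark_version="1.3.1"):
    """Return the Maven coordinate of the Kafka assembly artifact named in the thread."""
    return "org.apache.spark:spark-streaming-kafka-assembly_{0}:{1}".format(
        scala_version, spark_version
    )


def spark_submit_args(app_path, scala_version="2.10", spark_version="1.3.1"):
    """Build the argument list for spark-submit using the assembly package."""
    coordinate = kafka_assembly_coordinate(scala_version, spark_version)
    return ["spark-submit", "--packages", coordinate, app_path]


if __name__ == "__main__":
    # Print the command line rather than running it, so this works without Spark installed.
    print(" ".join(spark_submit_args("my_streaming_app.py")))
```

The list form can be handed to `subprocess.call` unchanged if Spark is on the `PATH`; with other Scala or Spark versions, the two version arguments would need to match the installed build.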

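Ted's `jar tvf ... | grep yammer | grep Gauge` check can also be approximated in Python, since a jar is a zip archive. This is a minimal sketch, not an exact equivalent: the helper name `find_entries` is hypothetical, and it matches against entry paths only, whereas `grep` matches the whole listing line (size and date included).

```python
import sys
import zipfile


def find_entries(jar_path, *needles):
    """Return the jar entries whose path contains every given substring."""
    with zipfile.ZipFile(jar_path) as jar:
        return [
            name
            for name in jar.namelist()
            if all(needle in name for needle in needles)
        ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_entries.py /path/to/some.jar
    for entry in find_entries(sys.argv[1], "yammer", "Gauge"):
        print(entry)
```

Run against the assembly jar, a non-empty result would indicate that the yammer metrics classes are bundled; an empty result for the non-assembly jar would be consistent with Cody's explanation that it contains only the streaming-kafka code.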