Bill,

Are you saying that, after repartition(400), you have 400 partitions on one
host and the other hosts receive none of the data?
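
In case it helps to check, here is a rough sketch of how one might count
partitions and records per host after the repartition. It is only an
illustration, not a tested job: the ZooKeeper address, group id, topic name,
batch interval and object name are all made-up placeholders, and I am
assuming the receiver-based KafkaUtils.createStream API.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object PartitionCheck {
  def main(args: Array[String]): Unit = {
    // Hypothetical placeholders: substitute your own quorum, group and topic.
    val zkQuorum = "zk-host:2181"
    val groupId  = "my-consumer-group"
    val topics   = Map("my-topic" -> 1) // topic -> consumer threads in the receiver

    val conf = new SparkConf().setAppName("PartitionCheck")
    val ssc  = new StreamingContext(conf, Seconds(10)) // assumed batch interval

    val stream        = KafkaUtils.createStream(ssc, zkQuorum, groupId, topics)
    val repartitioned = stream.repartition(400)

    repartitioned.foreachRDD { rdd =>
      // Runs on the executors: report which host processed each partition
      // and how many records that partition held.
      val perPartition = rdd.mapPartitionsWithIndex { (idx, it) =>
        val host = java.net.InetAddress.getLocalHost.getHostName
        Iterator((host, idx, it.size))
      }.collect()

      // Runs on the driver: summarise partitions and records per host.
      perPartition.groupBy(_._1).foreach { case (host, parts) =>
        val records = parts.map(_._3.toLong).sum
        println(s"$host: ${parts.length} partitions, $records records")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If every line printed for a batch names the same host, the data really is
staying in one place; if several hosts show up with non-zero record counts,
the repartition is spreading the work and the bottleneck is more likely the
single receiver.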

Tobias


On Fri, Jul 18, 2014 at 8:11 AM, Bill Jay <bill.jaypeter...@gmail.com>
wrote:

> I also have an issue consuming from Kafka. When I consume from Kafka,
> there is always a single executor working on this job. Even when I use
> repartition, it seems that there is still a single executor. Does anyone
> have an idea how to add parallelism to this job?
>
>
>
> On Thu, Jul 17, 2014 at 2:06 PM, Chen Song <chen.song...@gmail.com> wrote:
>
>> Thanks Luis and Tobias.
>>
>>
>> On Tue, Jul 1, 2014 at 11:39 PM, Tobias Pfeiffer <t...@preferred.jp>
>> wrote:
>>
>>> Hi,
>>>
>>> On Wed, Jul 2, 2014 at 1:57 AM, Chen Song <chen.song...@gmail.com>
>>> wrote:
>>>>
>>>> * Is there a way to control how far the Kafka DStream can read on a
>>>> topic-partition (via offset, for example)? Setting this to a small
>>>> number would force the DStream to read less data initially.
>>>>
>>>
>>> Please see the post at
>>>
>>> http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201406.mbox/%3ccaph-c_m2ppurjx-n_tehh0bvqe_6la-rvgtrf1k-lwrmme+...@mail.gmail.com%3E
>>> Kafka's auto.offset.reset parameter may be what you are looking for.
>>>
>>> Tobias
>>>
>>>
>>
>>
>> --
>> Chen Song
>>
>>
>
