Hi Tobias,

Thanks for the suggestion. I tried increasing the number of nodes from 300 to
400, but the running time did not improve.


On Wed, Jul 2, 2014 at 6:47 PM, Tobias Pfeiffer <t...@preferred.jp> wrote:

> Bill,
>
> can't you just add more nodes in order to speed up the processing?
>
> Tobias
>
>
> On Thu, Jul 3, 2014 at 7:09 AM, Bill Jay <bill.jaypeter...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I have a problem using Spark Streaming to accept input data and update
>> a result.
>>
>> The input data comes from Kafka, and the output is a map that is updated
>> with historical data every minute. My current method is to set the batch
>> size to 1 minute and use foreachRDD to update this map, then output the
>> map at the end of the foreachRDD function. However, the current issue is
>> that the processing cannot be finished within one minute.
>>
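>> A minimal sketch of this pattern in Scala (the ZooKeeper host, consumer
>> group, topic name, and the driver-side map below are hypothetical, just to
>> illustrate the approach described above) might look like:
>>
>> import org.apache.spark.SparkConf
>> import org.apache.spark.SparkContext._
>> import org.apache.spark.streaming.{Minutes, StreamingContext}
>> import org.apache.spark.streaming.kafka.KafkaUtils
>> import scala.collection.mutable
>>
>> object MapUpdateSketch {
>>   def main(args: Array[String]): Unit = {
>>     val conf = new SparkConf().setAppName("MapUpdateSketch")
>>     // 1-minute batch interval, as described above
>>     val ssc = new StreamingContext(conf, Minutes(1))
>>
>>     // Hypothetical Kafka receiver; replace host, group, and topic
>>     val kafkaStream = KafkaUtils.createStream(
>>       ssc, "zkhost:2181", "my-consumer-group", Map("my-topic" -> 1))
>>
>>     // Driver-side map holding the historical result (illustrative only)
>>     val resultMap = mutable.Map[String, Long]()
>>
>>     kafkaStream.foreachRDD { rdd =>
>>       // Aggregate the batch on the cluster, then collect to the driver
>>       val counts = rdd.map { case (_, value) => (value, 1L) }
>>                       .reduceByKey(_ + _)
>>                       .collect()
>>       // Update the map with this batch and report it at the end of foreachRDD
>>       counts.foreach { case (k, v) =>
>>         resultMap(k) = resultMap.getOrElse(k, 0L) + v
>>       }
>>       println(resultMap)
>>     }
>>
>>     ssc.start()
>>     ssc.awaitTermination()
>>   }
>> }
>>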
>> I am thinking of updating the map whenever new data arrives instead of
>> doing the update when the whole RDD arrives. Are there any ideas on how to
>> achieve this with a better running time? Thanks!
>>
>> Bill
>>
>
>
