Hi Tobias,

Thanks for the suggestion. I tried increasing the number of nodes from 300 to 400, but the running time did not improve.
On Wed, Jul 2, 2014 at 6:47 PM, Tobias Pfeiffer <t...@preferred.jp> wrote:
> Bill,
>
> can't you just add more nodes in order to speed up the processing?
>
> Tobias
>
>
> On Thu, Jul 3, 2014 at 7:09 AM, Bill Jay <bill.jaypeter...@gmail.com> wrote:
>
>> Hi all,
>>
>> I have a problem using Spark Streaming to accept input data and update a result.
>>
>> The input data comes from Kafka, and the output is a map that is updated with historical data and reported every minute. My current method is to set the batch size to 1 minute and use foreachRDD to update this map, outputting the map at the end of the foreachRDD function. However, the current issue is that the processing cannot be finished within one minute.
>>
>> I am thinking of updating the map whenever new data arrive instead of doing the update when the whole RDD arrives. Is there any idea on how to achieve this with a better running time? Thanks!
>>
>> Bill
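[Editor's note: Spark Streaming's updateStateByKey is one standard way to update state incrementally each batch instead of rebuilding a map inside foreachRDD, as discussed above. Below is a minimal sketch, assuming the goal is a running count per key; the application name, Kafka topic, ZooKeeper address, consumer group, and checkpoint path are all placeholders, not taken from the thread.]

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object IncrementalCounts {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("IncrementalCounts")
        // A smaller batch interval spreads the update work across the minute
        // instead of doing it all in one large pass.
        val ssc = new StreamingContext(conf, Seconds(10))
        ssc.checkpoint("hdfs:///tmp/streaming-checkpoint") // required for stateful operations

        // Hypothetical Kafka parameters; adjust to the actual cluster.
        val stream = KafkaUtils.createStream(
          ssc, "zk-host:2181", "consumer-group", Map("events" -> 4))

        // Assumes each Kafka message value is a key whose occurrences we count.
        val pairs = stream.map { case (_, value) => (value, 1L) }

        // updateStateByKey keeps a running total per key across batches, so the
        // state is updated incrementally as data arrives rather than rebuilt
        // from scratch in foreachRDD.
        val counts = pairs.updateStateByKey[Long] {
          (newValues: Seq[Long], state: Option[Long]) =>
            Some(state.getOrElse(0L) + newValues.sum)
        }

        // Emit the current state each batch; a once-per-minute report could be
        // produced with a windowed operation on top of this stream.
        counts.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }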