Hi all,

I have a problem using Spark Streaming to accept input data and keep a
result updated.

The input data come from Kafka, and the output is a map that is updated
with the historical data and reported every minute. My current method is to
set the batch size to 1 minute and use foreachRDD to update this map,
outputting the map at the end of the foreachRDD function. However, the
current issue is that the processing cannot finish within one minute.
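For reference, here is roughly what my current job looks like (a minimal
sketch, not my real code; the topic name, key type, and per-record counting
are placeholders for my actual aggregation logic):

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010._

    import scala.collection.mutable

    object MapUpdateJob {
      def main(args: Array[String]): Unit = {
        // 1-minute batches, as described above
        val ssc = new StreamingContext(
          new SparkConf().setAppName("MapUpdateJob"), Seconds(60))

        // Placeholder Kafka settings
        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "localhost:9092",
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "map-update-job")

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams))

        // Driver-side result map, merged once per one-minute batch
        val resultMap = mutable.Map[String, Long]().withDefaultValue(0L)

        stream.map(r => (r.key, 1L)).foreachRDD { rdd =>
          // Aggregate the batch on the cluster, then merge the (small)
          // result into the map on the driver
          rdd.reduceByKey(_ + _).collect().foreach { case (k, v) =>
            resultMap(k) += v
          }
          println(resultMap)  // report the updated map at the end of each batch
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }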

I am thinking of updating the map whenever new data arrive instead of
doing the update only when the whole RDD comes, as in the untested sketch
below. Does anyone have an idea on how to achieve this with a better
running time? Thanks!
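For example, I wonder whether updateStateByKey would let Spark maintain the
map as distributed state instead of my driver-side merge. A rough, untested
sketch of what I have in mind, continuing from the code above:

    // Stateful operations require a checkpoint directory (path is a placeholder)
    ssc.checkpoint("/tmp/map-update-checkpoint")

    // Merge each key's new values for this batch into its running total
    val updateFunc = (newValues: Seq[Long], state: Option[Long]) =>
      Some(newValues.sum + state.getOrElse(0L))

    // The running map lives in Spark as a DStream of (key, total) pairs
    val runningMap = stream.map(r => (r.key, 1L)).updateStateByKey(updateFunc)

    // Report the current state at the end of each batch
    runningMap.foreachRDD(rdd => println(rdd.collect().toMap))

I am not sure this actually updates per record rather than per batch, or
whether it would be faster for my workload, which is why I am asking.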

Bill
