Hi all, I have a problem using Spark Streaming to accept input data and update a result.
The input comes from Kafka, and the output is a map that is updated with historical data and reported every minute. My current method is to set the batch interval to 1 minute and use foreachRDD to update this map, outputting the map at the end of the foreachRDD function. However, the issue is that the processing cannot finish within one minute. I am thinking of updating the map as new data arrive instead of doing the update only when the whole RDD is available. Any ideas on how to achieve this with better running time?

Thanks,
Bill
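To make the current method concrete, here is a rough sketch of the setup described above, assuming the receiver-based Kafka API; the topic, ZooKeeper quorum, and per-batch aggregation are placeholders, not the actual job:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical sketch: 1-minute batches, a Kafka input stream, and a
// driver-side map updated inside foreachRDD. All names are placeholders.
object MapUpdateSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MapUpdateSketch")
    val ssc = new StreamingContext(conf, Seconds(60)) // 1-minute batch interval

    val lines = KafkaUtils.createStream(
      ssc, "zkQuorum:2181", "consumer-group", Map("topic" -> 1)).map(_._2)

    // Driver-side map holding the accumulated historical result.
    var resultMap = Map.empty[String, Long]

    lines.foreachRDD { rdd =>
      // Aggregate this batch on the cluster, then merge into the map on
      // the driver; this entire step must complete within one minute.
      val batchCounts = rdd.map(key => (key, 1L)).reduceByKey(_ + _).collect()
      batchCounts.foreach { case (k, v) =>
        resultMap += k -> (resultMap.getOrElse(k, 0L) + v)
      }
      println(resultMap) // report the updated map at the end of the batch
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The problem is that when this foreachRDD body takes longer than the 60-second batch interval, batches queue up and the output falls behind.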