Hi,
I am using Spark Streaming to process messages from Kafka for real-time 
analytics, and I am trying to fine-tune the streaming process. Currently my 
Spark Streaming system reads a batch of messages from a Kafka topic and 
processes each message one at a time. I have set properties in Spark Streaming 
to increase parallelism for the tasks it performs while processing that one 
message.
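To make this concrete, here is roughly what my per-batch handling looks like today, sketched in plain Python (the message fields and function names are placeholders, not my actual code):

```python
import time

def process_message(msg):
    # Placeholder for my per-message workflow; "cost" is a hypothetical
    # field standing in for how long a given message takes to process.
    time.sleep(msg["cost"])
    return msg["id"]

def handle_batch(batch):
    # Current behaviour: messages are handled strictly one after another,
    # so the whole batch waits on each slow message before moving on.
    results = []
    for msg in batch:
        results.append(process_message(msg))
    return results
```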
               The problem is that processing a single message picked up in a 
batch still takes a lot of time, because of the nature of my workflow. What I 
would like to implement is a way for the other messages picked up in that 
batch to be sent for processing in parallel alongside the first one. This 
would reduce the overall processing time, since some messages take long and 
some do not, and the fast ones would no longer wait behind one slow message.
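The effect I am after can be sketched outside Spark with a thread pool (again a simplified stand-in, not Spark code; in Spark itself I understand per-message parallelism within a batch normally comes from the number of RDD partitions, e.g. via repartition or more Kafka partitions, rather than threads):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def process_message(msg):
    # Placeholder for the per-message workflow; "cost" is a hypothetical
    # field standing in for the variable work each message needs.
    time.sleep(msg["cost"])
    return msg["id"]

def handle_batch_parallel(batch, workers=4):
    # All messages in the batch are submitted at once, so fast messages
    # finish without queueing behind a slow one. pool.map preserves the
    # input order of the batch in its results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_message, batch))
```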
               Is this kind of implementation possible with Spark Streaming? 
If not, do I need to use some other tool alongside Spark Streaming to achieve 
this kind of processing? What are my options?
Thanks in advance.

Thanks,
Udbhav Agarwal
