Hello, I am trying to figure out why my Kafka + Spark Streaming job is running slowly. I found that Spark is pulling all the messages from Kafka into a single batch, while every later batch receives 0 events and stays queued.
2016/03/05 21:57:05 <http://ttsv-lab-vmdb-02.englab.juniper.net:8088/proxy/application_1457242523248_0003/streaming/batch?id=1457243825000>  0 events  -  -  queued
2016/03/05 21:57:00 <http://ttsv-lab-vmdb-02.englab.juniper.net:8088/proxy/application_1457242523248_0003/streaming/batch?id=1457243820000>  0 events  -  -  queued
2016/03/05 21:56:55 <http://ttsv-lab-vmdb-02.englab.juniper.net:8088/proxy/application_1457242523248_0003/streaming/batch?id=1457243815000>  0 events  -  -  queued
2016/03/05 21:56:50 <http://ttsv-lab-vmdb-02.englab.juniper.net:8088/proxy/application_1457242523248_0003/streaming/batch?id=1457243810000>  0 events  -  -  queued
2016/03/05 21:56:45 <http://ttsv-lab-vmdb-02.englab.juniper.net:8088/proxy/application_1457242523248_0003/streaming/batch?id=1457243805000>  0 events  -  -  queued
2016/03/05 21:56:40 <http://ttsv-lab-vmdb-02.englab.juniper.net:8088/proxy/application_1457242523248_0003/streaming/batch?id=1457243800000>  4039573 events  6 ms  -  processing

Does anyone know how to change this behavior so that the messages are spread evenly across the batches?

Thanks,
Vinti
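
For reference, here is a rough sketch of how a per-partition rate cap would bound the size of each batch. It assumes the job uses the direct Kafka stream, where `spark.streaming.kafka.maxRatePerPartition` applies (a receiver-based stream would use `spark.streaming.receiver.maxRate` instead). The 5-second batch interval is read off the batch ids in the listing, which are 5000 ms apart. The rate of 10000 and the partition count of 4 are hypothetical values chosen only for illustration, not taken from the job.

```python
import math

BATCH_INTERVAL_S = 5      # batch ids in the listing are 5000 ms apart
BACKLOG_EVENTS = 4039573  # size of the single large batch in the listing


def max_events_per_batch(rate_per_partition, num_partitions,
                         batch_interval_s=BATCH_INTERVAL_S):
    """Upper bound on events per batch when a per-partition rate cap is set."""
    return rate_per_partition * num_partitions * batch_interval_s


def batches_to_drain(backlog, rate_per_partition, num_partitions,
                     batch_interval_s=BATCH_INTERVAL_S):
    """Batches needed to work through a backlog, ignoring newly arriving messages."""
    cap = max_events_per_batch(rate_per_partition, num_partitions, batch_interval_s)
    return math.ceil(backlog / cap)


# Hypothetical settings: the numbers are assumptions, only the keys are real
# Spark configuration properties.
spark_conf = {
    "spark.streaming.kafka.maxRatePerPartition": "10000",  # messages/sec/partition
    "spark.streaming.backpressure.enabled": "true",        # available since Spark 1.5
}

HYPOTHETICAL_PARTITIONS = 4
rate = int(spark_conf["spark.streaming.kafka.maxRatePerPartition"])

cap = max_events_per_batch(rate, HYPOTHETICAL_PARTITIONS)
needed = batches_to_drain(BACKLOG_EVENTS, rate, HYPOTHETICAL_PARTITIONS)
print(f"at most {cap} events per batch; about {needed} batches to drain the backlog")
```

With such a cap in place, the backlog would be split over many smaller batches instead of landing in the first one. The values in `spark_conf` would be passed to the job through `SparkConf` or `--conf` on `spark-submit`, and the right rate depends on how fast the job can actually process a batch.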