OK Andrew, thanks. I sent the results of the test with 8 workers, and the gap has grown.
On May 4, 2014, at 2:31, Andrew Ash <and...@andrewash.com> wrote:

>>> From the logs, I see that the print() starts printing stuff 10 seconds
>>> after the context is started. And that 10 seconds is taken by the initial
>>> empty job (50 map + 20 reduce tasks) that Spark Streaming starts to ensure
>>> all the executors have started. Somehow the first empty task takes 7-8
>>> seconds to complete. See if this can be reproduced by running a simple,
>>> empty job in the Spark shell (in the same cluster) and see if the first task
>>> takes 7-8 seconds.
>>>
>>> Either way, I didn't see the 30 second gap, but a 10 second gap. And that
>>> does not seem to be a persistent problem, as after that 10 seconds the data
>>> is being received and processed.
>>>
>>> TD
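
For reference, the check TD suggests could look roughly like the sketch below, pasted into spark-shell on the same cluster. It is only an illustration: the 50/20 partition counts are chosen to mirror the "50 map + 20 reduce tasks" mentioned above, the helper `timeMs` and the job itself are made up for this test, and it is not claimed to be the exact job Spark Streaming runs internally.

```scala
// Paste into spark-shell, where `sc` is the SparkContext the shell provides.
// Hypothetical helper: runs `body` once and returns the elapsed wall-clock time in ms.
def timeMs[A](body: => A): Long = {
  val start = System.nanoTime()
  body
  (System.nanoTime() - start) / 1000000L
}

// A near-empty job with 50 map tasks and 20 reduce tasks (assumed shape, see note above).
def dummyJob(): Long =
  sc.parallelize(1 to 50, 50)
    .map(x => (x % 20, 1))
    .reduceByKey(_ + _, 20)
    .count()

// The first run includes any executor start-up cost; the second run should not.
val firstRunMs  = timeMs { dummyJob() }
val secondRunMs = timeMs { dummyJob() }

println(s"first run: $firstRunMs ms, second run: $secondRunMs ms")
```

If the first run takes several seconds and the second takes only a fraction of that, the delay is probably a one-time start-up cost rather than something specific to the streaming job.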