Hi TD,

We are not creating the streaming context with master "local". We have a cluster with 1 master, 8 workers, and 1 word source. The command line we are using is:

bin/run-example org.apache.spark.streaming.examples.JavaNetworkWordCount spark://192.168.0.13:7077

On Apr 30, 2014, at 0:09, Tathagata Das <tathagata.das1...@gmail.com> wrote:

> Is your batch size 30 seconds by any chance?
>
> Assuming not, please check whether you are creating the streaming context
> with master "local[n]" where n > 1. With "local" or "local[1]", the system
> has only one processing slot, which is occupied by the receiver, leaving no
> room for processing the received data. It could be that after 30 seconds the
> server disconnects and the receiver terminates, releasing the single slot so
> that processing can proceed.
>
> TD
>
>
> On Tue, Apr 29, 2014 at 2:28 PM, Eduardo Costa Alfaia
> <e.costaalf...@unibs.it> wrote:
> Hi TD,
>
> In my tests with Spark Streaming, I'm using a modified JavaNetworkWordCount
> and a program I wrote that sends words to the Spark worker over TCP. I
> verified that after Spark starts, it connects to my source, which then
> starts sending, but the first word count is reported approximately 30
> seconds after the context is created. So I'm wondering where the 30 seconds
> of data already sent by the source is stored. Is this normal Spark
> behaviour? I saw the same behaviour with the shipped JavaNetworkWordCount
> application.
>
> Many thanks.
> --
> Privacy notice: http://www.unibs.it/node/8155
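
The processing-slot explanation quoted above can be illustrated without Spark. The sketch below is only an analogy in plain Java, not Spark's actual scheduler: a fixed thread pool stands in for the processing slots, a task that blocks until released stands in for the receiver, and a second task stands in for a batch job. The names SlotDemo and batchRunsWhileReceiverActive are hypothetical, chosen for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SlotDemo {

    // Returns true if a batch job manages to run while the simulated
    // receiver is still holding its slot, false if the job is starved.
    public static boolean batchRunsWhileReceiverActive(int slots) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        CountDownLatch receiverStop = new CountDownLatch(1);
        try {
            // Simulated receiver: occupies one slot until it is told to stop.
            pool.submit(() -> {
                receiverStop.await();
                return null;
            });

            // Simulated batch job: needs a free slot to run at all.
            Future<String> batch = pool.submit(() -> "batch processed");

            try {
                // Wait briefly; with no free slot the job stays queued.
                batch.get(500, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            }
        } finally {
            // Release the receiver so the queued work can drain, then shut down.
            receiverStop.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 slot:  " + batchRunsWhileReceiverActive(1));
        System.out.println("2 slots: " + batchRunsWhileReceiverActive(2));
    }
}
```

With one slot the batch job stays queued behind the receiver and the method returns false; with two slots it returns true. In the single-slot case the queued job only runs once the receiver is released, which mirrors the suggestion above that processing starts when the receiver terminates.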