Re: Difference among batchDuration, windowDuration, slideDuration

2015-03-18 Thread jaredtims
I think hsy541 is still confused by the same thing that is still confusing me. Namely, what is the value that the sentence "Each RDD in a DStream contains data from a certain interval" is speaking of? This is from the Discretized Streams

Re: Difference among batchDuration, windowDuration, slideDuration

2014-07-17 Thread hsy...@gmail.com
Thanks Tathagata, so can I say the RDD size (from the stream) is the window size, and the overlap between 2 adjacent RDDs is the sliding size? But I still don't understand what the batch size is. Why do we need it, since data processing is RDD by RDD, right? And does Spark chop the data into RDDs at the very

Re: Difference among batchDuration, windowDuration, slideDuration

2014-07-16 Thread aaronjosephs
The only other thing to keep in mind is that window duration and slide duration have to be multiples of batch duration. IDK if you made that fully clear.
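
To make the multiples point concrete, here is a minimal plain-Python sketch. It is not Spark code and does not call any Spark API; the function names and the batch numbering are made up for illustration. It assumes each batch covers one batch duration of data, and that a window is emitted once per slide duration and looks back over one window duration of batches. Early windows are simply shorter in this sketch because not enough batches exist yet.

```python
# Plain-Python sketch (no Spark needed) of how batch, window, and slide
# durations relate. All names are hypothetical; durations are in seconds.

def check_multiple(duration, batch_duration, name):
    """Reject durations that are not positive whole multiples of the batch duration."""
    if duration <= 0 or duration % batch_duration != 0:
        raise ValueError(
            f"{name}={duration} must be a positive multiple of "
            f"batch_duration={batch_duration}"
        )


def windowed_batches(num_batches, batch_duration, window_duration, slide_duration):
    """Return (emit_time, [batch indices]) for each window over batches 0..num_batches-1.

    Batch i is assumed to cover [i * batch_duration, (i + 1) * batch_duration).
    """
    check_multiple(window_duration, batch_duration, "window_duration")
    check_multiple(slide_duration, batch_duration, "slide_duration")
    per_window = window_duration // batch_duration  # batches in one full window
    per_slide = slide_duration // batch_duration    # batches between two windows
    windows = []
    # One window is emitted every `per_slide` batches and looks back
    # over at most `per_window` batches.
    for end in range(per_slide, num_batches + 1, per_slide):
        start = max(0, end - per_window)
        windows.append((end * batch_duration, list(range(start, end))))
    return windows


if __name__ == "__main__":
    # 2 s batches, 6 s window, 4 s slide
    for emit_time, batches in windowed_batches(
        6, batch_duration=2, window_duration=6, slide_duration=4
    ):
        print(emit_time, batches)
```

With 2 s batches, a 6 s window, and a 4 s slide, every full window is built from 3 whole batches and a new window appears every 2 batches, so adjacent full windows share window minus slide = 2 s of data (one batch). A window of 5 s would not line up with batch boundaries, which is why the sketch rejects it.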