Re: spark streaming, batchinterval,windowinterval and window sliding interval difference

2015-02-27 Thread Jeffrey Jedele
If you read the streaming programming guide, you'll notice that Spark does
not do real streaming but emulates it with a so-called mini-batching
approach. Let's say you want to work with a continuous stream of incoming
events from a computing centre:

Batch interval:
That's the basic heartbeat of your streaming application. If you set this
to 1 second, Spark will create an RDD every second containing the events
received during that second. That's your mini-batch of data.
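
To make the batching idea concrete, here is a minimal plain-Python sketch
(not Spark code) that groups timestamped events into one batch per interval,
roughly the way Spark turns a stream into one RDD per batch interval. The
function name to_batches and the sample events are made up for illustration.

```python
from collections import defaultdict


def to_batches(events, batch_interval):
    """Group (timestamp_seconds, payload) events into consecutive batches.

    Batch k holds the events with
    k * batch_interval <= timestamp < (k + 1) * batch_interval.
    Empty batches are kept, so the list index is the batch number.
    """
    if not events:
        return []
    buckets = defaultdict(list)
    for ts, payload in events:
        buckets[int(ts // batch_interval)].append(payload)
    last = max(buckets)
    # .get() avoids creating entries for intervals without events
    return [buckets.get(k, []) for k in range(last + 1)]


# Hypothetical events: (arrival time in seconds, log level)
events = [(0.2, "INFO"), (0.7, "WARN"), (1.1, "WARN"), (3.5, "ERROR")]
batches = to_batches(events, batch_interval=1)
print(batches)  # [['INFO', 'WARN'], ['WARN'], [], ['ERROR']]
```

Note that the third second produced no events, so its batch is empty; the
heartbeat keeps ticking whether or not data arrives.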

Windowing:
That's a way to do aggregations on your streaming data. Let's say you want
to have a summary of how many warnings your system produced in the last
hour. Then you would use a windowed reduce with a window size of 1h.

Sliding:
This tells Spark how often to perform your windowed operation. If you set
this to 1h as well, you aggregate your data stream into consecutive 1h
windows of data with no overlap. You could also tell Spark to create your
1h aggregation only twice a day by setting the sliding interval to 12h; the
data between those windows is then not aggregated. Or you could tell Spark
to create a 1h aggregation every 30 min, in which case each data window
overlaps the previous one. Both the window size and the sliding interval
must be multiples of the batch interval.
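
The interplay of window size and sliding interval can be sketched in plain
Python (again not Spark code). The helper windowed_counts and the sample
numbers are assumptions for illustration; window and slide are expressed in
numbers of batches, and the sketch assumes early windows simply cover
whatever batches exist so far. In Spark itself the corresponding DStream
operations are window(windowLength, slideInterval) and
reduceByKeyAndWindow.

```python
def windowed_counts(batch_counts, window, slide):
    """Sum per-batch counts over sliding windows.

    batch_counts: one count per batch (e.g. warnings seen in that batch).
    window, slide: sizes measured in batches, both >= 1.
    One result is emitted every `slide` batches, covering the last
    `window` batches (fewer at the start of the stream).
    """
    results = []
    for end in range(slide, len(batch_counts) + 1, slide):
        start = max(0, end - window)
        results.append(sum(batch_counts[start:end]))
    return results


# Hypothetical number of warnings in each of eight batches
warnings_per_batch = [1, 0, 2, 3, 1, 0, 4, 2]

# slide == window: consecutive windows, no overlap
print(windowed_counts(warnings_per_batch, window=2, slide=2))  # [1, 5, 1, 6]

# slide < window: each window overlaps the previous one
print(windowed_counts(warnings_per_batch, window=4, slide=2))  # [1, 6, 6, 7]

# slide > window: batches between two windows are skipped
print(windowed_counts(warnings_per_batch, window=2, slide=4))  # [5, 6]
```

With a batch interval of 30 min, window=2 and slide=2 would correspond to
the non-overlapping 1h case above, and window=2 with slide=1 to the 1h
aggregation every 30 min.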

I recommend reading the programming guide carefully; it explains these
concepts pretty well.
https://spark.apache.org/docs/latest/streaming-programming-guide.html

Regards,
Jeff

2015-02-26 18:51 GMT+01:00 Hafiz Mujadid hafizmujadi...@gmail.com:

 Can somebody explain the difference between batch interval, window
 interval, and window sliding interval, with an example?
 Is there any real-time use case for these parameters?


 Thanks



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/spark-streaming-batchinterval-windowinterval-and-window-sliding-interval-difference-tp21829.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.





spark streaming, batchinterval,windowinterval and window sliding interval difference

2015-02-26 Thread Hafiz Mujadid
Can somebody explain the difference between batch interval, window
interval, and window sliding interval, with an example?
Is there any real-time use case for these parameters?


Thanks


