If you read the streaming programming guide, you'll notice that Spark does not do "real" streaming but "emulates" it with a so-called mini-batching approach. Let's say you want to work with a continuous stream of incoming events from a computing centre:
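To make the mini-batching idea concrete, here is a minimal plain-Python sketch — not Spark code — that simulates what a streaming job does: events are grouped into per-second mini-batches, and a windowed count is computed over the last few batches at a fixed slide interval. All names, intervals, and sample events are made up for illustration.

```python
from collections import Counter

# Hypothetical sample events as (timestamp_in_seconds, level) pairs.
events = [
    (0, "WARN"), (1, "INFO"), (2, "WARN"), (3, "WARN"),
    (4, "INFO"), (5, "WARN"), (6, "INFO"), (7, "WARN"),
]

BATCH_INTERVAL = 1   # seconds: one "mini-batch" (one RDD) per second
WINDOW_SIZE = 4      # seconds: aggregate over the last 4 seconds of batches
SLIDE_INTERVAL = 2   # seconds: emit a windowed result every 2 seconds

# 1. Mini-batching: bucket events into per-interval batches, analogous to
#    the per-second RDDs a DStream would produce.
batches = {}
for t, level in events:
    batches.setdefault(t // BATCH_INTERVAL, []).append(level)

# 2. Windowed reduce: count WARN messages across the batches that fall
#    inside the window ending at `window_end`.
window_batches = WINDOW_SIZE // BATCH_INTERVAL

def warn_count(window_end):
    msgs = [m for b in range(window_end - window_batches, window_end)
            for m in batches.get(b, [])]
    return Counter(msgs)["WARN"]

# 3. Sliding: evaluate the window every SLIDE_INTERVAL batches. With a
#    4 s window and a 2 s slide, consecutive windows overlap by 2 s.
results = {end: warn_count(end)
           for end in range(window_batches, len(batches) + 1, SLIDE_INTERVAL)}
print(results)  # → {4: 3, 6: 3, 8: 2}
```

Setting `SLIDE_INTERVAL` equal to `WINDOW_SIZE` would instead give back-to-back, non-overlapping windows.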
Batch interval: That's the basic "heartbeat" of your streaming application. If you set this to 1 second, Spark will create an RDD every second containing the events of that second. That's your "mini-batch" of data.

Windowing: That's a way to do aggregations on your streaming data. Say you want a summary of how many warnings your system produced in the last hour; you would then use a windowed reduce with a window size of 1 hour.

Sliding interval: This tells Spark how often to perform your windowed operation. If you set this to 1 hour as well, you would aggregate your data stream into consecutive, non-overlapping 1-hour windows. You could also tell Spark to create your 1-hour aggregation only twice a day by setting the sliding interval to 12 hours, or to create a 1-hour aggregation every 30 minutes - in that case each data window overlaps the previous one, of course.

I recommend carefully reading the programming guide - it explains these concepts pretty well:
https://spark.apache.org/docs/latest/streaming-programming-guide.html

Regards,
Jeff

2015-02-26 18:51 GMT+01:00 Hafiz Mujadid <hafizmujadi...@gmail.com>:

> Can somebody explain the difference between
> batchinterval, windowinterval and window sliding interval with example.
> If there is any real time use case of using these parameters?
>
> Thanks
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/spark-streaming-batchinterval-windowinterval-and-window-sliding-interval-difference-tp21829.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.