Re: Is "spark streaming" streaming or mini-batch?

2016-08-24 Thread Mich Talebzadeh
Is "spark streaming" streaming or mini-batch? I look at something like Complex Event Processing (CEP), which is a leading use case for data streaming (and I am experimenting with Spark for it), and in the realm of CEP there is really no such thing as continuous data streaming. The point is that when

Re: Is "spark streaming" streaming or mini-batch?

2016-08-24 Thread Steve Loughran
On 23 Aug 2016, at 17:58, Mich Talebzadeh wrote:
> In general, depending on what you are doing, you can tighten the above parameters. For example, if you are using Spark Streaming for anti-fraud detection, you may stream data in at 2 seconds

Re: Is "spark streaming" streaming or mini-batch?

2016-08-23 Thread Matei Zaharia
I think people explained this pretty well, but in practice, this distinction is also somewhat of a marketing term, because every system will perform some kind of batching. For example, every time you use TCP, the OS and network stack may buffer multiple messages together and send them at once;

Re: Is "spark streaming" streaming or mini-batch?

2016-08-23 Thread Aseem Bansal
Thanks everyone for clarifying.

On Tue, Aug 23, 2016 at 9:11 PM, Aseem Bansal wrote:
> I was reading this article https://www.inovex.de/blog/storm-in-a-teacup/
> and it mentioned that spark streaming actually mini-batch not actual
> streaming.
>
> I have not used streaming

Re: Is "spark streaming" streaming or mini-batch?

2016-08-23 Thread Mich Talebzadeh
Russell is correct here. Micro-batch means it does processing within a window. In general there are three things here.

batch window: This is the basic interval at which the system will receive the data in batches. This is the interval set when creating a StreamingContext. For example, if you set

Re: Is "spark streaming" streaming or mini-batch?

2016-08-23 Thread Russell Spitzer
Spark Streaming does not process one event at a time, which I think is what people generally call "streaming." Instead, it processes groups of events. Each group is a "micro-batch" that gets processed at the same time. Streaming theoretically always has better latency because the event is processed
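
[Illustration] The grouping described in this message can be sketched in plain Python. This is a toy model only, not Spark code: the helper names `micro_batches` and `wait_until_batch_closes`, the millisecond timestamps, and the 2000 ms interval are all assumptions made for the example. It shows how events fall into fixed-length batches and how long each event waits before its batch can be processed.

```python
from collections import defaultdict


def micro_batches(events, batch_interval_ms):
    """Group (timestamp_ms, payload) events into fixed-length batches.

    An event with timestamp t belongs to batch number t // batch_interval_ms.
    Batches are returned in time order; empty intervals are skipped.
    """
    batches = defaultdict(list)
    for timestamp_ms, payload in events:
        batches[timestamp_ms // batch_interval_ms].append(payload)
    return [batches[index] for index in sorted(batches)]


def wait_until_batch_closes(events, batch_interval_ms):
    """Return, per event, the time between its arrival and the end of its batch.

    A per-event system could start work on arrival; a micro-batch system
    cannot start before the batch interval has elapsed.
    """
    waits = []
    for timestamp_ms, _ in events:
        batch_end = (timestamp_ms // batch_interval_ms + 1) * batch_interval_ms
        waits.append(batch_end - timestamp_ms)
    return waits


# Hypothetical events: (arrival time in ms, payload)
events = [(200, "a"), (900, "b"), (2100, "c"), (3700, "d"), (4000, "e")]

print(micro_batches(events, 2000))            # [['a', 'b'], ['c', 'd'], ['e']]
print(wait_until_batch_closes(events, 2000))  # [1800, 1100, 1900, 300, 2000]
```

In this model the added wait is at most one batch interval, which is why shortening the interval reduces the latency gap with per-event processing.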

Re: Is "spark streaming" streaming or mini-batch?

2016-08-23 Thread pandees waran
It's based on a "micro batching" model.

> On Aug 23, 2016, at 8:41 AM, Aseem Bansal wrote:
>
> I was reading this article https://www.inovex.de/blog/storm-in-a-teacup/ and
> it mentioned that spark streaming actually mini-batch not actual streaming.
>