Mini-batch, or near real time: processing frames within 500 ms or more.
Real time: processing frames within 5-10 ms.
The main difference is processing velocity, I think. Apache Spark
Streaming is mini-batch, not true real time.

Alonso Isidoro Roman
https://about.me/alonso.isidoro.roman

2016-09-27 11:15 GMT+02:00 kant kodali <kanth...@gmail.com>:

> I understand the difference between fraud detection and fraud
> prevention in general, but I am not interested in the semantic war on
> what these terms precisely mean. I am more interested in understanding
> the difference between mini-batch vs real time streaming from a CS
> perspective.
>
> On Tue, Sep 27, 2016 12:54 AM, Mich Talebzadeh
> mich.talebza...@gmail.com wrote:
>
>> Replace mini-batch with micro-batching and do a search again. What is
>> your understanding of fraud detection?
>>
>> Spark Streaming can be used for risk calculation and fraud detection
>> (including stopping fraud going through, for example credit card
>> fraud) effectively "in practice". It can even be used for Complex
>> Event Processing.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn:
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>> Disclaimer: Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which
>> may arise from relying on this email's technical content is
>> explicitly disclaimed. The author will in no case be liable for any
>> monetary damages arising from such loss, damage or destruction.
>>
>> On 27 September 2016 at 08:12, kant kodali <kanth...@gmail.com> wrote:
>>
>> What is the difference between mini-batch vs real time streaming in
>> practice (not theory)?
>> In theory, I understand mini-batch is something that batches in the
>> given time frame, whereas real time streaming is more like doing
>> something as the data arrives. But my biggest question is: why not
>> have mini-batch with an epsilon time frame (say one millisecond)? Or,
>> put differently, I would like to understand the reason why one would
>> be a more effective solution than the other.
>>
>> I recently came across one example where mini-batch (Apache Spark) is
>> used for fraud detection and real time streaming (Apache Flink) is
>> used for fraud prevention. Someone also commented saying mini-batches
>> would not be an effective solution for fraud prevention (since the
>> goal is to prevent the transaction from occurring as it happens). Now
>> I wonder why this wouldn't be so effective with mini-batch (Spark).
>> Why is it not effective to run mini-batch with 1 millisecond latency?
>> Batching is a technique used everywhere, including the OS and the
>> kernel TCP/IP stack, where the data to the disk or network are indeed
>> buffered, so what is the convincing factor here to say one is more
>> effective than the other?
>>
>> Thanks,
>> kant
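
To make the "why not a 1 ms mini-batch?" question concrete, here is a toy latency model in Python. It is only a sketch, not how Spark or Flink are actually implemented: the function names, the workload, and both cost figures (`OVERHEAD_MS`, `PER_EVENT_MS`) are made-up assumptions for illustration. It assumes each batch pays a fixed scheduling cost and that a batch can only start once its window has closed and the previous batch has finished.

```python
def micro_batch_latencies(arrivals_ms, interval_ms, overhead_ms, per_event_ms):
    """Per-event latency when events are grouped into fixed-interval batches.

    Batch k covers [k*interval, (k+1)*interval). It starts when its window
    closes or when the previous batch finishes, whichever is later.
    """
    batches = {}
    for t in arrivals_ms:
        batches.setdefault(int(t // interval_ms), []).append(t)

    latencies = []
    free_at = 0.0  # time at which the (single) worker becomes idle
    for k in sorted(batches):
        window_end = (k + 1) * interval_ms
        start = max(window_end, free_at)
        finish = start + overhead_ms + per_event_ms * len(batches[k])
        free_at = finish
        latencies.extend(finish - t for t in batches[k])
    return latencies


def per_event_latencies(arrivals_ms, per_event_ms):
    """Per-event latency when each event is processed as soon as it arrives."""
    latencies = []
    free_at = 0.0
    for t in sorted(arrivals_ms):
        start = max(t, free_at)
        free_at = start + per_event_ms
        latencies.append(free_at - t)
    return latencies


# Hypothetical workload: one event per millisecond for one second.
arrivals = [float(t) for t in range(1000)]
OVERHEAD_MS = 5.0    # assumed fixed cost to schedule one batch (not measured)
PER_EVENT_MS = 0.01  # assumed cost to process one event (not measured)

for interval in (500.0, 1.0):
    lat = micro_batch_latencies(arrivals, interval, OVERHEAD_MS, PER_EVENT_MS)
    print(f"micro-batch {interval:g} ms: "
          f"mean {sum(lat) / len(lat):.1f} ms, max {max(lat):.1f} ms")

lat = per_event_latencies(arrivals, PER_EVENT_MS)
print(f"per-event: mean {sum(lat) / len(lat):.3f} ms, max {max(lat):.3f} ms")
```

Under these assumed numbers the 1 ms interval is shorter than the per-batch overhead, so batches queue up and latency keeps growing. The 500 ms interval keeps up, but every event still has to wait for its window to close. Real engines differ in many ways, so this only illustrates the shape of the trade-off.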