Mini-batch (near real time): frames are processed with a latency of roughly 500 ms or more.

Real time: frames are processed within about 5-10 ms.

The main difference is processing latency, I think.

Apache Spark Streaming is mini-batch (micro-batch), not true real time.
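
To make the latency difference concrete, here is a rough back-of-the-envelope sketch in Python. It is a toy model, not Spark or Flink code. The function names, the sample arrival times, and the `batch_overhead_ms` parameter (a hypothetical fixed per-batch scheduling cost) are all assumptions for illustration. It also assumes every batch finishes before the next one closes, so it does not model a backlog building up.

```python
def micro_batch_latencies(arrivals_ms, batch_interval_ms, processing_ms,
                          batch_overhead_ms=0):
    """End-to-end latency per event under a simple micro-batch model.

    An event waits until the batch interval it arrived in closes, then the
    whole batch pays a fixed (hypothetical) overhead plus the processing time.
    """
    latencies = []
    for t in arrivals_ms:
        # The batch containing t closes at the next interval boundary.
        batch_close = (t // batch_interval_ms + 1) * batch_interval_ms
        latencies.append(batch_close + batch_overhead_ms + processing_ms - t)
    return latencies


def per_event_latencies(arrivals_ms, processing_ms):
    """Latency per event when each event is processed as soon as it arrives."""
    return [processing_ms for _ in arrivals_ms]


if __name__ == "__main__":
    arrivals = [0, 120, 250, 499, 500, 730]  # made-up arrival times in ms
    batch = micro_batch_latencies(arrivals, batch_interval_ms=500, processing_ms=5)
    stream = per_event_latencies(arrivals, processing_ms=5)
    print("micro-batch latencies (ms):", batch)
    print("per-event latencies (ms): ", stream)
```

In this model an event that arrives just after a batch boundary waits almost a full interval, while one that arrives just before it waits almost nothing. Shrinking the interval to 1 ms only helps if the per-batch overhead is also tiny; with a non-zero `batch_overhead_ms`, that overhead dominates the latency.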

Alonso Isidoro Roman
<https://about.me/alonso.isidoro.roman?promo=email_sig&utm_source=email_sig&utm_medium=email_sig&utm_campaign=external_links>

2016-09-27 11:15 GMT+02:00 kant kodali <kanth...@gmail.com>:

> I understand the difference between fraud detection and fraud prevention
> in general but I am not interested in the semantic war on what these terms
> precisely mean. I am more interested in understanding the difference
> between mini-batch and real-time streaming from a CS perspective.
>
>
>
> On Tue, Sep 27, 2016 12:54 AM, Mich Talebzadeh mich.talebza...@gmail.com
> wrote:
>
>> Replace mini-batch with micro-batching and do a search again. What is
>> your understanding of fraud detection?
>>
>> Spark Streaming can be used for risk calculation and fraud detection
>> (including stopping fraud from going through, for example credit card
>> fraud) effectively "in practice". It can even be used for Complex Event
>> Processing.
>>
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn:
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>> On 27 September 2016 at 08:12, kant kodali <kanth...@gmail.com> wrote:
>>
>> What is the difference between mini-batch and real-time streaming in
>> practice (not theory)? In theory, I understand that mini-batch collects
>> data over a given time frame and processes it together, whereas real-time
>> streaming processes each item as it arrives. My biggest question is: why
>> not run mini-batch with an epsilon time frame (say, one millisecond)? I
>> would like to understand why one would be a more effective solution than
>> the other.
>>
>> I recently came across an example where mini-batch (Apache Spark) is
>> used for fraud detection and real-time streaming (Apache Flink) is used
>> for fraud prevention. Someone also commented that mini-batches would not
>> be an effective solution for fraud prevention, since the goal is to
>> prevent the transaction from completing as it happens. Now I wonder why
>> mini-batch (Spark) wouldn't be effective here. Why is it not effective to
>> run mini-batch with 1 millisecond latency?
>>
>> Batching is a technique used everywhere, including the OS and the kernel
>> TCP/IP stack, where data headed for the disk or network is buffered. So
>> what is the convincing factor for saying one is more effective than the
>> other?
>> Thanks,
>> kant
>>
>>
>>
