I understand the difference between fraud detection and fraud prevention in
general, but I am not interested in a semantic war over what these terms
precisely mean. I am more interested in understanding the difference
between mini-batch and real-time streaming from a CS perspective.
 





On Tue, Sep 27, 2016 12:54 AM, Mich Talebzadeh mich.talebza...@gmail.com
wrote:
Replace mini-batch with micro-batching and do a search again. What is your
understanding of fraud detection?
Spark Streaming can be used for risk calculation and fraud detection (including
stopping fraud from going through, for example credit card fraud) effectively
"in practice". It can even be used for Complex Event Processing.

HTH
Dr Mich Talebzadeh







On 27 September 2016 at 08:12, kant kodali <kanth...@gmail.com>  wrote:
What is the difference between mini-batch and real-time streaming in practice
(not theory)? In theory, I understand that mini-batch processes whatever has
been batched within a given time frame, whereas real-time streaming acts on
each piece of data as it arrives. My biggest question is: why not run
mini-batch with an epsilon time frame (say, one millisecond)? Put differently,
I would like to understand why one would be a more effective solution than the
other.

I recently came across one example where mini-batch (Apache Spark) is used for
fraud detection and real-time streaming (Apache Flink) is used for fraud
prevention. Someone also commented that mini-batches would not be an effective
solution for fraud prevention (since the goal is to prevent the transaction
from occurring as it happens). Now I wonder why this wouldn't be as effective
with mini-batch (Spark). Why is it not effective to run mini-batch with 1
millisecond latency? Batching is a technique used everywhere, including the OS
and the kernel TCP/IP stack, where data going to the disk or network is indeed
buffered, so what is the convincing factor for saying one is more effective
than the other?

Thanks,
kant
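
To make the "why not a 1 millisecond batch?" question concrete, here is a toy latency model in Python. It is only a sketch and measures neither Spark nor Flink. It assumes every batch pays a fixed scheduling cost on top of the per-event work. The function names and all the numbers (5 ms batch overhead, 50 microseconds per event, one event every 200 microseconds) are hypothetical, chosen for illustration only.

```python
from typing import Dict, List


def per_event_latencies(arrivals_us: List[int], per_event_cost_us: int) -> List[int]:
    """Latency of each event when a single worker handles events on arrival."""
    latencies = []
    prev_finish = 0
    for t in sorted(arrivals_us):
        start = max(t, prev_finish)          # wait if the worker is still busy
        finish = start + per_event_cost_us
        latencies.append(finish - t)
        prev_finish = finish
    return latencies


def micro_batch_latencies(
    arrivals_us: List[int],
    interval_us: int,
    batch_overhead_us: int,   # ASSUMPTION: fixed cost paid once per batch
    per_event_cost_us: int,
) -> List[int]:
    """Latency of each event when events are grouped into fixed time windows.

    A batch starts only after its window closes and the previous batch has
    finished; results for all of its events are available when it finishes.
    """
    batches: Dict[int, List[int]] = {}
    for t in sorted(arrivals_us):
        batches.setdefault(t // interval_us, []).append(t)

    latencies = []
    prev_finish = 0
    for window in sorted(batches):
        events = batches[window]
        window_end = (window + 1) * interval_us
        start = max(window_end, prev_finish)
        finish = start + batch_overhead_us + per_event_cost_us * len(events)
        latencies.extend(finish - t for t in events)
        prev_finish = finish
    return latencies


if __name__ == "__main__":
    arrivals = list(range(0, 10_000, 200))   # 50 events, one every 200 us
    streaming = per_event_latencies(arrivals, per_event_cost_us=50)
    batch_1ms = micro_batch_latencies(arrivals, 1_000, 5_000, 50)
    batch_10ms = micro_batch_latencies(arrivals, 10_000, 5_000, 50)
    print("per-event  max latency (us):", max(streaming))
    print("1 ms batch max latency (us):", max(batch_1ms))
    print("10 ms batch max latency (us):", max(batch_10ms))
```

Under these assumed numbers, a 1 ms interval is shorter than the time one batch takes to run, so batches queue up and latency keeps growing. Per-event processing stays at the per-event cost. OS or TCP buffering does not hit this problem as long as its per-batch cost is small relative to the batch interval.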
