Hi dev,

I did some reading on Storm vs. Flink, with an emphasis on our use case of 
distributed task execution. My initial impressions are below (I will also 
update the Google docs accordingly):


1.       Both Storm and Flink support pipelined processing and look similar at 
first glance, but Storm handles only data streams, whereas Flink supports both 
stream and batch processing. This allows Flink to transfer data between 
parallel tasks. We do not have such support today, but we can definitely 
consider parallel task execution.

2.       Storm supports at-least-once and at-most-once processing, whereas 
Flink guarantees exactly-once processing. Storm can also provide exactly-once 
semantics through its Trident API. From what I read, Flink claims to deliver 
these guarantees more efficiently because it uses a lighter-weight 
checkpointing algorithm.

3.       Flink offers high-level APIs that simplify data collection, which is 
a little tedious in Storm. In Storm one has to implement readers and 
collectors manually, whereas Flink provides operators such as Map, GroupBy, 
Window and Join.

4.       A major positive in Flink is the ability to maintain custom state 
in operators/executors. This state can also be included in checkpoints for 
fault tolerance.
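
To make points 2 and 4 a bit more concrete, here is a rough toy sketch in 
plain Python. It is not Flink or Storm code, and the names (run, fail_at, 
checkpoint_every, restore_state) are made up for illustration. It assumes a 
replayable source and a single simulated crash, and shows why snapshotting 
operator state together with the source offset gives an exactly-once effect on 
the state, while replaying without rolling the state back double-counts:

```python
from collections import Counter


def run(events, fail_at, checkpoint_every, restore_state):
    """Count events per key, crashing once just before index `fail_at`.

    restore_state=True  -> roll state back to the last checkpoint before
                           replaying (exactly-once effect on the state).
    restore_state=False -> keep the surviving state and replay anyway
                           (at-least-once: replayed events count twice).
    """
    counts = Counter()             # the operator's custom state
    checkpoint = (0, Counter())    # (source offset, snapshot of the state)
    offset = 0
    failed = False
    while offset < len(events):
        if offset == fail_at and not failed:
            # Simulated crash: rewind the source to the checkpointed offset.
            failed = True
            saved_offset, saved_counts = checkpoint
            offset = saved_offset
            if restore_state:
                counts = Counter(saved_counts)
            continue
        counts[events[offset]] += 1
        offset += 1
        if offset % checkpoint_every == 0:
            # Snapshot the state and the offset together.
            checkpoint = (offset, Counter(counts))
    return dict(counts)


EVENTS = ["a", "b", "a", "c", "a", "b"]
exactly_once = run(EVENTS, fail_at=5, checkpoint_every=2, restore_state=True)
at_least_once = run(EVENTS, fail_at=5, checkpoint_every=2, restore_state=False)
```

With restore_state=True the counts match a failure-free run; with 
restore_state=False the event replayed after the crash is counted a second 
time.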

I think Flink is an improvement over Storm, but this is only my understanding 
from initial reading; I haven’t yet tried coding any examples in Flink. Also, 
most of the features and differences mentioned above target stream processing, 
where the focus is on executing a large number of small tasks (in parallel?) 
over continuously streaming data, so the competition is over low-latency 
processing. That may not be that important for the Airavata use case, where 
tasks may take a long time to complete.

Thanks and Regards,
Gourav Shenoy

From: "Pierce, Marlon" <[email protected]>
Reply-To: <[email protected]>
Date: Wednesday, May 24, 2017 at 11:36 AM
To: "[email protected]" <[email protected]>
Subject: Re: Apache Flink Execution

Thanks, Apoorv.  Note for everyone else: request access if you’d like to leave 
a comment or make a suggestion.

Marlon

From: Apoorv Palkar <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, May 24, 2017 at 11:32 AM
To: "[email protected]" <[email protected]>
Subject: Apache Flink Execution

https://docs.google.com/document/d/1GDh8kEbAXVY9Gv1mmFvq__zLN_JP6m2_KbfN-9C0uO0/edit?usp=sharing

Link to the Flink usage/fundamentals doc
