I never found much info showing that Flink was actually designed to be fault tolerant. If fault tolerance is more of a bolt-on/add-on/afterthought, then that doesn't bode well for large-scale data processing. Spark was designed with fault tolerance in mind from the beginning.

On Sun, Apr 17, 2016 at 9:52 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi,
>
> I read the benchmark published by Yahoo. Obviously they already use Storm
> and are inevitably very familiar with that tool. To start with, although
> these benchmarks were somewhat interesting IMO, they lend themselves to an
> assurance that the tool chosen for their platform is still the best choice.
> So inevitably the benchmarks and the tests were done primarily to support
> their approach.
>
> In general, anything which is not done through the TPC or a similar body
> is questionable.
>
> Their argument is that because Spark handles data streaming in micro
> batches, it inevitably introduces this built-in latency by design.
> In contrast, neither Storm nor Flink has this issue (at face value).
>
> In addition, as we already know, Spark has far more capabilities compared
> to Flink (I know nothing about Storm). So really it boils down to the
> business SLA to choose which tool one wants to deploy for a given use case.
> IMO the Spark micro-batching approach is probably OK for 99% of use cases.
> If we had built-in libraries for CEP for Spark (I am searching for them),
> I would not bother with Flink.
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 17 April 2016 at 12:47, Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr> wrote:
>
>> You probably read this benchmark at Yahoo, any comments from Spark?
>>
>> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
>>
>> On 17 Apr 2016, at 12:41, andy petrella <andy.petre...@gmail.com> wrote:
>>
>> Just adding one thing to the mix: `that the latency for streaming data is
>> eliminated` is insane :-D
>>
>> On Sun, Apr 17, 2016 at 12:19 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> It seems that Flink argues that the latency for streaming data is
>>> eliminated, whereas with Spark RDDs there is this latency.
>>>
>>> I noticed that Flink does not support an interactive shell much like the
>>> Spark shell, where you can add jars to it to do Kafka testing. The advice
>>> was to add the streaming Kafka jar file to CLASSPATH, but that does not
>>> work.
>>>
>>> Most Flink documentation is also rather sparse, with the usual example of
>>> word count, which is not exactly what you want.
>>>
>>> Anyway, I will have a look at it further. I have a Spark Scala streaming
>>> Kafka program that works fine in Spark and I want to recode it using
>>> Scala for Flink with Kafka, but I have difficulty importing and testing
>>> libraries.
>>>
>>> Cheers
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 17 April 2016 at 02:41, Ascot Moss <ascot.m...@gmail.com> wrote:
>>>
>>>> I compared both last month; it seems to me that Flink's MLLib is not
>>>> yet ready.
>>>>
>>>> On Sun, Apr 17, 2016 at 12:23 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>
>>>>> Thanks Ted. I was wondering if someone is using both :)
>>>>>
>>>>> Dr Mich Talebzadeh
>>>>>
>>>>> On 16 April 2016 at 17:08, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>
>>>>>> Looks like this question is more relevant on the Flink mailing list :-)
>>>>>>
>>>>>> On Sat, Apr 16, 2016 at 8:52 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Has anyone used Apache Flink instead of Spark by any chance?
>>>>>>>
>>>>>>> I am interested in its set of libraries for Complex Event Processing.
>>>>>>>
>>>>>>> Frankly, I don't know if it offers far more than Spark offers.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Dr Mich Talebzadeh
>>
>> --
>> andy