Interesting tangent. I think there will never be a time when an interesting area is covered by only one project or product. Why are there 30 SQL engines? Or 50 car companies? It's a feature, not a bug. To the extent they provide different tradeoffs or functionality, they're not entirely duplicative; to the extent they compete directly, it's a win for the user.
As others have said, Flink (née Stratosphere) started quite a while ago. But you can draw lines of influence back earlier than Spark. I presume MS Dryad is the forerunner of all these.

And in case you wanted a third option, Google's Dataflow (now Apache Beam) is really a reinvention of Google's FlumeJava (nothing to do with Apache Flume), in the same way that Crunch was an earlier port and minor update of FlumeJava. And it claims to run on Flink or Spark if you want.

https://cloud.google.com/dataflow/blog/dataflow-beam-and-spark-comparison

On Sun, Apr 17, 2016 at 10:25 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> Also it always amazes me why they are so many tangential projects in Big
> Data space? Would not it be easier if efforts were spent on adding to Spark
> functionality rather than creating a new product like Flink?