Re: Apache Beam and Flink
Hi Ashutosh,

There is a related open JIRA issue, "Enable DataSet and DataStream Joins":
https://issues.apache.org/jira/browse/FLINK-2320

Slim

> On May 26, 2016, at 3:05 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
> No, that is not supported yet.
> Beam provides a common API, but the Flink runner translates Beam programs against
> batch sources into DataSet API programs and Beam programs against
> streaming sources into DataStream programs.
> It is not possible to mix both.
>
> 2016-05-26 10:00 GMT+02:00 Ashutosh Kumar <kmr.ashutos...@gmail.com>:
> Thanks. So if we use the Beam API with the Flink engine, can we get interaction
> between batch and stream? As far as I know, DataSet and DataStream currently
> cannot talk to each other in Flink. Is this correct?
> Thanks
> Ashutosh
>
> On Thu, May 26, 2016 at 1:09 PM, Slim Baltagi <sbalt...@gmail.com> wrote:
> Hi Ashutosh
>
> Apache Beam provides a unified API for batch and streaming.
> It also supports multiple ‘runners’: local, Apache Spark, Apache Flink, and
> Google Cloud Dataflow (commercial service).
> It is not an alternative to Flink, because it is an API and you still need an
> execution engine.
> It can be used as an alternative to the two Flink APIs: the DataSet
> API and the DataStream API.
> It can be complementary to Flink, in that you use Beam as the API and
> Flink as the execution engine.
> Many Flink committers are also Apache Beam committers!
> The following blogs describe why Apache Beam:
> - From the Flink perspective: http://data-artisans.com/why-apache-beam/
> - From the Google perspective:
>   https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
>
> A few recent resources about Apache Beam published this month (May 2016):
> - Running Apache Beam (screencast): https://www.youtube.com/watch?v=dwxUbzbwtyI
> - Introduction to Apache Beam (presentation):
>   https://skillsmatter.com/skillscasts/8036-apache-flink-may-meetup
> - Introduction to Apache Beam (blog):
>   http://www.talend.com/blog/2016/05/02/introduction-to-apache-beam
>
> I hope this helps.
>
> Thanks
>
> Slim Baltagi
>
>> On May 26, 2016, at 2:20 AM, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:
>>
>> How does Apache Beam fit with Flink? Is it an alternative to Flink or
>> complementary to it?
>>
>> Thanks
>> Ashutosh
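
Fabian's point in the thread above, that the runner picks one translation target per program depending on the kind of source and that the two cannot be mixed, can be illustrated with a toy model. This is only a sketch of the decision described in the thread, not Beam or Flink code: `Source` and `choose_translation` are hypothetical names invented for illustration, and the real Flink runner is far more involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for a pipeline input; not a Beam or Flink class."""
    name: str
    bounded: bool  # True = batch (finite) input, False = streaming (unbounded) input


def choose_translation(sources: List[Source]) -> str:
    """Return the Flink API a toy runner would target for the whole program."""
    if not sources:
        raise ValueError("a pipeline needs at least one source")
    kinds = {s.bounded for s in sources}
    if len(kinds) > 1:
        # Mirrors the limitation described in the thread: no mixing in one program.
        raise ValueError("cannot mix batch and streaming sources in one pipeline")
    # All sources are of one kind: bounded maps to DataSet, unbounded to DataStream.
    return "DataSet" if kinds == {True} else "DataStream"
```

In this model, a program reading only files would map to "DataSet", one reading only from a message queue would map to "DataStream", and a program with both kinds of source is rejected, which is the situation FLINK-2320 is about.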
Re: Apache Beam and Flink
Hi Ashutosh,

Apache Beam provides a unified API for batch and streaming.
It also supports multiple ‘runners’: local, Apache Spark, Apache Flink, and Google Cloud Dataflow (commercial service).
It is not an alternative to Flink, because it is an API and you still need an execution engine.
It can be used as an alternative to the two Flink APIs: the DataSet API and the DataStream API.
It can be complementary to Flink, in that you use Beam as the API and Flink as the execution engine.
Many Flink committers are also Apache Beam committers!

The following blogs describe why Apache Beam:
- From the Flink perspective: http://data-artisans.com/why-apache-beam/
- From the Google perspective: https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective

A few recent resources about Apache Beam published this month (May 2016):
- Running Apache Beam (screencast): https://www.youtube.com/watch?v=dwxUbzbwtyI
- Introduction to Apache Beam (presentation): https://skillsmatter.com/skillscasts/8036-apache-flink-may-meetup
- Introduction to Apache Beam (blog): http://www.talend.com/blog/2016/05/02/introduction-to-apache-beam

I hope this helps.

Thanks

Slim Baltagi

> On May 26, 2016, at 2:20 AM, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:
>
> How does Apache Beam fit with Flink? Is it an alternative to Flink or
> complementary to it?
>
> Thanks
> Ashutosh
Re: Powered by Flink
Hi,

The following are missing from the ‘Powered by Flink’ list:
- king.com: https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces88
- Otto Group: http://data-artisans.com/how-we-selected-apache-flink-at-otto-group/
- Eura Nova: https://research.euranova.eu/flink-forward-2015-talk/
- Big Data Europe: http://www.big-data-europe.eu

Thanks

Slim Baltagi

> On Apr 5, 2016, at 10:08 AM, Robert Metzger <rmetz...@apache.org> wrote:
>
> Hi everyone,
>
> I would like to bring the "Powered by Flink" wiki page [1] to the attention
> of Flink users who recently joined the Flink community. The list tracks
> which organizations are using Flink.
> If your company / university / research institute / ... is using Flink but
> the name is not yet listed there, let me know and I'll add the name.
>
> Regards,
> Robert
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
>
>
> On Mon, Oct 19, 2015 at 4:10 PM, Matthias J. Sax <mj...@apache.org> wrote:
> +1
>
> On 10/19/2015 04:05 PM, Maximilian Michels wrote:
> > +1 Let's collect in the Wiki for now. At some point in time, we might
> > want to have a dedicated page on the Flink homepage.
> >
> > On Mon, Oct 19, 2015 at 3:31 PM, Timo Walther <twal...@apache.org> wrote:
> >> Ah ok, sorry. I think linking to the wiki is also ok.
> >>
> >>
> >> On 19.10.2015 15:18, Fabian Hueske wrote:
> >>>
> >>> @Timo: The proposal was to keep the list in the wiki (can be easily
> >>> extended) but link from the main website to the wiki page.
> >>>
> >>> 2015-10-19 15:16 GMT+02:00 Timo Walther <twal...@apache.org>:
> >>>
> >>>> +1 for adding it to the website instead of wiki.
> >>>> "Who is using Flink?" is always a difficult question to answer for
> >>>> interested users.
> >>>>
> >>>>
> >>>> On 19.10.2015 15:08, Suneel Marthi wrote:
> >>>>
> >>>> +1 to this.
> >>>>
> >>>> On Mon, Oct 19, 2015 at 3:00 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> >>>>
> >>>>> Sounds good +1
> >>>>>
> >>>>> 2015-10-19 14:57 GMT+02:00 Márton Balassi <balassi.mar...@gmail.com>:
> >>>>>
> >>>>>> Thanks for starting and big +1 for making it more prominent.
> >>>>>>
> >>>>>> On Mon, Oct 19, 2015 at 2:53 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Thanks for starting this Kostas.
> >>>>>>>
> >>>>>>> I think the list is quite hidden in the wiki. Should we link from
> >>>>>>> flink.apache.org to that page?
> >>>>>>>
> >>>>>>> Cheers, Fabian
> >>>>>>>
> >>>>>>> 2015-10-19 14:50 GMT+02:00 Kostas Tzoumas <ktzou...@apache.org>:
> >>>>>>>>
> >>>>>>>> Hi everyone,
> >>>>>>>>
> >>>>>>>> I started a "Powered by Flink" wiki page, listing some of the
> >>>>>>>> organizations that are using Flink:
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
> >>>>>>>>
> >>>>>>>> If you would like to be added to the list, just send me a short email
> >>>>>>>> with your organization's name and a description and I will add you to
> >>>>>>>> the wiki page.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Kostas
Re: [VOTE] Release Apache Flink 1.0.0 (RC1)
Dear Flink community,

It is great news that the vote for the first release candidate (RC1) of Apache Flink 1.0.0 is starting today, February 25th, 2016!
As a community, we need to double our efforts and make sure that Flink 1.0.0 is GA before these two upcoming major events:
- Strata + Hadoop World in San Jose on March 28-31, 2016
- Hadoop Summit Europe in Dublin on April 13-14, 2016
This is one aspect of the ‘market dynamics’ that we need to take into account as a community.

Good luck!

Slim Baltagi

On Feb 25, 2016, at 4:34 AM, Robert Metzger <rmetz...@apache.org> wrote:

> Dear Flink community,
>
> Please vote on releasing the following candidate as Apache Flink version
> 1.0.0.
>
> I've set user@flink.apache.org on CC because users are encouraged to help
> testing Flink 1.0.0 for their specific use cases. Please report issues (and
> successful tests!) on d...@flink.apache.org.
>
>
> The commit to be voted on
> (http://git-wip-us.apache.org/repos/asf/flink/commit/e4d308d6)
> e4d308d64057e5f94bec8bbca8f67aab0ea78faa
>
> Branch:
> release-1.0.0-rc1 (see
> https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.0.0-rc1)
>
> The release artifacts to be voted on can be found at:
> http://people.apache.org/~rmetzger/flink-1.0.0-rc1/
>
> The release artifacts are signed with the key with fingerprint D9839159:
> http://www.apache.org/dist/flink/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapacheflink-1063
>
> -
>
> The vote is open until Tuesday and passes if a majority of at least three +1
> PMC votes are cast.
>
> The vote ends on Tuesday, March 1, 12:00 CET.
>
> [ ] +1 Release this package as Apache Flink 1.0.0
> [ ] -1 Do not release this package because ...
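
The passing rule quoted in the vote email ("a majority of at least three +1 PMC votes") can be written down as a small check. This is a sketch of one reading of that sentence, assumed here to mean at least three +1 PMC votes and more +1 than -1 PMC votes; the function name and parameters are hypothetical and not part of any Apache tooling.

```python
def vote_passes(plus_one_pmc: int, minus_one_pmc: int, minimum_plus_ones: int = 3) -> bool:
    """Assumed reading of the rule: enough +1 PMC votes, and more +1 than -1."""
    if plus_one_pmc < 0 or minus_one_pmc < 0:
        raise ValueError("vote counts cannot be negative")
    has_quorum = plus_one_pmc >= minimum_plus_ones
    has_majority = plus_one_pmc > minus_one_pmc
    return has_quorum and has_majority
```

Under this reading, three +1 votes and no -1 votes pass, while two +1 votes do not pass regardless of how many other votes were cast.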
Re: Comparison of storm and flink
Hi Vinaya,

1. Comparing streaming tools (in this case Storm and Flink) should not be based on performance benchmarks only! For example, slides 16-36 list over 96 criteria that we identified at Capital One to compare two streaming tools: http://www.slideshare.net/sbaltagi/flink-vs-spark/17

2. Now, if you are focusing on performance only, I'll suggest a few related resources:
- Benchmarking Streaming Computation Engines at Yahoo! (December 16, 2015): http://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
  Code on GitHub: https://github.com/yahoo/streaming-benchmarks
- Some Flink contributors have started work on performance scripts for Flink, Spark, and MapReduce, under "Apache Flink: Performance and Testing": https://github.com/project-flink/flink-perf
- Some first numbers on the performance of streaming jobs with Apache Flink are in the section 'Show me the numbers' here: http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
  The code used is at: https://github.com/dataArtisans/performance
- Yangjun Wang is currently working on his Master's thesis at Aalto University in Helsinki, Finland. The topic of his thesis is building a standard benchmark system for stream processing systems like Apache Storm, Spark, and Flink.
  Code on GitHub: https://github.com/wangyangjun/StreamBench/tree/master/StreamBench

3. I am giving a talk in NYC on Tuesday, February 2nd, 2016 on Apache Flink, and I will be touching a bit on benchmarks: http://www.meetup.com/New-York-City-NYC-Apache-Flink-Meetup/events/228113118/
You are welcome to attend.

Thanks

Slim Baltagi

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Comparison-of-storm-and-flink-tp4468p4469.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.
Re: 2015: A Year in Review for Apache Flink
Happy New Year to you and your families!
Let’s make 2016 the year of Flink: General Availability, faster growth, wider industry adoption, …

Slim Baltagi
Chicago, US

On Dec 31, 2015, at 5:05 AM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:

> Happy new year everyone!
> Looking forward to all the great things the Apache Flink community will
> accomplish in 2016 :))
>
> Greetings from snowy Greece!
> -Vasia.
>
> On 31 December 2015 at 04:22, Henry Saputra <henry.sapu...@gmail.com> wrote:
> Dear All,
>
> It is almost the end of 2015, and it has been a busy and great year for Apache Flink
> =)
>
> Robert Metzger has posted a great blog post summarizing Apache Flink's growth
> this year:
>
> https://flink.apache.org/news/2015/12/18/a-year-in-review.html
>
> Happy New Year everyone and thanks for being part of this great community!
>
>
> Thanks,
>
> - Henry
Re: Apache Flink 0.10.0 released
Hi,

I’m very pleased to be the first to tweet about the release of Apache Flink 0.10.0, just after receiving Fabian’s email :)
Flink 1.0 is around the corner now!

Slim Baltagi

On Nov 16, 2015, at 7:53 AM, Fabian Hueske <fhue...@gmail.com> wrote:

> Hi everybody,
>
> The Flink community is excited to announce that Apache Flink 0.10.0 has been
> released.
> Please find the release announcement here:
>
> --> http://flink.apache.org/news/2015/11/16/release-0.10.0.html
>
> Best,
> Fabian
Building Big Data Benchmarking suite for Apache Flink
Hi,

BigDataBench is an open source Big Data benchmarking suite from both industry and academia. As a subset of BigDataBench, BigDataBench-DCA is China’s first industry-standard big data benchmark suite: http://prof.ict.ac.cn/BigDataBench/industry-standard-benchmarks/

It comes with real-world data sets and many workloads (TeraSort, WordCount, PageRank, K-means, NaiveBayes, Aggregation, and Read/Write/Scan), and also a tool that uses Hadoop, HBase, and Mahout.
This might serve as inspiration for building a Big Data benchmarking suite for Flink!

I would like to share the news that Professor Jianfeng Zhan from the Institute of Computing Technology, Chinese Academy of Sciences, is planning to support Flink in the BigDataBench project!
Reference: https://www.linkedin.com/grp/home?gid=6777483

Thanks

Slim Baltagi

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Building-Big-Data-Benchmarking-suite-for-Apache-Flink-tp2035.html
Re: SLIDES: Overview of Apache Flink: Next-Gen Big Data Analytics Framework
Hi,

Well, thanks to the Apache Flink community for continuously improving the project docs, and to Data Artisans for sharing the slides and materials of the Apache Flink training!
Both helped me put together the slide deck for my talk at our Chicago Apache Flink meetup.

Slim Baltagi

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/SLIDES-Overview-of-Apache-Flink-Next-Gen-Big-Data-Analytics-Framework-tp1966p1972.html
SLIDES: Overview of Apache Flink: Next-Gen Big Data Analytics Framework
Hi,

Here is the link to the slides of my talk on June 30, 2015 at the Chicago Apache Flink meetup: http://goo.gl/gVOSp8

Although most of the current buzz is about Apache Spark, the talk shows how Apache Flink offers the only hybrid open source (Real-Time Streaming + Batch) distributed data processing engine supporting many use cases: Real-Time stream processing, machine learning at scale, graph analytics, and batch processing.
Many slides are also dedicated to showing why Apache Flink is an alternative to Apache Hadoop MapReduce, Apache Storm, and Apache Spark!

Thanks

Slim Baltagi

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/SLIDES-Overview-of-Apache-Flink-Next-Gen-Big-Data-Analytics-Framework-tp1966.html