This is awesome, Stephan! Thanks for doing this. -Jamie
On Tue, Feb 26, 2019 at 9:29 AM Stephan Ewen <se...@apache.org> wrote:
> Here is the pull request with a draft of the roadmap:
> https://github.com/apache/flink-web/pull/178
>
> Best,
> Stephan
>
> On Fri, Feb 22, 2019 at 5:18 AM Hequn Cheng <chenghe...@gmail.com> wrote:
>
>> Hi Stephan,
>>
>> Thanks for summarizing the great roadmap! It is very helpful for users
>> and developers to track the direction of Flink.
>> +1 for putting the roadmap on the website and updating it per release.
>>
>> Besides, it would be great if the roadmap could add the UpsertSource
>> feature (maybe put it under 'Batch Streaming Unification').
>> It was discussed a long time ago [1,2] and is moving forward step by step.
>> Currently, Flink can only emit upsert results. With the UpsertSource, we
>> can make our system a more complete one.
>>
>> Best, Hequn
>>
>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-TABLE-How-to-handle-empty-delete-for-UpsertSource-td23856.html#a23874
>> [2] https://issues.apache.org/jira/browse/FLINK-8545
>>
>> On Fri, Feb 22, 2019 at 3:31 AM Rong Rong <walter...@gmail.com> wrote:
>>
>>> Hi Stephan,
>>>
>>> Yes, I completely agree. Jincheng & Jark gave some very valuable
>>> feedback and suggestions, and I think we can definitely move the
>>> conversation forward to reach a more concrete doc first before we put
>>> it into the roadmap. Thanks for reviewing it and driving the roadmap effort!
>>>
>>> --
>>> Rong
>>>
>>> On Thu, Feb 21, 2019 at 8:50 AM Stephan Ewen <se...@apache.org> wrote:
>>>
>>>> Hi Rong Rong!
>>>>
>>>> I would add the security / Kerberos threads to the roadmap. They seem
>>>> to be advanced enough in the discussions that there is clarity about
>>>> what will come.
>>>>
>>>> For the window operator with slicing, I would personally like to see
>>>> the discussion advance and have some more clarity and consensus on the
>>>> feature before adding it to the roadmap.
>>>> Not having that in the first version of the roadmap does not mean
>>>> there will be no activity. And when the discussion advances well in
>>>> the next weeks, we can update the roadmap soon.
>>>>
>>>> What do you think?
>>>>
>>>> Best,
>>>> Stephan
>>>>
>>>> On Thu, Feb 14, 2019 at 5:46 PM Rong Rong <walter...@gmail.com> wrote:
>>>>
>>>>> Hi Stephan,
>>>>>
>>>>> Thanks for the clarification. Yes, I think these issues have already
>>>>> been discussed in previous mailing list threads [1,2,3].
>>>>>
>>>>> I also agree that updating the "official" roadmap every release is a
>>>>> very good idea to avoid frequent updates.
>>>>> One point I am a bit confused about: are we suggesting keeping one
>>>>> roadmap on the documentation site (e.g. [4]) per release, or simply
>>>>> one up-to-date roadmap on the main website [5]?
>>>>> Just like the release notes for every release, I assume the former
>>>>> would also give users a good way to look back at previous roadmaps.
>>>>>
>>>>> Thanks,
>>>>> Rong
>>>>>
>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>>>> [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>>> [4] https://ci.apache.org/projects/flink/flink-docs-release-1.7/
>>>>> [5] https://flink.apache.org/
>>>>>
>>>>> On Thu, Feb 14, 2019 at 2:26 AM Stephan Ewen <se...@apache.org> wrote:
>>>>>
>>>>>> I think the website is better as well.
>>>>>>
>>>>>> I agree with Fabian that the wiki is not so visible, and visibility
>>>>>> is the main motivation.
>>>>>> This type of roadmap overview would not be updated by everyone -
>>>>>> letting committers update the roadmap means the listed threads are
>>>>>> actually happening at the moment.
>>>>>>
>>>>>> On Thu, Feb 14, 2019 at 11:14 AM Fabian Hueske <fhue...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I like the idea of putting the roadmap on the website because it is
>>>>>>> much more visible (and IMO more credible, obligatory) there.
>>>>>>> However, I share the concerns about frequent updates.
>>>>>>>
>>>>>>> I think it would be great to update the "official" roadmap on the
>>>>>>> website once per release (excluding bugfix releases), i.e., every
>>>>>>> three months.
>>>>>>> We can use the wiki to collect and draft the roadmap for the next
>>>>>>> update.
>>>>>>>
>>>>>>> Best, Fabian
>>>>>>>
>>>>>>> On Thu, Feb 14, 2019 at 11:03 AM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Stephan,
>>>>>>>>
>>>>>>>> Thanks for this proposal. It is a good idea to track the roadmap.
>>>>>>>> One suggestion is that it might be better to put it on a wiki page
>>>>>>>> first, because it is easier to update the roadmap on the wiki than
>>>>>>>> on the Flink website. I guess we may need to update the roadmap
>>>>>>>> very often at the beginning, as there are so many discussions and
>>>>>>>> proposals in the community recently. We can move it to the Flink
>>>>>>>> website later when we feel it is nailed down.
>>>>>>>>
>>>>>>>> On Thu, Feb 14, 2019 at 5:44 PM Stephan Ewen <se...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Thanks Jincheng and Rong Rong!
>>>>>>>>>
>>>>>>>>> I am not deciding a roadmap and making a call on what features
>>>>>>>>> should be developed or not. I was only collecting broader issues
>>>>>>>>> that are already happening or have an active FLIP/design
>>>>>>>>> discussion plus committer support.
>>>>>>>>>
>>>>>>>>> Do we have that for the suggested issues as well?
>>>>>>>>> If yes, we can add them (can you point me to the issue/mail
>>>>>>>>> thread?). If not, let's try to move the discussion forward and
>>>>>>>>> add them to the roadmap overview then.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Stephan
>>>>>>>>>
>>>>>>>>> On Wed, Feb 13, 2019 at 6:47 PM Rong Rong <walter...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Stephan for the great proposal.
>>>>>>>>>>
>>>>>>>>>> This would be beneficial not only for new users but also for
>>>>>>>>>> contributors who want to keep track of all upcoming features.
>>>>>>>>>>
>>>>>>>>>> I think that better window operator support can also be grouped
>>>>>>>>>> separately into its own category, as it affects both the future
>>>>>>>>>> DataStream API and batch/stream unification.
>>>>>>>>>> Can we also include:
>>>>>>>>>> - OVER aggregate for DataStream API separately, as @jincheng
>>>>>>>>>>   suggested.
>>>>>>>>>> - Improving sliding window operator [1]
>>>>>>>>>>
>>>>>>>>>> One more suggestion: can we also include a more extensible
>>>>>>>>>> security module [2,3] that @shuyi and I are currently working on?
>>>>>>>>>> This will significantly improve the usability of Flink in
>>>>>>>>>> corporate environments where proprietary or 3rd-party security
>>>>>>>>>> integration is needed.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Rong
>>>>>>>>>>
>>>>>>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>>>>>>>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>>>>>>>>> [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <sunjincheng...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Very excited, and thank you for launching such a great
>>>>>>>>>>> discussion, Stephan!
>>>>>>>>>>>
>>>>>>>>>>> Just one small suggestion: in the Batch Streaming Unification
>>>>>>>>>>> section, do we need to add an item:
>>>>>>>>>>>
>>>>>>>>>>> - Same window operators on bounded/unbounded Table API and
>>>>>>>>>>>   DataStream API
>>>>>>>>>>>   (currently the OVER window only exists in SQL/Table API; the
>>>>>>>>>>>   DataStream API does not yet support it)
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Jincheng
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 13, 2019 at 7:21 PM Stephan Ewen <se...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all!
>>>>>>>>>>>>
>>>>>>>>>>>> Recently several contributors, committers, and users asked
>>>>>>>>>>>> about making the direction in which the project is currently
>>>>>>>>>>>> going more visible.
>>>>>>>>>>>>
>>>>>>>>>>>> Users and developers can track the direction by following the
>>>>>>>>>>>> discussion threads and JIRA, but due to the mass of discussions
>>>>>>>>>>>> and open issues, it is very hard to get a good overall picture.
>>>>>>>>>>>> Especially for new users and contributors, it is very hard to
>>>>>>>>>>>> get a quick overview of the project direction.
>>>>>>>>>>>>
>>>>>>>>>>>> To fix this, I suggest adding a brief roadmap summary to the
>>>>>>>>>>>> homepage.
>>>>>>>>>>>> It is a bit of a commitment to keep that roadmap up to date,
>>>>>>>>>>>> but I think the benefit for users justifies that.
>>>>>>>>>>>> The Apache Beam project has added such a roadmap [1], which was
>>>>>>>>>>>> received very well by the community. I would suggest following
>>>>>>>>>>>> a similar structure here.
>>>>>>>>>>>>
>>>>>>>>>>>> If the community is in favor of this, I would volunteer to
>>>>>>>>>>>> write a first version of such a roadmap. The points I would
>>>>>>>>>>>> include are below.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Stephan
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://beam.apache.org/roadmap/
>>>>>>>>>>>>
>>>>>>>>>>>> ========================================================
>>>>>>>>>>>>
>>>>>>>>>>>> Disclaimer: Apache Flink is not governed or steered by any one
>>>>>>>>>>>> single entity, but by its community and Project Management
>>>>>>>>>>>> Committee (PMC). This is not an authoritative roadmap in the
>>>>>>>>>>>> sense of a plan with a specific timeline. Instead, we share our
>>>>>>>>>>>> vision for the future and major initiatives that are receiving
>>>>>>>>>>>> attention, and give users and contributors an understanding of
>>>>>>>>>>>> what they can look forward to.
>>>>>>>>>>>>
>>>>>>>>>>>> *Future Role of Table API and DataStream API*
>>>>>>>>>>>>   - Table API becomes first class citizen
>>>>>>>>>>>>   - Table API becomes primary API for analytics use cases
>>>>>>>>>>>>       * Declarative, automatic optimizations
>>>>>>>>>>>>       * No manual control over state and timers
>>>>>>>>>>>>   - DataStream API becomes primary API for applications and data pipeline use cases
>>>>>>>>>>>>       * Physical, user controls data types, no magic or optimizer
>>>>>>>>>>>>       * Explicit control over state and time
>>>>>>>>>>>>
>>>>>>>>>>>> *Batch Streaming Unification*
>>>>>>>>>>>>   - Table API unification (environments) (FLIP-32)
>>>>>>>>>>>>   - New unified source interface (FLIP-27)
>>>>>>>>>>>>   - Runtime operator unification & code reuse between DataStream / Table
>>>>>>>>>>>>   - Extending Table API to make it convenient API for all analytical use cases (easier mix in of UDFs)
>>>>>>>>>>>>   - Same join operators on bounded/unbounded Table API and DataStream API
>>>>>>>>>>>>
>>>>>>>>>>>> *Faster Batch (Bounded Streams)*
>>>>>>>>>>>>   - Much of this comes via Blink contribution/merging
>>>>>>>>>>>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>>>>>>>>>>>   - Batch Scheduling on bounded data (Table API)
>>>>>>>>>>>>   - External Shuffle Services Support on bounded streams
>>>>>>>>>>>>   - Caching of intermediate results on bounded data (Table API)
>>>>>>>>>>>>   - Extending DataStream API to explicitly model bounded streams (API breaking)
>>>>>>>>>>>>   - Add fine fault tolerance, scheduling, caching also to DataStream API
>>>>>>>>>>>>
>>>>>>>>>>>> *Streaming State Evolution*
>>>>>>>>>>>>   - Let all built-in serializers support stable evolution
>>>>>>>>>>>>   - First class support for other evolvable formats (Protobuf, Thrift)
>>>>>>>>>>>>   - Savepoint input/output format to modify / adjust savepoints
>>>>>>>>>>>>
>>>>>>>>>>>> *Simpler Event Time Handling*
>>>>>>>>>>>>   - Event Time Alignment in Sources
>>>>>>>>>>>>   - Simpler out-of-the-box support in sources
>>>>>>>>>>>>
>>>>>>>>>>>> *Checkpointing*
>>>>>>>>>>>>   - Consistency of Side Effects: suspend / end with savepoint (FLIP-34)
>>>>>>>>>>>>   - Failed checkpoints explicitly aborted on TaskManagers (not only on coordinator)
>>>>>>>>>>>>
>>>>>>>>>>>> *Automatic scaling (adjusting parallelism)*
>>>>>>>>>>>>   - Reactive scaling
>>>>>>>>>>>>   - Active scaling policies
>>>>>>>>>>>>
>>>>>>>>>>>> *Kubernetes Integration*
>>>>>>>>>>>>   - Active Kubernetes Integration (Flink actively manages containers)
>>>>>>>>>>>>
>>>>>>>>>>>> *SQL Ecosystem*
>>>>>>>>>>>>   - Extended Metadata Stores / Catalog / Schema Registries support
>>>>>>>>>>>>   - DDL support
>>>>>>>>>>>>   - Integration with Hive Ecosystem
>>>>>>>>>>>>
>>>>>>>>>>>> *Simpler Handling of Dependencies*
>>>>>>>>>>>>   - Scala in the APIs, but not in the core (hide in separate class loader)
>>>>>>>>>>>>   - Hadoop-free by default
>>>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards
>>>>>>>>
>>>>>>>> Jeff Zhang
>>>>>>>>
>>>>>>>