Here is the pull request with a draft of the roadmap: https://github.com/apache/flink-web/pull/178

Best,
Stephan

On Fri, Feb 22, 2019 at 5:18 AM Hequn Cheng <chenghe...@gmail.com> wrote:

> Hi Stephan,
>
> Thanks for summarizing the great roadmap! It is very helpful for users and
> developers to track the direction of Flink.
> +1 for putting the roadmap on the website and updating it per release.
>
> Besides, it would be great if the roadmap could include the UpsertSource
> feature (maybe under 'Batch Streaming Unification').
> It was discussed a long time ago [1,2] and is moving forward step by step.
> Currently, Flink can only emit upsert results. With the UpsertSource, we
> can make our system a more complete one.
>
> Best, Hequn
>
> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-TABLE-How-to-handle-empty-delete-for-UpsertSource-td23856.html#a23874
> [2] https://issues.apache.org/jira/browse/FLINK-8545
>
> On Fri, Feb 22, 2019 at 3:31 AM Rong Rong <walter...@gmail.com> wrote:
>
>> Hi Stephan,
>>
>> Yes, I completely agree. Jincheng & Jark gave some very valuable
>> feedback and suggestions, and I think we can definitely move the
>> conversation forward to reach a more concrete doc first before we put it
>> into the roadmap. Thanks for reviewing it and driving the roadmap effort!
>>
>> --
>> Rong
>>
>> On Thu, Feb 21, 2019 at 8:50 AM Stephan Ewen <se...@apache.org> wrote:
>>
>>> Hi Rong Rong!
>>>
>>> I would add the security / Kerberos threads to the roadmap. They seem to
>>> be advanced enough in the discussions that there is clarity about what
>>> will come.
>>>
>>> For the window operator with slicing, I would personally like to see the
>>> discussion advance and have some more clarity and consensus on the feature
>>> before adding it to the roadmap. Not having that in the first version of
>>> the roadmap does not mean there will be no activity. And if the
>>> discussion advances well in the next weeks, we can update the roadmap soon.
>>>
>>> What do you think?
>>>
>>> Best,
>>> Stephan
>>>
>>>
>>> On Thu, Feb 14, 2019 at 5:46 PM Rong Rong <walter...@gmail.com> wrote:
>>>
>>>> Hi Stephan,
>>>>
>>>> Thanks for the clarification. Yes, I think these issues have already been
>>>> discussed in previous mailing list threads [1,2,3].
>>>>
>>>> I also agree that updating the "official" roadmap every release is a
>>>> very good idea to avoid frequent updates.
>>>> One point I may be a bit confused about: are we suggesting to
>>>> keep one roadmap on the documentation site (e.g. [4]) per release, or
>>>> simply one most up-to-date roadmap on the main website [5]?
>>>> Just like the release notes in every release, the former would probably
>>>> also give users a good way to look back at previous roadmaps, I assume.
>>>>
>>>> Thanks,
>>>> Rong
>>>>
>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>>> [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>>
>>>> [4] https://ci.apache.org/projects/flink/flink-docs-release-1.7/
>>>> [5] https://flink.apache.org/
>>>>
>>>> On Thu, Feb 14, 2019 at 2:26 AM Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>>> I think the website is better as well.
>>>>>
>>>>> I agree with Fabian that the wiki is not so visible, and visibility is
>>>>> the main motivation.
>>>>> This type of roadmap overview would not be updated by everyone -
>>>>> letting committers update the roadmap means the listed threads are
>>>>> actually happening at the moment.
>>>>>
>>>>>
>>>>> On Thu, Feb 14, 2019 at 11:14 AM Fabian Hueske <fhue...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I like the idea of putting the roadmap on the website because it is
>>>>>> much more visible (and IMO more credible, obligatory) there.
>>>>>> However, I share the concerns about frequent updates.
>>>>>>
>>>>>> I think it would be great to update the "official" roadmap on the
>>>>>> website once per release (excluding bugfix releases), i.e., every
>>>>>> three months.
>>>>>> We can use the wiki to collect and draft the roadmap for the next
>>>>>> update.
>>>>>>
>>>>>> Best, Fabian
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 14, 2019 at 11:03 AM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Stephan,
>>>>>>>
>>>>>>> Thanks for this proposal. It is a good idea to track the roadmap.
>>>>>>> One suggestion is that it might be better to put it on a wiki page
>>>>>>> first, because it is easier to update the roadmap on the wiki than on
>>>>>>> the Flink web site. And I guess we may need to update the roadmap very
>>>>>>> often at the beginning, as there are so many discussions and proposals
>>>>>>> in the community recently. We can move it to the Flink web site later
>>>>>>> when we feel it can be nailed down.
>>>>>>>
>>>>>>> On Thu, Feb 14, 2019 at 5:44 PM Stephan Ewen <se...@apache.org> wrote:
>>>>>>>
>>>>>>>> Thanks Jincheng and Rong Rong!
>>>>>>>>
>>>>>>>> I am not deciding a roadmap and making a call on what features
>>>>>>>> should be developed or not. I was only collecting broader issues that
>>>>>>>> are already happening or have an active FLIP/design discussion plus
>>>>>>>> committer support.
>>>>>>>>
>>>>>>>> Do we have that for the suggested issues as well? If yes, we can
>>>>>>>> add them (can you point me to the issue/mail-thread?). If not, let's
>>>>>>>> try to move the discussion forward and add them to the roadmap
>>>>>>>> overview then.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Stephan
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 13, 2019 at 6:47 PM Rong Rong <walter...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks Stephan for the great proposal.
>>>>>>>>>
>>>>>>>>> This would not only be beneficial for new users but also for
>>>>>>>>> contributors to keep track of all upcoming features.
>>>>>>>>>
>>>>>>>>> I think that better window operator support can also be grouped
>>>>>>>>> separately into its own category, as it affects both the future
>>>>>>>>> DataStream API and batch stream unification.
>>>>>>>>> Can we also include:
>>>>>>>>> - OVER aggregate for DataStream API separately, as @jincheng
>>>>>>>>>   suggested.
>>>>>>>>> - Improving sliding window operator [1]
>>>>>>>>>
>>>>>>>>> One more suggestion: can we also include a more extendable
>>>>>>>>> security module [2,3] that @shuyi and I are currently working on?
>>>>>>>>> This will significantly improve the usability of Flink in
>>>>>>>>> corporate environments where proprietary or 3rd-party security
>>>>>>>>> integration is needed.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Rong
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>>>>>>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>>>>>>>> [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <sunjincheng...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Very excited, and thank you for launching such a great discussion,
>>>>>>>>>> Stephan!
>>>>>>>>>>
>>>>>>>>>> Just one small suggestion: in the Batch Streaming Unification
>>>>>>>>>> section, do we need to add an item:
>>>>>>>>>>
>>>>>>>>>> - Same window operators on bounded/unbounded Table API and
>>>>>>>>>>   DataStream API
>>>>>>>>>>   (currently the OVER window only exists in SQL/Table API; the
>>>>>>>>>>   DataStream API does not yet support it)
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jincheng
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 13, 2019 at 7:21 PM Stephan Ewen <se...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all!
>>>>>>>>>>>
>>>>>>>>>>> Recently several contributors, committers, and users asked about
>>>>>>>>>>> making it more visible in which way the project is currently going.
>>>>>>>>>>>
>>>>>>>>>>> Users and developers can track the direction by following the
>>>>>>>>>>> discussion threads and JIRA, but due to the mass of discussions and
>>>>>>>>>>> open issues, it is very hard to get a good overall picture.
>>>>>>>>>>> Especially for new users and contributors, it is very hard to
>>>>>>>>>>> get a quick overview of the project direction.
>>>>>>>>>>>
>>>>>>>>>>> To fix this, I suggest adding a brief roadmap summary to the
>>>>>>>>>>> homepage. It is a bit of a commitment to keep that roadmap up to
>>>>>>>>>>> date, but I think the benefit for users justifies that.
>>>>>>>>>>> The Apache Beam project has added such a roadmap [1], which was
>>>>>>>>>>> received very well by the community. I would suggest following a
>>>>>>>>>>> similar structure here.
>>>>>>>>>>>
>>>>>>>>>>> If the community is in favor of this, I would volunteer to write
>>>>>>>>>>> a first version of such a roadmap. The points I would include are
>>>>>>>>>>> below.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Stephan
>>>>>>>>>>>
>>>>>>>>>>> [1] https://beam.apache.org/roadmap/
>>>>>>>>>>>
>>>>>>>>>>> ========================================================
>>>>>>>>>>>
>>>>>>>>>>> Disclaimer: Apache Flink is not governed or steered by any one
>>>>>>>>>>> single entity, but by its community and Project Management
>>>>>>>>>>> Committee (PMC). This is not an authoritative roadmap in the sense
>>>>>>>>>>> of a plan with a specific timeline. Instead, we share our vision
>>>>>>>>>>> for the future and major initiatives that are receiving attention,
>>>>>>>>>>> and give users and contributors an understanding of what they can
>>>>>>>>>>> look forward to.
>>>>>>>>>>>
>>>>>>>>>>> *Future Role of Table API and DataStream API*
>>>>>>>>>>>   - Table API becomes first class citizen
>>>>>>>>>>>   - Table API becomes primary API for analytics use cases
>>>>>>>>>>>       * Declarative, automatic optimizations
>>>>>>>>>>>       * No manual control over state and timers
>>>>>>>>>>>   - DataStream API becomes primary API for applications and data
>>>>>>>>>>>     pipeline use cases
>>>>>>>>>>>       * Physical, user controls data types, no magic or optimizer
>>>>>>>>>>>       * Explicit control over state and time
>>>>>>>>>>>
>>>>>>>>>>> *Batch Streaming Unification*
>>>>>>>>>>>   - Table API unification (environments) (FLIP-32)
>>>>>>>>>>>   - New unified source interface (FLIP-27)
>>>>>>>>>>>   - Runtime operator unification & code reuse between DataStream
>>>>>>>>>>>     / Table
>>>>>>>>>>>   - Extending Table API to make it a convenient API for all
>>>>>>>>>>>     analytical use cases (easier mix-in of UDFs)
>>>>>>>>>>>   - Same join operators on bounded/unbounded Table API and
>>>>>>>>>>>     DataStream API
>>>>>>>>>>>
>>>>>>>>>>> *Faster Batch (Bounded Streams)*
>>>>>>>>>>>   - Much of this comes via Blink contribution/merging
>>>>>>>>>>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>>>>>>>>>>   - Batch Scheduling on bounded data (Table API)
>>>>>>>>>>>   - External Shuffle Services Support on bounded streams
>>>>>>>>>>>   - Caching of intermediate results on bounded data (Table API)
>>>>>>>>>>>   - Extending DataStream API to explicitly model bounded streams
>>>>>>>>>>>     (API breaking)
>>>>>>>>>>>   - Add fine-grained fault tolerance, scheduling, caching also to
>>>>>>>>>>>     DataStream API
>>>>>>>>>>>
>>>>>>>>>>> *Streaming State Evolution*
>>>>>>>>>>>   - Let all built-in serializers support stable evolution
>>>>>>>>>>>   - First class support for other evolvable formats (Protobuf,
>>>>>>>>>>>     Thrift)
>>>>>>>>>>>   - Savepoint input/output format to modify / adjust savepoints
>>>>>>>>>>>
>>>>>>>>>>> *Simpler Event Time Handling*
>>>>>>>>>>>   - Event Time Alignment in Sources
>>>>>>>>>>>   - Simpler out-of-the-box support in sources
>>>>>>>>>>>
>>>>>>>>>>> *Checkpointing*
>>>>>>>>>>>   - Consistency of Side Effects: suspend / end with savepoint
>>>>>>>>>>>     (FLIP-34)
>>>>>>>>>>>   - Failed checkpoints explicitly aborted on TaskManagers (not
>>>>>>>>>>>     only on coordinator)
>>>>>>>>>>>
>>>>>>>>>>> *Automatic scaling (adjusting parallelism)*
>>>>>>>>>>>   - Reactive scaling
>>>>>>>>>>>   - Active scaling policies
>>>>>>>>>>>
>>>>>>>>>>> *Kubernetes Integration*
>>>>>>>>>>>   - Active Kubernetes Integration (Flink actively manages
>>>>>>>>>>>     containers)
>>>>>>>>>>>
>>>>>>>>>>> *SQL Ecosystem*
>>>>>>>>>>>   - Extended Metadata Stores / Catalog / Schema Registries
>>>>>>>>>>>     support
>>>>>>>>>>>   - DDL support
>>>>>>>>>>>   - Integration with Hive Ecosystem
>>>>>>>>>>>
>>>>>>>>>>> *Simpler Handling of Dependencies*
>>>>>>>>>>>   - Scala in the APIs, but not in the core (hide in separate
>>>>>>>>>>>     class loader)
>>>>>>>>>>>   - Hadoop-free by default
>>>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>>
>>>>>>> Jeff Zhang
>>>>>>>