Great job, Stephan!

Best,
Vino
Jamie Grier <jgr...@lyft.com> wrote on Wed, Feb 27, 2019 at 2:27 AM:

> This is awesome, Stephan! Thanks for doing this.
>
> -Jamie
>
>
> On Tue, Feb 26, 2019 at 9:29 AM Stephan Ewen <se...@apache.org> wrote:
>
>> Here is the pull request with a draft of the roadmap:
>> https://github.com/apache/flink-web/pull/178
>>
>> Best,
>> Stephan
>>
>> On Fri, Feb 22, 2019 at 5:18 AM Hequn Cheng <chenghe...@gmail.com> wrote:
>>
>>> Hi Stephan,
>>>
>>> Thanks for summarizing the great roadmap! It is very helpful for users
>>> and developers to track the direction of Flink.
>>> +1 for putting the roadmap on the website and updating it per release.
>>>
>>> Besides, it would be great if the roadmap could add the UpsertSource
>>> feature (maybe put it under 'Batch Streaming Unification').
>>> It was discussed a long time ago [1,2] and is moving forward step by step.
>>> Currently, Flink can only emit upsert results. With the UpsertSource, we
>>> can make our system a more complete one.
>>>
>>> Best, Hequn
>>>
>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-TABLE-How-to-handle-empty-delete-for-UpsertSource-td23856.html#a23874
>>> [2] https://issues.apache.org/jira/browse/FLINK-8545
>>>
>>>
>>> On Fri, Feb 22, 2019 at 3:31 AM Rong Rong <walter...@gmail.com> wrote:
>>>
>>>> Hi Stephan,
>>>>
>>>> Yes. I completely agree. Jincheng & Jark gave some very valuable
>>>> feedback and suggestions, and I think we can definitely move the
>>>> conversation forward to reach a more concrete doc first before we put it
>>>> into the roadmap. Thanks for reviewing it and driving the roadmap effort!
>>>>
>>>> --
>>>> Rong
>>>>
>>>> On Thu, Feb 21, 2019 at 8:50 AM Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>>> Hi Rong Rong!
>>>>>
>>>>> I would add the security / kerberos threads to the roadmap. They seem
>>>>> to be advanced enough in the discussions so that there is clarity about
>>>>> what will come.
>>>>>
>>>>> For the window operator with slicing, I would personally like to see
>>>>> the discussion advance and have some more clarity and consensus on the
>>>>> feature before adding it to the roadmap. Not having that in the first
>>>>> version of the roadmap does not mean there will be no activity. And when
>>>>> the discussion advances well in the next weeks, we can update the roadmap
>>>>> soon.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Best,
>>>>> Stephan
>>>>>
>>>>>
>>>>> On Thu, Feb 14, 2019 at 5:46 PM Rong Rong <walter...@gmail.com> wrote:
>>>>>
>>>>>> Hi Stephan,
>>>>>>
>>>>>> Thanks for the clarification. Yes, I think these issues have already
>>>>>> been discussed in previous mailing list threads [1,2,3].
>>>>>>
>>>>>> I also agree that updating the "official" roadmap every release is a
>>>>>> very good idea to avoid frequent updates.
>>>>>> One question I might be a bit confused about: are we suggesting to
>>>>>> keep one roadmap on the documentation site (e.g. [4]) per release, or
>>>>>> simply just one most up-to-date roadmap on the main website [5]?
>>>>>> Just like the release notes in every release, I assume the former would
>>>>>> also provide a good tracker for users to look back at previous roadmaps.
>>>>>>
>>>>>> Thanks,
>>>>>> Rong
>>>>>>
>>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>>>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>>>>> [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>>>>
>>>>>> [4] https://ci.apache.org/projects/flink/flink-docs-release-1.7/
>>>>>> [5] https://flink.apache.org/
>>>>>>
>>>>>> On Thu, Feb 14, 2019 at 2:26 AM Stephan Ewen <se...@apache.org> wrote:
>>>>>>
>>>>>>> I think the website is better as well.
>>>>>>>
>>>>>>> I agree with Fabian that the wiki is not so visible, and visibility
>>>>>>> is the main motivation.
>>>>>>> This type of roadmap overview would not be updated by everyone -
>>>>>>> letting committers update the roadmap means the listed threads are
>>>>>>> actually happening at the moment.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 14, 2019 at 11:14 AM Fabian Hueske <fhue...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I like the idea of putting the roadmap on the website because it is
>>>>>>>> much more visible (and IMO more credible, obligatory) there.
>>>>>>>> However, I share the concerns about frequent updates.
>>>>>>>>
>>>>>>>> I think it would be great to update the "official" roadmap on the
>>>>>>>> website once per release (excluding bugfix releases), i.e., every
>>>>>>>> three months.
>>>>>>>> We can use the wiki to collect and draft the roadmap for the next
>>>>>>>> update.
>>>>>>>>
>>>>>>>> Best, Fabian
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Feb 14, 2019 at 11:03 AM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Stephan,
>>>>>>>>>
>>>>>>>>> Thanks for this proposal. It is a good idea to track the roadmap.
>>>>>>>>> One suggestion is that it might be better to put it into a wiki page
>>>>>>>>> first, because it is easier to update the roadmap on the wiki than on
>>>>>>>>> the Flink web site. And I guess we may need to update the roadmap very
>>>>>>>>> often at the beginning, as there are so many discussions and proposals
>>>>>>>>> in the community recently. We can move it to the Flink web site later
>>>>>>>>> when we feel it can be nailed down.
>>>>>>>>>
>>>>>>>>> Stephan Ewen <se...@apache.org> wrote on Thu, Feb 14, 2019 at 5:44 PM:
>>>>>>>>>
>>>>>>>>>> Thanks Jincheng and Rong Rong!
>>>>>>>>>>
>>>>>>>>>> I am not deciding a roadmap and making a call on what features
>>>>>>>>>> should be developed or not.
>>>>>>>>>> I was only collecting broader issues that are
>>>>>>>>>> already happening or have an active FLIP/design discussion plus
>>>>>>>>>> committer support.
>>>>>>>>>>
>>>>>>>>>> Do we have that for the suggested issues as well? If yes, we can
>>>>>>>>>> add them (can you point me to the issue/mail-thread?); if not, let's
>>>>>>>>>> try to move the discussion forward and add them to the roadmap
>>>>>>>>>> overview then.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Stephan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 13, 2019 at 6:47 PM Rong Rong <walter...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Stephan for the great proposal.
>>>>>>>>>>>
>>>>>>>>>>> This would not only be beneficial for new users but also for
>>>>>>>>>>> contributors to keep track of all upcoming features.
>>>>>>>>>>>
>>>>>>>>>>> I think that better window operator support could also be
>>>>>>>>>>> grouped separately into its own category, as it affects both the
>>>>>>>>>>> future DataStream API and batch stream unification.
>>>>>>>>>>> Can we also include:
>>>>>>>>>>> - OVER aggregate for DataStream API separately as @jincheng
>>>>>>>>>>>   suggested.
>>>>>>>>>>> - Improving sliding window operator [1]
>>>>>>>>>>>
>>>>>>>>>>> One additional suggestion: can we also include a more
>>>>>>>>>>> extendable security module [2,3] that @shuyi and I are currently
>>>>>>>>>>> working on?
>>>>>>>>>>> This will significantly improve the usability of Flink in
>>>>>>>>>>> corporate environments where proprietary or 3rd-party security
>>>>>>>>>>> integration is needed.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Rong
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>>>>>>>>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>>>>>>>>>> [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <sunjincheng...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Very excited and thank you for launching such a great
>>>>>>>>>>>> discussion, Stephan!
>>>>>>>>>>>>
>>>>>>>>>>>> Here is just a small suggestion: in the Batch Streaming
>>>>>>>>>>>> Unification section, do we need to add an item:
>>>>>>>>>>>>
>>>>>>>>>>>> - Same window operators on bounded/unbounded Table API and
>>>>>>>>>>>>   DataStream API
>>>>>>>>>>>>   (currently the OVER window only exists in SQL/Table API; the
>>>>>>>>>>>>   DataStream API does not yet support it)
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Jincheng
>>>>>>>>>>>>
>>>>>>>>>>>> Stephan Ewen <se...@apache.org> wrote on Wed, Feb 13, 2019 at 7:21 PM:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Recently several contributors, committers, and users asked
>>>>>>>>>>>>> about making it more visible in which way the project is
>>>>>>>>>>>>> currently going.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Users and developers can track the direction by following the
>>>>>>>>>>>>> discussion threads and JIRA, but due to the mass of discussions
>>>>>>>>>>>>> and open issues, it is very hard to get a good overall picture.
>>>>>>>>>>>>> Especially for new users and contributors, it is very hard to
>>>>>>>>>>>>> get a quick overview of the project direction.
>>>>>>>>>>>>>
>>>>>>>>>>>>> To fix this, I suggest adding a brief roadmap summary to the
>>>>>>>>>>>>> homepage. It is a bit of a commitment to keep that roadmap up to
>>>>>>>>>>>>> date, but I think the benefit for users justifies that.
>>>>>>>>>>>>> The Apache Beam project has added such a roadmap [1], which was
>>>>>>>>>>>>> received very well by the community. I would suggest following a
>>>>>>>>>>>>> similar structure here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If the community is in favor of this, I would volunteer to
>>>>>>>>>>>>> write a first version of such a roadmap. The points I would
>>>>>>>>>>>>> include are below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://beam.apache.org/roadmap/
>>>>>>>>>>>>>
>>>>>>>>>>>>> ========================================================
>>>>>>>>>>>>>
>>>>>>>>>>>>> Disclaimer: Apache Flink is not governed or steered by any one
>>>>>>>>>>>>> single entity, but by its community and Project Management
>>>>>>>>>>>>> Committee (PMC). This is not an authoritative roadmap in the sense
>>>>>>>>>>>>> of a plan with a specific timeline. Instead, we share our vision
>>>>>>>>>>>>> for the future and major initiatives that are receiving attention
>>>>>>>>>>>>> and give users and contributors an understanding of what they can
>>>>>>>>>>>>> look forward to.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Future Role of Table API and DataStream API*
>>>>>>>>>>>>>   - Table API becomes first class citizen
>>>>>>>>>>>>>   - Table API becomes primary API for analytics use cases
>>>>>>>>>>>>>       * Declarative, automatic optimizations
>>>>>>>>>>>>>       * No manual control over state and timers
>>>>>>>>>>>>>   - DataStream API becomes primary API for applications and
>>>>>>>>>>>>>     data pipeline use cases
>>>>>>>>>>>>>       * Physical, user controls data types, no magic or optimizer
>>>>>>>>>>>>>       * Explicit control over state and time
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Batch Streaming Unification*
>>>>>>>>>>>>>   - Table API unification (environments) (FLIP-32)
>>>>>>>>>>>>>   - New unified source interface (FLIP-27)
>>>>>>>>>>>>>   - Runtime operator unification & code reuse between
>>>>>>>>>>>>>     DataStream / Table
>>>>>>>>>>>>>   - Extending Table API to make it convenient API for all
>>>>>>>>>>>>>     analytical use cases (easier mix in of UDFs)
>>>>>>>>>>>>>   - Same join operators on bounded/unbounded Table API and
>>>>>>>>>>>>>     DataStream API
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Faster Batch (Bounded Streams)*
>>>>>>>>>>>>>   - Much of this comes via Blink contribution/merging
>>>>>>>>>>>>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>>>>>>>>>>>>   - Batch Scheduling on bounded data (Table API)
>>>>>>>>>>>>>   - External Shuffle Services Support on bounded streams
>>>>>>>>>>>>>   - Caching of intermediate results on bounded data (Table API)
>>>>>>>>>>>>>   - Extending DataStream API to explicitly model bounded
>>>>>>>>>>>>>     streams (API breaking)
>>>>>>>>>>>>>   - Add fine fault tolerance, scheduling, caching also to
>>>>>>>>>>>>>     DataStream API
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Streaming State Evolution*
>>>>>>>>>>>>>   - Let all built-in serializers support stable evolution
>>>>>>>>>>>>>   - First class support for other evolvable formats (Protobuf,
>>>>>>>>>>>>>     Thrift)
>>>>>>>>>>>>>   - Savepoint input/output format to modify / adjust savepoints
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Simpler Event Time Handling*
>>>>>>>>>>>>>   - Event Time Alignment in Sources
>>>>>>>>>>>>>   - Simpler out-of-the-box support in sources
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Checkpointing*
>>>>>>>>>>>>>   - Consistency of Side Effects: suspend / end with savepoint
>>>>>>>>>>>>>     (FLIP-34)
>>>>>>>>>>>>>   - Failed checkpoints explicitly aborted on TaskManagers (not
>>>>>>>>>>>>>     only on coordinator)
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Automatic scaling (adjusting parallelism)*
>>>>>>>>>>>>>   - Reactive scaling
>>>>>>>>>>>>>   - Active scaling policies
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Kubernetes Integration*
>>>>>>>>>>>>>   - Active Kubernetes Integration (Flink actively manages
>>>>>>>>>>>>>     containers)
>>>>>>>>>>>>>
>>>>>>>>>>>>> *SQL Ecosystem*
>>>>>>>>>>>>>   - Extended Metadata Stores / Catalog / Schema Registries
>>>>>>>>>>>>>     support
>>>>>>>>>>>>>   - DDL support
>>>>>>>>>>>>>   - Integration with Hive Ecosystem
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Simpler Handling of Dependencies*
>>>>>>>>>>>>>   - Scala in the APIs, but not in the core (hide in separate
>>>>>>>>>>>>>     class loader)
>>>>>>>>>>>>>   - Hadoop-free by default
>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards
>>>>>>>>>
>>>>>>>>> Jeff Zhang
>>>>>>>>>
>>>>>>>>