Re: [DISCUSS] Adding a mid-term roadmap to the Flink website

Stephan Ewen Tue, 26 Feb 2019 09:29:54 -0800

Here is the pull request with a draft of the roadmap:
https://github.com/apache/flink-web/pull/178


Best,
Stephan

On Fri, Feb 22, 2019 at 5:18 AM Hequn Cheng <chenghe...@gmail.com> wrote:

> Hi Stephan,
>
> Thanks for summarizing the great roadmap! It is very helpful for users and
> developers to track the direction of Flink.
> +1 for putting the roadmap on the website and updating it with each release.
>
> Besides, it would be great if the roadmap could include the UpsertSource
> feature (maybe under 'Batch Streaming Unification').
> It was discussed a long time ago [1,2] and is moving forward step by
> step.
> Currently, Flink can only emit upsert results. With the UpsertSource, we
> can make our system more complete.
>
> Best, Hequn
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-TABLE-How-to-handle-empty-delete-for-UpsertSource-td23856.html#a23874
> [2] https://issues.apache.org/jira/browse/FLINK-8545
>
>
>
> On Fri, Feb 22, 2019 at 3:31 AM Rong Rong <walter...@gmail.com> wrote:
>
>> Hi Stephan,
>>
>> Yes, I completely agree. Jincheng & Jark gave some very valuable
>> feedback and suggestions, and I think we can definitely move the
>> conversation forward to reach a more concrete doc first, before we put it
>> into the roadmap. Thanks for reviewing it and driving the roadmap effort!
>>
>> --
>> Rong
>>
>> On Thu, Feb 21, 2019 at 8:50 AM Stephan Ewen <se...@apache.org> wrote:
>>
>>> Hi Rong Rong!
>>>
>>> I would add the security / Kerberos threads to the roadmap. They seem to
>>> be advanced enough in the discussions that it is clear what will
>>> come.
>>>
>>> For the window operator with slicing, I would personally like to see the
>>> discussion advance and have some more clarity and consensus on the feature
>>> before adding it to the roadmap. Not having that in the first version of
>>> the roadmap does not mean there will be no activity. And if the
>>> discussion advances well in the next few weeks, we can update the roadmap soon.
>>>
>>> What do you think?
>>>
>>> Best,
>>> Stephan
>>>
>>>
>>> On Thu, Feb 14, 2019 at 5:46 PM Rong Rong <walter...@gmail.com> wrote:
>>>
>>>> Hi Stephan,
>>>>
>>>> Thanks for the clarification. Yes, I think these issues have already been
>>>> discussed in previous mailing list threads [1,2,3].
>>>>
>>>> I also agree that updating the "official" roadmap once per release is a
>>>> very good way to avoid frequent updates.
>>>> One thing I am a bit confused about: are we suggesting keeping
>>>> one roadmap on the documentation site (e.g. [4]) per release, or
>>>> simply one up-to-date roadmap on the main website [5]?
>>>> Like the release notes for every release, the former would probably
>>>> also give users a good way to look back at previous roadmaps, I
>>>> assume.
>>>>
>>>> Thanks,
>>>> Rong
>>>>
>>>> [1]
>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>>> [2]
>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>>> [3]
>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>>
>>>> [4] https://ci.apache.org/projects/flink/flink-docs-release-1.7/
>>>> [5] https://flink.apache.org/
>>>>
>>>> On Thu, Feb 14, 2019 at 2:26 AM Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>>> I think the website is better as well.
>>>>>
>>>>> I agree with Fabian that the wiki is not so visible, and visibility is
>>>>> the main motivation.
>>>>> This type of roadmap overview would not be updated by everyone -
>>>>> letting committers update the roadmap means the listed threads are
>>>>> actually happening at the moment.
>>>>>
>>>>>
>>>>> On Thu, Feb 14, 2019 at 11:14 AM Fabian Hueske <fhue...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I like the idea of putting the roadmap on the website because it is
>>>>>> much more visible (and IMO more credible, obligatory) there.
>>>>>> However, I share the concerns about frequent updates.
>>>>>>
>>>>>> I think it would be great to update the "official" roadmap on the
>>>>>> website once per release (excluding bugfix releases), i.e., every three months.
>>>>>> We can use the wiki to collect and draft the roadmap for the next
>>>>>> update.
>>>>>>
>>>>>> Best, Fabian
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 14, 2019 at 11:03 AM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Stephan,
>>>>>>>
>>>>>>> Thanks for this proposal. It is a good idea to track the roadmap.
>>>>>>> One suggestion: it might be better to put it on a wiki page first,
>>>>>>> because it is easier to update the roadmap on the wiki than on the
>>>>>>> Flink website. I guess we may need to update the roadmap very often
>>>>>>> at the beginning, as there are so many discussions and proposals in
>>>>>>> the community recently. We can move it to the Flink website later,
>>>>>>> when we feel it is nailed down.
>>>>>>>
>>>>>>> On Thu, Feb 14, 2019 at 5:44 PM Stephan Ewen <se...@apache.org> wrote:
>>>>>>>
>>>>>>>> Thanks Jincheng and Rong Rong!
>>>>>>>>
>>>>>>>> I am not deciding a roadmap or making a call on which features
>>>>>>>> should be developed. I was only collecting broader issues that
>>>>>>>> are already happening or have an active FLIP/design discussion plus
>>>>>>>> committer support.
>>>>>>>>
>>>>>>>> Do we have that for the suggested issues as well? If yes, we can
>>>>>>>> add them (can you point me to the issue/mail thread?); if not, let's
>>>>>>>> try to move the discussion forward and add them to the roadmap
>>>>>>>> overview then.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Stephan
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 13, 2019 at 6:47 PM Rong Rong <walter...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks Stephan for the great proposal.
>>>>>>>>>
>>>>>>>>> This would be beneficial not only for new users but also for
>>>>>>>>> contributors who want to keep track of all upcoming features.
>>>>>>>>>
>>>>>>>>> I think that better window operator support could also be grouped
>>>>>>>>> into its own category, as it affects both the future DataStream
>>>>>>>>> API and batch/stream unification.
>>>>>>>>> Can we also include:
>>>>>>>>> - OVER aggregate for DataStream API separately, as @jincheng
>>>>>>>>> suggested.
>>>>>>>>> - Improving sliding window operator [1]
>>>>>>>>>
>>>>>>>>> One more suggestion: can we also include the more extensible
>>>>>>>>> security module [2,3] that @shuyi and I are currently working on?
>>>>>>>>> This would significantly improve the usability of Flink in
>>>>>>>>> corporate environments where proprietary or 3rd-party security
>>>>>>>>> integration is needed.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Rong
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>>>>>>>> [2]
>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>>>>>>>> [3]
>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <
>>>>>>>>> sunjincheng...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Very excited, and thank you for launching such a great discussion,
>>>>>>>>>> Stephan!
>>>>>>>>>>
>>>>>>>>>> Just one small suggestion: in the Batch Streaming
>>>>>>>>>> Unification section, should we add an item:
>>>>>>>>>>
>>>>>>>>>> - Same window operators on bounded/unbounded Table API and
>>>>>>>>>> DataStream API
>>>>>>>>>> (currently the OVER window exists only in SQL/Table API; the DataStream
>>>>>>>>>> API does not yet support it)
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jincheng
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 13, 2019 at 7:21 PM Stephan Ewen <se...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all!
>>>>>>>>>>>
>>>>>>>>>>> Recently several contributors, committers, and users asked about
>>>>>>>>>>> making it more visible which direction the project is currently going.
>>>>>>>>>>>
>>>>>>>>>>> Users and developers can track the direction by following the
>>>>>>>>>>> discussion threads and JIRA, but due to the mass of discussions and
>>>>>>>>>>> open issues, it is very hard to get a good overall picture.
>>>>>>>>>>> Especially for new users and contributors, it is very hard to
>>>>>>>>>>> get a quick overview of the project direction.
>>>>>>>>>>>
>>>>>>>>>>> To fix this, I suggest adding a brief roadmap summary to the
>>>>>>>>>>> homepage. It is a bit of a commitment to keep that roadmap up to
>>>>>>>>>>> date, but I think the benefit for users justifies that.
>>>>>>>>>>> The Apache Beam project has added such a roadmap [1], which was
>>>>>>>>>>> received very well by the community. I would suggest following a
>>>>>>>>>>> similar structure here.
>>>>>>>>>>>
>>>>>>>>>>> If the community is in favor of this, I would volunteer to write
>>>>>>>>>>> a first version of such a roadmap. The points I would include are
>>>>>>>>>>> below.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Stephan
>>>>>>>>>>>
>>>>>>>>>>> [1] https://beam.apache.org/roadmap/
>>>>>>>>>>>
>>>>>>>>>>> ========================================================
>>>>>>>>>>>
>>>>>>>>>>> Disclaimer: Apache Flink is not governed or steered by any one
>>>>>>>>>>> single entity, but by its community and Project Management
>>>>>>>>>>> Committee (PMC). This is not an authoritative roadmap in the sense
>>>>>>>>>>> of a plan with a specific timeline. Instead, we share our vision
>>>>>>>>>>> for the future and major initiatives that are receiving attention,
>>>>>>>>>>> to give users and contributors an understanding of what they can
>>>>>>>>>>> look forward to.
>>>>>>>>>>>
>>>>>>>>>>> *Future Role of Table API and DataStream API*
>>>>>>>>>>>   - Table API becomes first class citizen
>>>>>>>>>>>   - Table API becomes primary API for analytics use cases
>>>>>>>>>>>       * Declarative, automatic optimizations
>>>>>>>>>>>       * No manual control over state and timers
>>>>>>>>>>>   - DataStream API becomes primary API for applications and data
>>>>>>>>>>>     pipeline use cases
>>>>>>>>>>>       * Physical, user controls data types, no magic or optimizer
>>>>>>>>>>>       * Explicit control over state and time
>>>>>>>>>>>
>>>>>>>>>>> *Batch Streaming Unification*
>>>>>>>>>>>   - Table API unification (environments) (FLIP-32)
>>>>>>>>>>>   - New unified source interface (FLIP-27)
>>>>>>>>>>>   - Runtime operator unification & code reuse between DataStream
>>>>>>>>>>>     / Table
>>>>>>>>>>>   - Extending Table API to make it a convenient API for all
>>>>>>>>>>>     analytical use cases (easier mix-in of UDFs)
>>>>>>>>>>>   - Same join operators on bounded/unbounded Table API and
>>>>>>>>>>>     DataStream API
>>>>>>>>>>>
>>>>>>>>>>> *Faster Batch (Bounded Streams)*
>>>>>>>>>>>   - Much of this comes via Blink contribution/merging
>>>>>>>>>>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>>>>>>>>>>   - Batch Scheduling on bounded data (Table API)
>>>>>>>>>>>   - External Shuffle Services Support on bounded streams
>>>>>>>>>>>   - Caching of intermediate results on bounded data (Table API)
>>>>>>>>>>>   - Extending DataStream API to explicitly model bounded streams
>>>>>>>>>>>     (API breaking)
>>>>>>>>>>>   - Add fine-grained fault tolerance, scheduling, caching also to
>>>>>>>>>>>     DataStream API
>>>>>>>>>>>
>>>>>>>>>>> *Streaming State Evolution*
>>>>>>>>>>>   - Let all built-in serializers support stable evolution
>>>>>>>>>>>   - First class support for other evolvable formats (Protobuf,
>>>>>>>>>>>     Thrift)
>>>>>>>>>>>   - Savepoint input/output format to modify / adjust savepoints
>>>>>>>>>>>
>>>>>>>>>>> *Simpler Event Time Handling*
>>>>>>>>>>>   - Event Time Alignment in Sources
>>>>>>>>>>>   - Simpler out-of-the-box support in sources
>>>>>>>>>>>
>>>>>>>>>>> *Checkpointing*
>>>>>>>>>>>   - Consistency of Side Effects: suspend / end with savepoint
>>>>>>>>>>>     (FLIP-34)
>>>>>>>>>>>   - Failed checkpoints explicitly aborted on TaskManagers (not
>>>>>>>>>>>     only on coordinator)
>>>>>>>>>>>
>>>>>>>>>>> *Automatic scaling (adjusting parallelism)*
>>>>>>>>>>>   - Reactive scaling
>>>>>>>>>>>   - Active scaling policies
>>>>>>>>>>>
>>>>>>>>>>> *Kubernetes Integration*
>>>>>>>>>>>   - Active Kubernetes Integration (Flink actively manages
>>>>>>>>>>>     containers)
>>>>>>>>>>>
>>>>>>>>>>> *SQL Ecosystem*
>>>>>>>>>>>   - Extended Metadata Stores / Catalog / Schema Registries
>>>>>>>>>>>     support
>>>>>>>>>>>   - DDL support
>>>>>>>>>>>   - Integration with Hive Ecosystem
>>>>>>>>>>>
>>>>>>>>>>> *Simpler Handling of Dependencies*
>>>>>>>>>>>   - Scala in the APIs, but not in the core (hide in separate
>>>>>>>>>>>     class loader)
>>>>>>>>>>>   - Hadoop-free by default
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>>
>>>>>>> Jeff Zhang
>>>>>>>
>>>>>>
