Re: [DISCUSS] Adding a mid-term roadmap to the Flink website

Stephan Ewen Thu, 14 Feb 2019 01:45:13 -0800

Thanks Jincheng and Rong Rong!

I am not deciding a roadmap or making a call on which features should be
developed. I was only collecting broader issues that are already
happening or that have an active FLIP/design discussion plus committer support.


Do we have that for the suggested issues as well? If yes, we can add them
(can you point me to the issue/mail thread?). If not, let's try to move the
discussion forward and add them to the roadmap overview then.

Best,
Stephan


On Wed, Feb 13, 2019 at 6:47 PM Rong Rong <walter...@gmail.com> wrote:

> Thanks Stephan for the great proposal.
>
> This would be beneficial not only for new users but also for contributors
> who want to keep track of all upcoming features.
>
> I think that better window operator support could also be grouped
> separately into its own category, as it affects both the future DataStream
> API and batch/stream unification.
> Can we also include:
> - OVER aggregate for DataStream API separately as @jincheng suggested.
> - Improving sliding window operator [1]
>
> One more suggestion: can we also include the more extensible security
> module [2,3] that @shuyi and I are currently working on?
> This would significantly improve the usability of Flink in corporate
> environments where proprietary or 3rd-party security integration is needed.
>
> Thanks,
> Rong
>
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
> [2]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
> [3]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>
>
>
>
> On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <sunjincheng...@gmail.com>
> wrote:
>
>> Very excited, and thank you for launching such a great discussion,
>> Stephan!
>>
>> Just one small suggestion: in the Batch Streaming Unification section,
>> do we need to add an item:
>>
>> - Same window operators on bounded/unbounded Table API and DataStream API
>> (currently the OVER window exists only in SQL/Table API; the DataStream
>> API does not yet support it)
>>
>> Best,
>> Jincheng
>>
>> On Wed, Feb 13, 2019 at 7:21 PM Stephan Ewen <se...@apache.org> wrote:
>>
>>> Hi all!
>>>
>>> Recently several contributors, committers, and users asked about making
>>> the direction in which the project is currently going more visible.
>>>
>>> Users and developers can track the direction by following the discussion
>>> threads and JIRA, but due to the mass of discussions and open issues, it is
>>> very hard to get a good overall picture.
>>> Especially for new users and contributors, it is very hard to get a
>>> quick overview of the project direction.
>>>
>>> To fix this, I suggest adding a brief roadmap summary to the homepage.
>>> It is a bit of a commitment to keep that roadmap up to date, but I think
>>> the benefit for users justifies that.
>>> The Apache Beam project has added such a roadmap [1], which was received
>>> very well by the community. I would suggest following a similar structure
>>> here.
>>>
>>> If the community is in favor of this, I would volunteer to write a first
>>> version of such a roadmap. The points I would include are below.
>>>
>>> Best,
>>> Stephan
>>>
>>> [1] https://beam.apache.org/roadmap/
>>>
>>> ========================================================
>>>
>>> Disclaimer: Apache Flink is not governed or steered by any one single
>>> entity, but by its community and Project Management Committee (PMC). This
>>> is not an authoritative roadmap in the sense of a plan with a specific
>>> timeline. Instead, we share our vision for the future and major initiatives
>>> that are receiving attention and give users and contributors an
>>> understanding of what they can look forward to.
>>>
>>> *Future Role of Table API and DataStream API*
>>>   - Table API becomes a first-class citizen
>>>   - Table API becomes primary API for analytics use cases
>>>       * Declarative, automatic optimizations
>>>       * No manual control over state and timers
>>>   - DataStream API becomes primary API for applications and data
>>> pipeline use cases
>>>       * Physical, user controls data types, no magic or optimizer
>>>       * Explicit control over state and time
>>>
>>> *Batch Streaming Unification*
>>>   - Table API unification (environments) (FLIP-32)
>>>   - New unified source interface (FLIP-27)
>>>   - Runtime operator unification & code reuse between DataStream / Table
>>>   - Extending Table API to make it a convenient API for all analytical
>>> use cases (easier mix-in of UDFs)
>>>   - Same join operators on bounded/unbounded Table API and DataStream API
>>>
>>> *Faster Batch (Bounded Streams)*
>>>   - Much of this comes via Blink contribution/merging
>>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>>   - Batch Scheduling on bounded data (Table API)
>>>   - External Shuffle Services Support on bounded streams
>>>   - Caching of intermediate results on bounded data (Table API)
>>>   - Extending DataStream API to explicitly model bounded streams (API
>>> breaking)
>>>   - Add fine-grained fault tolerance, scheduling, caching also to
>>> DataStream API
>>>
>>> *Streaming State Evolution*
>>>   - Let all built-in serializers support stable evolution
>>>   - First-class support for other evolvable formats (Protobuf, Thrift)
>>>   - Savepoint input/output format to modify / adjust savepoints
>>>
>>> *Simpler Event Time Handling*
>>>   - Event Time Alignment in Sources
>>>   - Simpler out-of-the-box support in sources
>>>
>>> *Checkpointing*
>>>   - Consistency of Side Effects: suspend / end with savepoint (FLIP-34)
>>>   - Failed checkpoints explicitly aborted on TaskManagers (not only on
>>> coordinator)
>>>
>>> *Automatic scaling (adjusting parallelism)*
>>>   - Reactive scaling
>>>   - Active scaling policies
>>>
>>> *Kubernetes Integration*
>>>   - Active Kubernetes Integration (Flink actively manages containers)
>>>
>>> *SQL Ecosystem*
>>>   - Extended Metadata Stores / Catalog / Schema Registries support
>>>   - DDL support
>>>   - Integration with Hive Ecosystem
>>>
>>> *Simpler Handling of Dependencies*
>>>   - Scala in the APIs, but not in the core (hide in separate class
>>> loader)
>>>   - Hadoop-free by default
>>>
>>>
