Thanks Jincheng and Rong Rong! I am not deciding on a roadmap or making a call on which features should be developed. I was only collecting broader issues that are already happening or have an active FLIP/design discussion plus committer support.
Do we have that for the suggested issues as well? If yes, we can add them (can you point me to the issue/mail thread?). If not, let's try to move the discussion forward and add them to the roadmap overview then.

Best,
Stephan

On Wed, Feb 13, 2019 at 6:47 PM Rong Rong <walter...@gmail.com> wrote:

> Thanks Stephan for the great proposal.
>
> This would not only be beneficial for new users but also for contributors
> to keep track of all upcoming features.
>
> I think that better window operator support can also be separately grouped
> into its own category, as it affects both the future DataStream API and batch
> stream unification.
> Can we also include:
> - OVER aggregate for DataStream API separately as @jincheng suggested.
> - Improving sliding window operator [1]
>
> One more additional suggestion: can we also include a more extendable
> security module [2,3] that @shuyi and I are currently working on?
> This will significantly improve the usability of Flink in corporate
> environments where proprietary or 3rd-party security integration is needed.
>
> Thanks,
> Rong
>
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
> [2]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
> [3]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>
> On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <sunjincheng...@gmail.com>
> wrote:
>
>> Very excited and thank you for launching such a great discussion, Stephan!
>>
>> Just one small suggestion: in the Batch Streaming Unification
>> section, should we add an item:
>>
>> - Same window operators on bounded/unbounded Table API and DataStream API
>> (currently the OVER window only exists in SQL/Table API; the DataStream API
>> does not yet support it)
>>
>> Best,
>> Jincheng
>>
>> Stephan Ewen <se...@apache.org> wrote on Wed, Feb 13, 2019 at 7:21 PM:
>>
>>> Hi all!
>>>
>>> Recently several contributors, committers, and users asked about making
>>> it more visible in which way the project is currently going.
>>>
>>> Users and developers can track the direction by following the discussion
>>> threads and JIRA, but due to the mass of discussions and open issues, it is
>>> very hard to get a good overall picture.
>>> Especially for new users and contributors, it is very hard to get a
>>> quick overview of the project direction.
>>>
>>> To fix this, I suggest adding a brief roadmap summary to the homepage.
>>> It is a bit of a commitment to keep that roadmap up to date, but I think
>>> the benefit for users justifies that.
>>> The Apache Beam project has added such a roadmap [1], which was received
>>> very well by the community. I would suggest following a similar structure
>>> here.
>>>
>>> If the community is in favor of this, I would volunteer to write a first
>>> version of such a roadmap. The points I would include are below.
>>>
>>> Best,
>>> Stephan
>>>
>>> [1] https://beam.apache.org/roadmap/
>>>
>>> ========================================================
>>>
>>> Disclaimer: Apache Flink is not governed or steered by any one single
>>> entity, but by its community and Project Management Committee (PMC). This
>>> is not an authoritative roadmap in the sense of a plan with a specific
>>> timeline. Instead, we share our vision for the future and major initiatives
>>> that are receiving attention, and give users and contributors an
>>> understanding of what they can look forward to.
>>>
>>> *Future Role of Table API and DataStream API*
>>>   - Table API becomes first class citizen
>>>   - Table API becomes primary API for analytics use cases
>>>       * Declarative, automatic optimizations
>>>       * No manual control over state and timers
>>>   - DataStream API becomes primary API for applications and data pipeline use cases
>>>       * Physical, user controls data types, no magic or optimizer
>>>       * Explicit control over state and time
>>>
>>> *Batch Streaming Unification*
>>>   - Table API unification (environments) (FLIP-32)
>>>   - New unified source interface (FLIP-27)
>>>   - Runtime operator unification & code reuse between DataStream / Table
>>>   - Extending Table API to make it convenient API for all analytical use cases (easier mix in of UDFs)
>>>   - Same join operators on bounded/unbounded Table API and DataStream API
>>>
>>> *Faster Batch (Bounded Streams)*
>>>   - Much of this comes via Blink contribution/merging
>>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>>   - Batch Scheduling on bounded data (Table API)
>>>   - External Shuffle Services Support on bounded streams
>>>   - Caching of intermediate results on bounded data (Table API)
>>>   - Extending DataStream API to explicitly model bounded streams (API breaking)
>>>   - Add fine fault tolerance, scheduling, caching also to DataStream API
>>>
>>> *Streaming State Evolution*
>>>   - Let all built-in serializers support stable evolution
>>>   - First class support for other evolvable formats (Protobuf, Thrift)
>>>   - Savepoint input/output format to modify / adjust savepoints
>>>
>>> *Simpler Event Time Handling*
>>>   - Event Time Alignment in Sources
>>>   - Simpler out-of-the-box support in sources
>>>
>>> *Checkpointing*
>>>   - Consistency of Side Effects: suspend / end with savepoint (FLIP-34)
>>>   - Failed checkpoints explicitly aborted on TaskManagers (not only on coordinator)
>>>
>>> *Automatic scaling (adjusting parallelism)*
>>>   - Reactive scaling
>>>   - Active scaling policies
>>>
>>> *Kubernetes Integration*
>>>   - Active Kubernetes Integration (Flink actively manages containers)
>>>
>>> *SQL Ecosystem*
>>>   - Extended Metadata Stores / Catalog / Schema Registries support
>>>   - DDL support
>>>   - Integration with Hive Ecosystem
>>>
>>> *Simpler Handling of Dependencies*
>>>   - Scala in the APIs, but not in the core (hide in separate class loader)
>>>   - Hadoop-free by default