[DISCUSS] Features for Apache Flink 1.9.0

Tzu-Li (Gordon) Tai Tue, 30 Apr 2019 22:15:48 -0700

Hi community,

Apache Flink 1.8.0 was released a few weeks ago, so naturally, it’s
time to start thinking about what we want to aim for in 1.9.0.


Kurt and I have collected some features that would be reasonable to
consider including in the next release, based on talking with various
people as well as observations from mailing list discussions and questions.

Note that having specific features listed here does not mean that no other
pull requests or topics will be reviewed. I am sure that there are other
ongoing efforts that we missed here and that will likely make it in as
improvements or new features in the next release. This thread is merely
meant to bootstrap the discussion for 1.9, as well as to give contributors
an idea of what the community is looking to focus on in the next couple of
weeks.

*Proposed features and focus*

In the previous major release, Apache Flink 1.8.0, the community had
prepared for some major Table & SQL additions from the Blink branch. With
this in mind, for the next release, it would be great to wind up those
efforts by merging in the Blink-based Table / SQL planner and runtime for
1.9.

Following Stephan’s previous thread [1] in the mailing list about features
in Blink, we should also start preparing for several of Blink’s other
enhancements for batch execution. These include resource optimization,
fine-grained failover, a pluggable shuffle service, adapting stream
operators for batch execution, as well as better integration with systems
commonly used in batch execution, such as Apache Hive.

Moreover, besides efforts related to the Blink merge, we would also like
to push forward some of the features most discussed and anticipated by the
community. Most of these have had discussions on the mailing lists that
span multiple releases, and are also frequently brought up at community
events such as Flink Forward. These include source event-time alignment
and the source interface rework, a savepoint connector that allows users
to manipulate and query state in savepoints, interactive programming, as
well as terminating a job with a final savepoint.

Last but not least, we have several existing contributions or discussions
for the ecosystem surrounding Flink, which we think would also be very
valuable to try to merge in for 1.9. These include a web UI rework
(already merged recently), active K8s integration, a Google PubSub
connector, native support for the Protobuf format, Python support in the
Table API, as well as reworking Flink’s support for machine learning.

To wrap this up as a list of items, some of which already have JIRAs or
mailing list threads to track them:

   - Merge Blink runner for Table & SQL [2]
      - Restructure flink-table to separate API from core runtime
      - Make table planners pluggable
      - Rework Table / SQL type system to integrate better with the SQL
        standard [3]
      - Merge Blink planner and runtime for Table / SQL
   - Further preparations for more batch execution optimization from Blink
      - Dedicated scheduler component [4]
      - Fine grained failover for batch [5]
      - Selectable input stream operator [6]
      - Pluggable Shuffle Service [7]
      - FLIP-30: Unified Catalog API & Hive metastore integration [8]
   - Heavily anticipated / discussed features in the community
      - FLIP-27: Source interface rework [9]
      - Savepoint connector [10]
      - FLIP-34: Terminate / Suspend job with savepoint [11]
      - FLIP-36: Interactive Programming [12]
   - Ecosystem
      - Web UI rework [13]
      - Active K8s integration [14]
      - Google PubSub connector [15]
      - First-class Protobuf support [16]
      - FLIP-38: Python support in Table API [17]
      - FLIP-39: Flink ML pipeline and libraries on top of Table API [18]

*Suggested release timeline*

Apache Flink 1.8.0 was released earlier this month, so based on our usual
release schedule, we should aim to release 1.9.0 around mid to late
July.

Since it seems that this is going to be a fairly large release, to give the
community enough testing time, I propose that the feature freeze be near
the end of June (8-9 weeks from now, probably June 28). This is of course a
ballpark estimate for now; we should follow up with a separate thread
later in the release cycle to give contributors an official feature
freeze date.

I’d also like to use this opportunity to propose myself and Kurt as the
release managers for 1.9.
AFAIK, we have not had 2 RMs for a single release in the past, but
1.9.0 is definitely quite ambitious so it would not hurt to have one more
on board :)

Cheers,
Gordon

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-6-features-td22632.html

[2] https://issues.apache.org/jira/browse/FLINK-11439

[3] https://issues.apache.org/jira/browse/FLINK-12251

[4] https://issues.apache.org/jira/browse/FLINK-10429

[5]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Backtracking-for-failover-regions-td28293.html

[6]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Enhance-Operator-API-to-Support-Dynamically-Selective-Reading-and-EndOfInput-Event-td26753.html

[7] https://issues.apache.org/jira/browse/FLINK-10653

[8] https://issues.apache.org/jira/browse/FLINK-11275

[9]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952i20.html

[10] https://issues.apache.org/jira/browse/FLINK-12047

[11]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-33-Terminate-Suspend-Job-with-Savepoint-td26927.html

[12]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-36%3A+Support+Interactive+Programming+in+Flink

[13] https://issues.apache.org/jira/browse/FLINK-10705

[14] https://issues.apache.org/jira/browse/FLINK-9953

[15] https://issues.apache.org/jira/browse/FLINK-9311

[16] https://issues.apache.org/jira/browse/FLINK-11333

[17]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html

[18]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-39-Flink-ML-pipeline-and-ML-libs-td28633.html
