Thanks for writing this up. This also reflects my understanding.

I think a blog post would be nice, ideally with an explicit call for
feedback so we learn about user concerns.
A blog post has a lot more reach than an ML thread.

Best,
Stephan


On Wed, Jun 23, 2021 at 12:23 PM Timo Walther <twal...@apache.org> wrote:

> Hi everyone,
>
> I'm sending this email to make sure everyone is on the same page about
> slowly deprecating the DataSet API.
>
> There have been a few thoughts mentioned in presentations, offline
> discussions, and JIRA issues. However, I have observed that there are
> still some concerns or different opinions on what steps are necessary to
> implement this change.
>
> Let me summarize some of the steps and assumptions and let's have a
> discussion about it:
>
> Step 1: Introduce a batch mode for Table API (FLIP-32)
> [DONE in 1.9]
>
> Step 2: Introduce a batch mode for DataStream API (FLIP-134)
> [DONE in 1.12]
>
> Step 3: Soft deprecate DataSet API (FLIP-131)
> [DONE in 1.12]
>
> We updated the documentation recently to make this deprecation even more
> visible. There is a dedicated `(Legacy)` label right next to the menu
> item now.
>
> For now, we won't mark concrete classes of the API with a @Deprecated
> annotation, to avoid extensive warnings in logs.
>
> Step 4: Drop the legacy SQL connectors and formats (FLINK-14437)
> [DONE in 1.14]
>
> We dropped code for ORC, Parquet, and HBase formats that were only used
> by DataSet API users. The removed classes had no documentation and were
> not annotated with one of our API stability annotations.
>
> The old functionality should be available through the new sources and
> sinks for Table API and DataStream API. If not, we should bring them
> into a shape that they can be a full replacement.
>
> DataSet users are encouraged to either migrate to the newer APIs or
> stay on Flink 1.13. Alternatively, they can copy only the format's
> code to a newer Flink version. We aim to keep the core interfaces (i.e.
> InputFormat and OutputFormat) stable until the next major version.
>
> We will maintain/allow important contributions to dropped connectors in
> 1.13. So 1.13 could be considered as kind of a DataSet API LTS release.
>
> Step 5: Drop the legacy SQL planner (FLINK-14437)
> [DONE in 1.14]
>
> This included dropping support of DataSet API with SQL.
>
> Step 6: Connect both Table and DataStream API in batch mode (FLINK-20897)
> [PLANNED in 1.14]
>
> Step 7: Reach feature parity of Table API/DataStream API with DataSet API
> [PLANNED for 1.14++]
>
> We need to identify blockers when migrating from DataSet API to Table
> API/DataStream API. Here we need to establish a good feedback pipeline
> to include DataSet users in the roadmap planning.
>
> Step 8: Drop the Gelly library
>
> No concrete plan yet. At the latest, this would happen in the next
> major Flink version, aka Flink 2.0.
>
> Step 9: Drop DataSet API
>
> Planned for the next major Flink version aka Flink 2.0.
>
>
> Please let me know if this matches your thoughts. We can also convert
> this into a blog post or mention it in the next release notes.
>
> Regards,
> Timo
>
>
