Hi devs,

I’d like to start a discussion about the current and future state of our
Flink Sink Connectors.


We currently have three sink implementations:

   1. FlinkSink [1]
   2. IcebergSink [2]
   3. DynamicSink [3]


FlinkSink [1] is the current and default implementation of the Flink Sink
Connector.


IcebergSink [2] is another implementation of the Flink Sink Connector, which
was introduced in https://github.com/apache/iceberg/pull/10179. It leverages
the latest SinkV2 interfaces in Flink and makes it possible to add cleanup
tasks by implementing the `PostCommitTopology` interface. Work is already in
progress to enable this functionality:
https://github.com/apache/iceberg/pull/12979


DynamicSink [3] was recently contributed in
https://github.com/apache/iceberg/pull/13304. It can write to any number of
tables, dynamically creating and updating tables as well as updating their
schemas and partition specs.


Currently, `IcebergSink` is marked as `@Experimental`, and it already offers
feature parity with `FlinkSink` (the missing RANGE distribution was
recently merged in https://github.com/apache/iceberg/pull/12071).


With https://github.com/apache/iceberg/pull/11244, users can choose which
sink implementation ([1] or [2]) to use for Flink SQL.


With all this said, we’re proposing the following for Iceberg *1.11*:

   1. Mark FlinkSink as @Deprecated
   2. Promote IcebergSink from @Experimental to @PublicEvolving
   3. Make IcebergSink the default implementation in Flink SQL.


Then in Iceberg *1.12* we will:

   1. Remove the `FlinkSink` implementation.
   2. Remove @PublicEvolving from IcebergSink


What do you think about this plan?


Thanks,

Rodrigo
