> Plus, it will disable the future capability of small file compaction
> stage post commit.

I should clarify this comment. If we use `WithPostCommitTopology` for the
global committer, we lose the ability to use the post-commit stage for
small-file compaction.
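
To make the concern about the "global" partitioner in my quoted message
concrete, here is a minimal plain-Java sketch. It is not Flink or Iceberg
code: the class and method names are hypothetical, and it only assumes that
a global partitioner always selects channel 0, so every committable lands on
committer task 0 regardless of parallelism.

```java
// Hypothetical illustration only; it does not use any Flink classes.
class GlobalRoutingSketch {

    // Assumption: a "global" partitioner ignores the record and the
    // number of channels, and always picks channel 0.
    static int selectChannelGlobal(int numberOfChannels) {
        return 0;
    }

    // Counts how many committables each committer subtask receives
    // when all of them are routed through the global partitioner.
    static int[] distribute(int committables, int parallelism) {
        int[] received = new int[parallelism];
        for (int i = 0; i < committables; i++) {
            received[selectChannelGlobal(parallelism)]++;
        }
        return received;
    }

    public static void main(String[] args) {
        int[] received = distribute(10, 4);
        for (int task = 0; task < received.length; task++) {
            System.out.println("committer task " + task + ": " + received[task]);
        }
    }
}
```

With 10 committables and a committer parallelism of 4, task 0 receives all
10 and tasks 1 to 3 receive none, which is the "global committer in
disguise" behavior I find confusing.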

On Tue, Aug 16, 2022 at 9:53 AM Steven Wu <stevenz...@gmail.com> wrote:

>
> In the V1 sink interface, there is a GlobalCommitter for Iceberg. With the
> V2 sink interface, GlobalCommitter has been deprecated in favor of
> WithPostCommitTopology. I thought the post-commit stage was mainly for
> async maintenance (like compaction).
>
> Are we supposed to do something similar to the GlobalCommittingSinkAdapter?
> It looks like a temporary transition plan for bridging V1 sinks to the V2
> interfaces.
>
> private class GlobalCommittingSinkAdapter extends TwoPhaseCommittingSinkAdapter
>         implements WithPostCommitTopology<InputT, CommT> {
>     @Override
>     public void addPostCommitTopology(DataStream<CommittableMessage<CommT>> committables) {
>         StandardSinkTopologies.addGlobalCommitter(
>                 committables,
>                 GlobalCommitterAdapter::new,
>                 () -> sink.getCommittableSerializer().get());
>     }
> }
>
>
> In the Iceberg PR [1] for adopting the new sink interface, Liwei used the
> "global" partitioner to force all committables to go to a single committer,
> task 0. This effectively creates a global committer disguised among the
> parallel committers. It is a little weird and can also raise questions
> about why the other committer tasks are not getting any messages. Plus, it
> will disable the future capability of a small-file compaction stage post
> commit. Hence, I am asking what the right approach is to achieve global
> committer behavior.
>
> Thanks,
> Steven
>
> [1] https://github.com/apache/iceberg/pull/4904/files#r946975047
>
