>> I believe that we could support such an async sink writer >> very easily in the future. What do you think?
>> How would you see the expansion in the future? Do you mean just adding `isAvailable()` method with a default implementation later on? Hi @piotr <pi...@ververica.com> Actually I am not sure adding `isAvailable` is enough. Maybe it is not. But for the initial version I hope we could make the sink api sync because there is already a lot of stuff that has to finish. :--) What do you think? Best, Guowei On Tue, Sep 22, 2020 at 5:25 PM Aljoscha Krettek <aljos...@apache.org> wrote: > On 22.09.20 11:10, Guowei Ma wrote: > > 1. I think maybe we could add a EOI interface to the `GlobalCommit`. So > > that we could make `write success file` be available in both batch and > > stream execution mode. > > We could, yes. I'm now hesitant because we're adding more things but I > think it should be fine. > > > 2. If we choose to let the two types of committer appear at the same > time > > in the API we have to figure out how to express the relation between the > > two committers. I think the Sink API may look like the following: What do > > you think? > > Sink<T, CommT, CommR, ShareStateT...> { > > Writer<T, CommT, ShareStateT.....> createWriter(); > > Optional<Committer<CommT>> createCommitter(); > > Optional<GlobalComiitter<CommT, GlobalCommT>> > > createGlobalCommitter(); > > } > > Yes, I think this is what we should do. Though I think that we should > initially not support shared state. The FileSink only uses this to > create unique file names and I think we can do without it. If we see > that we need it later we can add it but I would like to keep things > minimal initially. It's always easy to add things later but it's hard to > take things away once you added them. > > > 3. Maybe a silly question: Why do we need `commit` return > `CommitResult`? I > > think the sink developer could rety himself. Why need the framework to do > > the retry? > > It's not a silly question at all! I think we need the retry to support > such problems as Steven mentioned. If a commit fails a RETRY tells the > framework that it should keep the commits in state and retry them on the > next checkpoint. When the committer returns FAILURE we should just fail > the job. It's to support temporary outages of the external metastore. > > I'm open to leaving it out of the initial version for the same reasons I > mentioned above but I think it could be valuable. > > Best, > Aljoscha > >