On Sun, May 17, 2020 at 11:01 AM Magesh kumar Nandakumar < [email protected]> wrote:
> Thanks Randall. The suggestion I made also has a problem when the reporter
> isn't enabled: it could potentially write records after the error records
> to the sink before failing.
>
> The other concern I had is with the reporter being asynchronous. If, for some
> reason, the reporter is taking longer because of, say, a specific broker issue,
> the connector might still move forward and commit if it's not waiting for
> the reporter. If the worker crashes during this window, we will lose the bad
> record. I don't think this is desirable behavior. I think the synchronous
> reporter provides better guarantees for all connectors.
>
>

Thanks, Magesh. That's a valid concern, and maybe that will affect how the feature is actually implemented. I expect it to be a bit tricky to ensure that errant records are fully written to Kafka before the offsets are committed, so it might be simplest to start out with a synchronous implementation.

But the API can still be an asynchronous design whether or not the implementation is synchronous. That gives us the ability to change the implementation in the future if we determine a way to handle all concerns. For example, the WorkerSinkTask may need to back off if it is waiting to commit because of too many incomplete/unacknowledged reporter requests. OTOH, if we make the `report` method(s) synchronous from the beginning, it will be very challenging to change them to be asynchronous in the future.

I guess it boils down to this question: do we know today that we will *never* want the reporter to write asynchronously?

Best regards,

Randall
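
To illustrate the point about an asynchronous API backed by a synchronous implementation, here is a minimal sketch. Every name in it (`Reporter`, `SyncReporter`, `Task`, `handleBadRecord`, `commit`) is hypothetical and made up for this example; it is not the Connect API or the WorkerSinkTask code, and records are plain strings to keep it self-contained. The idea it shows: `report` returns a `Future`, the first implementation completes the write before returning, and the task waits on all outstanding futures before committing, so a later asynchronous implementation would not change the interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical reporter API: asynchronous in shape because it returns a Future.
interface Reporter {
    Future<Void> report(String record, Throwable error);
}

// Synchronous implementation: the write finishes before report() returns,
// so the returned Future is already complete.
class SyncReporter implements Reporter {
    // Stands in for the error topic in this sketch.
    final List<String> written = new ArrayList<>();

    @Override
    public Future<Void> report(String record, Throwable error) {
        written.add(record + ": " + error.getMessage());
        return CompletableFuture.completedFuture(null);
    }
}

// Hypothetical task: waits on every outstanding report before committing.
class Task {
    private final Reporter reporter;
    private final List<Future<Void>> pending = new ArrayList<>();
    long committedOffset = -1;

    Task(Reporter reporter) {
        this.reporter = reporter;
    }

    void handleBadRecord(String record, Throwable error) {
        pending.add(reporter.report(record, error));
    }

    void commit(long offset) {
        for (Future<Void> f : pending) {
            try {
                // Returns immediately for SyncReporter; would block for an async one.
                f.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted; not committing", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("report failed; not committing", e);
            }
        }
        pending.clear();
        committedOffset = offset;
    }
}
```

With this shape, swapping `SyncReporter` for an implementation that returns an incomplete future only changes how long `commit` waits; the bad record is still written before the offset advances.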
