Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

Aakash Shah Sun, 17 May 2020 16:58:25 -0700

My apologies, had a typo. Meant to say "I will now open up a vote."


Thanks,
Aakash

On Sun, May 17, 2020 at 4:55 PM Aakash Shah <as...@confluent.io> wrote:

> Hi all,
>
> Thanks for all the feedback thus far. I've updated the KIP with all the
> suggestions. I will not open up a vote.
>
> Thanks,
> Aakash
>
> On Sun, May 17, 2020 at 3:45 PM Randall Hauch <rha...@gmail.com> wrote:
>
>> All good points regarding `Future<Void>` instead of
>> `Future<RecordMetadata>`, so +1 to that change.
>>
>> A few more nits. The following sentences should be removed because they
>> actually describe a change from the current DLQ functionality that already
>> sets `max.in.flight.requests.per.connection=1` by default:
>>
>> "In order to avoid error records being written out of order (for example,
>> due to retries), the developer should always use
>> max.in.flight.requests.per.connection=1 in their implementation for
>> writing
>> error records. If the developer determines that order is not important and
>> they want extreme performance, they can always increase this number."
>>
>> Another is a bit ambiguous, so I suggest changing:
>>
>> "The error reporting functionality is designed to be asynchronous but can
>> be made synchronous if desired. By default, error reporting will be
>> asynchronous; processing of the subsequent errant record will not be
>> blocked by the successful processing of the prior errant record. However,
>> if the developer prefers synchronous functionality, they can block
>> processing of the next record with future.get()."
>>
>> to:
>>
>> "The error reporting functionality is asynchronous. Tasks can use the
>> resulting future to wait for the record and exception to be written to
>> Kafka."
>>
>> Can we please move the example to a new "Example Usage" section that is
>> *after* the "Interface" section? That way the order of the sections is
>> "Method", "Interface", and "Example Usage", and it's more clear how the
>> API
>> is being changed. Also, the first sentence introducing the example is
>> currently:
>>
>> "The usage will look like the following:"
>>
>> but IMO it should actually say it's an example:
>>
>> "The following is an example of how a sink task can use this error
>> reporter
>> and support connectors being deployed in earlier versions of the Connect
>> runtime:"
>>
>> It seems we have pretty good consensus, so I think the KIP is ready for a
>> vote after the above minor corrections are made.
>>
>> Best regards,
>>
>> Randall
>>
>> On Sun, May 17, 2020 at 4:51 PM Arjun Satish <arjun.sat...@gmail.com>
>> wrote:
>>
>> > Thanks for all the feedback, folks.
>> >
>> > re: having a callback as a parameter, I agree that at this point, it
>> might
>> > not add much value to the proposal.
>> >
>> > re: synchronous vs asynchronous, is the motivation performance/higher
>> > throughput? Taking a step back, calling report(..) in the new interface
>> > does a couple of things:
>> >
>> > 1. at a fundamental level, it is a signal to the framework that a
>> failure
>> > occurred when processing records, specifically due to the given record.
>> > 2. depending on whether errors.log and errors.deadletterqueue has been
>> set,
>> > some messages are written to zero or more destinations.
>> > 3. depending on the value of errors.tolerance (none or all), the task is
>> > failed after reporters have completed.
>> >
>> > for kip-610, the asynchronous method has the advantage of working with
>> the
>> > internal dead letter queue (which has been transparent to the developer
>> so
>> > far). but, how does async method help if the DLQ is not enabled? in this
>> > case RecordMetadata is not very useful, AFAICT? also, if we add more
>> error
>> > reporters in the future (say, for example, a new reporter in a future
>> that
>> > writes to a RDBMS), would the async version return success on all or
>> > nothing, and what about partial successes?
>> >
>> > overall, if we really need async behavior, I'd prefer to just use
>> > Future<Void>. but if we can keep it simple, then let's go with a
>> > synchronous function with the parameters Randall proposed above (with
>> > return type as void, and if any of the reporters fail, the task is
>> failed
>> > if error.tolerance is none, and kept alive if tolerance is all), and
>> maybe
>> > add asynchronous methods in a future KIP?
>> >
>> > Best,
>> >
>>
>

Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

Reply via email to