My apologies, had a typo. Meant to say "I will now open up a vote."
Thanks, Aakash On Sun, May 17, 2020 at 4:55 PM Aakash Shah <[email protected]> wrote: > Hi all, > > Thanks for all the feedback thus far. I've updated the KIP with all the > suggestions. I will not open up a vote. > > Thanks, > Aakash > > On Sun, May 17, 2020 at 3:45 PM Randall Hauch <[email protected]> wrote: > >> All good points regarding `Future<Void>` instead of >> `Future<RecordMetadata>`, so +1 to that change. >> >> A few more nits. The following sentences should be removed because they >> actually describe a change from the current DLQ functionality that already >> sets `max.in.flight.requests.per.connection=1` by default: >> >> "In order to avoid error records being written out of order (for example, >> due to retries), the developer should always use >> max.in.flight.requests.per.connection=1 in their implementation for >> writing >> error records. If the developer determines that order is not important and >> they want extreme performance, they can always increase this number." >> >> Another is a bit ambiguous, so I suggest changing: >> >> "The error reporting functionality is designed to be asynchronous but can >> be made synchronous if desired. By default, error reporting will be >> asynchronous; processing of the subsequent errant record will not be >> blocked by the successful processing of the prior errant record. However, >> if the developer prefers synchronous functionality, they can block >> processing of the next record with future.get()." >> >> to: >> >> "The error reporting functionality is asynchronous. Tasks can use the >> resulting future to wait for the record and exception to be written to >> Kafka." >> >> Can we please move the example to a new "Example Usage" section that is >> *after* the "Interface" section? That way the order of the sections is >> "Method", "Interface", and "Example Usage", and it's more clear how the >> API >> is being changed. Also, the first sentence introducing the example is >> currently: >> >> "The usage will look like the following:" >> >> but IMO it should actually say it's an example: >> >> "The following is an example of how a sink task can use this error >> reporter >> and support connectors being deployed in earlier versions of the Connect >> runtime:" >> >> It seems we have pretty good consensus, so I think the KIP is ready for a >> vote after the above minor corrections are made. >> >> Best regards, >> >> Randall >> >> On Sun, May 17, 2020 at 4:51 PM Arjun Satish <[email protected]> >> wrote: >> >> > Thanks for all the feedback, folks. >> > >> > re: having a callback as a parameter, I agree that at this point, it >> might >> > not add much value to the proposal. >> > >> > re: synchronous vs asynchronous, is the motivation performance/higher >> > throughput? Taking a step back, calling report(..) in the new interface >> > does a couple of things: >> > >> > 1. at a fundamental level, it is a signal to the framework that a >> failure >> > occurred when processing records, specifically due to the given record. >> > 2. depending on whether errors.log and errors.deadletterqueue has been >> set, >> > some messages are written to zero or more destinations. >> > 3. depending on the value of errors.tolerance (none or all), the task is >> > failed after reporters have completed. >> > >> > for kip-610, the asynchronous method has the advantage of working with >> the >> > internal dead letter queue (which has been transparent to the developer >> so >> > far). but, how does async method help if the DLQ is not enabled? in this >> > case RecordMetadata is not very useful, AFAICT? also, if we add more >> error >> > reporters in the future (say, for example, a new reporter in a future >> that >> > writes to a RDBMS), would the async version return success on all or >> > nothing, and what about partial successes? >> > >> > overall, if we really need async behavior, I'd prefer to just use >> > Future<Void>. but if we can keep it simple, then let's go with a >> > synchronous function with the parameters Randall proposed above (with >> > return type as void, and if any of the reporters fail, the task is >> failed >> > if error.tolerance is none, and kept alive if tolerance is all), and >> maybe >> > add asynchronous methods in a future KIP? >> > >> > Best, >> > >> >
