Thanks Jason, my hope is that we could streamline the callback function
signature, so that in the next major release (3.0),
onCompletion(RecordMetadata metadata, SendFailure sendFailure) will become
the only API to be implemented.

On Fri, Jan 29, 2021 at 3:42 PM Jason Gustafson <ja...@confluent.io> wrote:

> Hi Boyang,
>
> Do you think it's necessary to deprecate the old `onCompletion` callback? I
> was thinking there's probably no harm leaving it around. Users might not
> care about the failure type. Other than that, it looks good to me.
>
> Thanks,
> Jason
>
> On Thu, Jan 28, 2021 at 6:58 PM Boyang Chen <reluctanthero...@gmail.com>
> wrote:
>
> > Hey Guozhang,
> >
> > the CommitFailedException wrapping would cover all the non-fatal
> exceptions
> > as we listed out in the KIP, generally speaking any exception that could
> > recover safely by calling abortTxn should be wrapped.
> >
> > Best,
> > Boyang
> >
> > On Thu, Jan 28, 2021 at 5:22 PM Guozhang Wang <wangg...@gmail.com>
> wrote:
> >
> > > Hey Boyang,
> > >
> > > I think maybe there's something cracking here :) I'm just asking for
> > > clarifications that as of today, which non-fatal exceptions the newly
> > > introduced CommitFailedException would cover, and it seems to be only
> 1)
> > > unknown pid, 2) invalid pid mapping, and 3) concurrent transactions. Is
> > > that correct?
> > >
> > > On Thu, Jan 28, 2021 at 5:06 PM Boyang Chen <
> reluctanthero...@gmail.com>
> > > wrote:
> > >
> > > > Hey Guozhang, I think TimeoutException would not be covered here as
> it
> > > > potentially has a risk of hitting an illegal state on the broker side
> > > when
> > > > the previous commit was actually successful. Users should try to
> > increase
> > > > their max.block.ms to avoid hitting the timeout as a base
> suggestion,
> > > > which
> > > > is discussed in the KIP.
> > > >
> > > > As for the umbrella exception, I agree it has values to some extent,
> > but
> > > in
> > > > terms of adoption difficulty, additional exception types usually make
> > the
> > > > error handling more complex than simplified, and we are doing our
> best
> > to
> > > > avoid compatibility issues caused by wrapping previously thrown raw
> > > > exceptions that invalidates user error handling. And I'm also not in
> > > favor
> > > > of our exception hierarchy today where all exceptions are subclasses
> of
> > > > KafkaException, and we already mixed its handling with other specific
> > > > exception types that could be caught as superclass KafkaException.
> That
> > > > could be addressed in a separate proposal if we see necessary.
> > > >
> > > > Boyang
> > > >
> > > > On Thu, Jan 28, 2021 at 3:28 PM Guozhang Wang <wangg...@gmail.com>
> > > wrote:
> > > >
> > > > > Thanks Jason for the suggestion, that looks good to me too.
> > > > >
> > > > > Regarding the non-fatal exceptions wrapped as CommitFailed, I would
> > > like
> > > > to
> > > > > clarify if we would cover all the following cases: 1) timeout, 2)
> > > unknown
> > > > > pid, 3) invalid pid mapping, 4) concurrent transactions?
> > > > >
> > > > > BTW I think it still makes sense to use an umbrella exception in
> case
> > > in
> > > > > the future we add more non-fatal cases even though today we only
> > have a
> > > > > few.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Thu, Jan 28, 2021 at 1:08 PM Boyang Chen <
> > > reluctanthero...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks Jason, I agree with the proposed solution here, will
> update
> > > the
> > > > > KIP.
> > > > > >
> > > > > > On Thu, Jan 28, 2021 at 10:52 AM Jason Gustafson <
> > ja...@confluent.io
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Boyang,
> > > > > > >
> > > > > > > It seems like a reasonable suggestion. I wonder if a flag is
> > > > sufficient
> > > > > > > though. The current `Callback` documentation treats "fatal"
> > errors
> > > > from
> > > > > > the
> > > > > > > perspective of the individual message that was sent.
> > > > > > >
> > > > > > > ```
> > > > > > >     *                  Non-Retriable exceptions (fatal, the
> > message
> > > > > will
> > > > > > > never be sent):
> > > > > > > ```
> > > > > > >
> > > > > > > However, we also have fatal errors from the perspective of the
> > > > > > transaction
> > > > > > > (e.g. when the producer gets fenced). Perhaps that suggests we
> > need
> > > > > > > something richer than a boolean flag:
> > > > > > >
> > > > > > > At a high level, I think the following cases are possible:
> > > > > > >
> > > > > > > - message rejected (e.g. message too large, invalid topic)
> > > > > > > - delivery failed after retries/delivery timeout (e.g. timeout,
> > crc
> > > > > > error,
> > > > > > > not enough replicas)
> > > > > > > - transaction failed (e.g. producer fenced, invalid transaction
> > > > state)
> > > > > > >
> > > > > > > Perhaps instead we can have a type like the following:
> > > > > > >
> > > > > > > class SendFailure {
> > > > > > >   FailureType failureType;
> > > > > > >   Exception cause;
> > > > > > > }
> > > > > > >
> > > > > > > enum FailureType {
> > > > > > >   MESSSAGE_REJECTED, DELIVERY_FAILED, TRANSACTION_FAILED
> > > > > > > }
> > > > > > >
> > > > > > > (Not married to any of these names, just a starting point.)
> > > > > > >
> > > > > > > Then we add a new `onCompletion` as you've suggested:
> > > > > > >
> > > > > > > default void onCompletion(RecordMetadata metadata, SendFailure
> > > > > failure) {
> > > > > > >   this.onCompletion(metadata, failure.cause());
> > > > > > > }
> > > > > > >
> > > > > > > This would give streams and other applications enough
> information
> > > to
> > > > > know
> > > > > > > whether the message can be retried and whether the transaction
> > can
> > > be
> > > > > > > aborted.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > -Jason
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jan 27, 2021 at 9:51 PM Boyang Chen <
> > > > > reluctanthero...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks Jason for the thoughts.
> > > > > > > >
> > > > > > > > On Wed, Jan 27, 2021 at 11:52 AM Jason Gustafson <
> > > > ja...@confluent.io
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Boyang,
> > > > > > > > >
> > > > > > > > > Thanks for the iterations here. I think this is something
> we
> > > > should
> > > > > > > have
> > > > > > > > > done a long time ago. It sounds like there are two API
> > changes
> > > > > here:
> > > > > > > > >
> > > > > > > > > 1. We are introducing the `CommitFailedException` to wrap
> > > > abortable
> > > > > > > > > errors that are raised from `commitTransaction`. This
> sounds
> > > fine
> > > > > to
> > > > > > > me.
> > > > > > > > As
> > > > > > > > > far as I know, the only case we might need this is when we
> > add
> > > > > > support
> > > > > > > to
> > > > > > > > > let producers recover from coordinator timeouts. Are there
> > any
> > > > > > others?
> > > > > > > > >
> > > > > > > > > I think the purpose here is to ensure non-fatal exceptions
> > are
> > > > > > unified
> > > > > > > > under the same
> > > > > > > > exception umbrella, to make the proceeding to abort any
> ongoing
> > > > > > > transaction
> > > > > > > > much easier.
> > > > > > > > I don't think `coordinator timeouts` is the only case to
> > recover
> > > > > here,
> > > > > > > > since we have other
> > > > > > > > non-fatal exceptions such as UnknownPid.
> > > > > > > >
> > > > > > > > 2. We are wrapping non-fatal errors raised from `send` as
> > > > > > > `KafkaException`.
> > > > > > > > > The motivation for this is less clear to me and it doesn't
> > look
> > > > > like
> > > > > > > the
> > > > > > > > > example from the KIP depends on it. My concern here is
> > > > > compatibility.
> > > > > > > > > Currently we have the following documentation for the
> > > `Callback`
> > > > > API:
> > > > > > > > >
> > > > > > > > > ```
> > > > > > > > >      *                  Non-Retriable exceptions (fatal,
> the
> > > > > message
> > > > > > > will
> > > > > > > > > never be sent):
> > > > > > > > >      *
> > > > > > > > >      *                  InvalidTopicException
> > > > > > > > >      *                  OffsetMetadataTooLargeException
> > > > > > > > >      *                  RecordBatchTooLargeException
> > > > > > > > >      *                  RecordTooLargeException
> > > > > > > > >      *                  UnknownServerException
> > > > > > > > >      *                  UnknownProducerIdException
> > > > > > > > >      *                  InvalidProducerEpochException
> > > > > > > > >      *
> > > > > > > > >      *                  Retriable exceptions (transient,
> may
> > be
> > > > > > covered
> > > > > > > > by
> > > > > > > > > increasing #.retries):
> > > > > > > > >      *
> > > > > > > > >      *                  CorruptRecordException
> > > > > > > > >      *                  InvalidMetadataException
> > > > > > > > >      *
> NotEnoughReplicasAfterAppendException
> > > > > > > > >      *                  NotEnoughReplicasException
> > > > > > > > >      *                  OffsetOutOfRangeException
> > > > > > > > >      *                  TimeoutException
> > > > > > > > >      *                  UnknownTopicOrPartitionException
> > > > > > > > > ```
> > > > > > > > >
> > > > > > > > > If we wrap all the retriable exceptions documented here as
> > > > > > > > > `KafkaException`, wouldn't that break any error handling
> that
> > > > users
> > > > > > > might
> > > > > > > > > already have? it's gonna introduce a compatibility issue.
> > > > > > > > >
> > > > > > > > > The original intention was to simplify `send` callback
> error
> > > > > handling
> > > > > > > by
> > > > > > > > doing exception wrapping, as on Streams level
> > > > > > > > we have to prepare an exhausting list of exceptions to catch
> as
> > > > > fatal,
> > > > > > > and
> > > > > > > > the same lengthy list to catch as
> > > > > > > > non-fatal. It would be much easier if we got `hints` from the
> > > > > callback.
> > > > > > > > However,
> > > > > > > > I agree there is a compatibility concern, what about
> > deprecating
> > > > the
> > > > > > > > existing:
> > > > > > > >
> > > > > > > > void onCompletion(RecordMetadata metadata, Exception
> exception)
> > > > > > > >
> > > > > > > > and replace it with:
> > > > > > > >
> > > > > > > > default void onCompletion(RecordMetadata metadata, Exception
> > > > > exception,
> > > > > > > > boolean isFatal) {
> > > > > > > >   this.onCompletion(metadata, exception);
> > > > > > > > }
> > > > > > > >
> > > > > > > > to make sure new users get the benefit of understanding the
> > > > fatality
> > > > > > > based
> > > > > > > > on the info presented by the producer?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > > Jason
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sat, Jan 23, 2021 at 3:31 AM Hiringuru <
> > i...@hiringuru.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Why  we are receiving all emails kindly remove us from
> > > > > > > > > > dev@kafka.apache.org we don't want to receive emails
> > > anymore.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > > > On 01/23/2021 4:14 AM Guozhang Wang <
> wangg...@gmail.com>
> > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks Boyang, yes I think I was confused about the
> > > different
> > > > > > > > handling
> > > > > > > > > of
> > > > > > > > > > > two abortTxn calls, and now I get it was not
> > intentional. I
> > > > > > think I
> > > > > > > > do
> > > > > > > > > > not
> > > > > > > > > > > have more concerns.
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Jan 22, 2021 at 1:12 PM Boyang Chen <
> > > > > > > > > reluctanthero...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for the clarification Guozhang, I got your
> point
> > > > that
> > > > > we
> > > > > > > > want
> > > > > > > > > to
> > > > > > > > > > > > have a consistent handling of fatal exceptions being
> > > thrown
> > > > > > from
> > > > > > > > the
> > > > > > > > > > > > abortTxn. I modified the current template to move the
> > > fatal
> > > > > > > > exception
> > > > > > > > > > > > try-catch outside of the processing loop to make sure
> > we
> > > > > could
> > > > > > > get
> > > > > > > > a
> > > > > > > > > > chance
> > > > > > > > > > > > to close consumer/producer modules. Let me know what
> > you
> > > > > think.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Boyang
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Jan 22, 2021 at 11:05 AM Boyang Chen <
> > > > > > > > > > reluctanthero...@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > My understanding is that abortTransaction would
> only
> > > > throw
> > > > > > when
> > > > > > > > the
> > > > > > > > > > > > > producer is in fatal state. For CommitFailed, the
> > > > producer
> > > > > > > should
> > > > > > > > > > still
> > > > > > > > > > > > be
> > > > > > > > > > > > > in the abortable error state, so that
> > abortTransaction
> > > > call
> > > > > > > would
> > > > > > > > > not
> > > > > > > > > > > > throw.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Jan 22, 2021 at 11:02 AM Guozhang Wang <
> > > > > > > > wangg...@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >> Or are you going to maintain some internal state
> > such
> > > > > that,
> > > > > > > the
> > > > > > > > > > > > >> `abortTransaction` in the catch block would never
> > > throw
> > > > > > again?
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> On Fri, Jan 22, 2021 at 11:01 AM Guozhang Wang <
> > > > > > > > > wangg...@gmail.com>
> > > > > > > > > > > > >> wrote:
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> > Hi Boyang/Jason,
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > I've also thought about this (i.e. using
> > > CommitFailed
> > > > > for
> > > > > > > all
> > > > > > > > > > > > >> non-fatal),
> > > > > > > > > > > > >> > but what I'm pondering is that, in the catch
> > > > > > (CommitFailed)
> > > > > > > > > block,
> > > > > > > > > > > > what
> > > > > > > > > > > > >> > would happen if the
> `producer.abortTransaction();`
> > > > > throws
> > > > > > > > again?
> > > > > > > > > > > > should
> > > > > > > > > > > > >> > that be captured as a fatal and cause the client
> > to
> > > > > close
> > > > > > > > again.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > If yes, then naively the pattern would be:
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > ...
> > > > > > > > > > > > >> > catch (CommitFailedException e) {
> > > > > > > > > > > > >> >         // Transaction commit failed with
> > abortable
> > > > > error,
> > > > > > > > user
> > > > > > > > > > could
> > > > > > > > > > > > >> reset
> > > > > > > > > > > > >> >         // the application state and resume
> with a
> > > new
> > > > > > > > > > transaction.
> > > > > > > > > > > > The
> > > > > > > > > > > > >> > root
> > > > > > > > > > > > >> >         // cause was wrapped in the thrown
> > > exception.
> > > > > > > > > > > > >> >         resetToLastCommittedPositions(consumer);
> > > > > > > > > > > > >> >         try {
> > > > > > > > > > > > >> >             producer.abortTransaction();
> > > > > > > > > > > > >> >         } catch (KafkaException e) {
> > > > > > > > > > > > >> >             producer.close();
> > > > > > > > > > > > >> >             consumer.close();
> > > > > > > > > > > > >> >             throw e;
> > > > > > > > > > > > >> >         }
> > > > > > > > > > > > >> >     } catch (KafkaException e) {
> > > > > > > > > > > > >> >         producer.close();
> > > > > > > > > > > > >> >         consumer.close();
> > > > > > > > > > > > >> >         throw e;
> > > > > > > > > > > > >> >     }
> > > > > > > > > > > > >> > ...
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Guozhang
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > On Fri, Jan 22, 2021 at 10:47 AM Boyang Chen <
> > > > > > > > > > > > >> reluctanthero...@gmail.com>
> > > > > > > > > > > > >> > wrote:
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> >> Hey Guozhang,
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >> Jason and I were discussing the new API offline
> > and
> > > > > > decided
> > > > > > > > to
> > > > > > > > > > take
> > > > > > > > > > > > >> >> another
> > > > > > > > > > > > >> >> approach. Firstly, the reason not to invent a
> new
> > > API
> > > > > > with
> > > > > > > > > > returned
> > > > > > > > > > > > >> >> boolean
> > > > > > > > > > > > >> >> flag is for compatibility consideration, since
> > old
> > > > EOS
> > > > > > code
> > > > > > > > > > would not
> > > > > > > > > > > > >> know
> > > > > > > > > > > > >> >> that a given transaction commit was failed
> > > internally
> > > > > as
> > > > > > > they
> > > > > > > > > > don't
> > > > > > > > > > > > >> listen
> > > > > > > > > > > > >> >> to the output flag. Our proposed alternative
> > > solution
> > > > > is
> > > > > > to
> > > > > > > > let
> > > > > > > > > > > > >> >> *commitTransaction
> > > > > > > > > > > > >> >> throw CommitFailedException whenever the commit
> > > > failed
> > > > > > with
> > > > > > > > > > non-fatal
> > > > > > > > > > > > >> >> exception*, so that on the caller side the
> > handling
> > > > > logic
> > > > > > > > > > becomes:
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >> try {
> > > > > > > > > > > > >> >>         if (shouldCommit) {
> > > > > > > > > > > > >> >>             producer.commitTransaction();
> > > > > > > > > > > > >> >>         } else {
> > > > > > > > > > > > >> >>
> > >  resetToLastCommittedPositions(consumer);
> > > > > > > > > > > > >> >>             producer.abortTransaction();
> > > > > > > > > > > > >> >>         }
> > > > > > > > > > > > >> >>     } catch (CommitFailedException e) {
> > > > > > > > > > > > >> >>         // Transaction commit failed with
> > abortable
> > > > > > error,
> > > > > > > > user
> > > > > > > > > > could
> > > > > > > > > > > > >> >> reset
> > > > > > > > > > > > >> >>         // the application state and resume
> with
> > a
> > > > new
> > > > > > > > > > transaction.
> > > > > > > > > > > > The
> > > > > > > > > > > > >> >> root
> > > > > > > > > > > > >> >>         // cause was wrapped in the thrown
> > > exception.
> > > > > > > > > > > > >> >>
>  resetToLastCommittedPositions(consumer);
> > > > > > > > > > > > >> >>         producer.abortTransaction();
> > > > > > > > > > > > >> >>     } catch (KafkaException e) {
> > > > > > > > > > > > >> >>         producer.close();
> > > > > > > > > > > > >> >>         consumer.close();
> > > > > > > > > > > > >> >>         throw e;
> > > > > > > > > > > > >> >>     }
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >> This approach looks cleaner as all exception
> > types
> > > > > other
> > > > > > > than
> > > > > > > > > > > > >> CommitFailed
> > > > > > > > > > > > >> >> will doom to be fatal, which is very easy to
> > adopt
> > > > for
> > > > > > > users.
> > > > > > > > > In
> > > > > > > > > > the
> > > > > > > > > > > > >> >> meantime, we still maintain the commitTxn
> > behavior
> > > to
> > > > > > throw
> > > > > > > > > > instead
> > > > > > > > > > > > of
> > > > > > > > > > > > >> >> silently failing.
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >> In addition, we decided to drop the
> > recommendation
> > > to
> > > > > > > handle
> > > > > > > > > > > > >> >> TimeoutException and leave it to the users to
> > make
> > > > the
> > > > > > > call.
> > > > > > > > > The
> > > > > > > > > > > > >> downside
> > > > > > > > > > > > >> >> for blindly calling abortTxn upon timeout is
> that
> > > we
> > > > > > could
> > > > > > > > > > result in
> > > > > > > > > > > > an
> > > > > > > > > > > > >> >> illegal state when the commit was already
> > > successful
> > > > on
> > > > > > the
> > > > > > > > > > broker
> > > > > > > > > > > > >> >> side. Without a good guarantee on the outcome,
> > > > > > > > overcomplicating
> > > > > > > > > > the
> > > > > > > > > > > > >> >> template should not be encouraged IMHO.
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >> Let me know your thoughts on the new approach
> > here,
> > > > > thank
> > > > > > > > you!
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >> Best,
> > > > > > > > > > > > >> >> Boyang
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >> On Tue, Jan 19, 2021 at 11:11 AM Guozhang Wang
> <
> > > > > > > > > > wangg...@gmail.com>
> > > > > > > > > > > > >> >> wrote:
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >> > Thanks for your clarification on 2)/3), that
> > > makes
> > > > > > sense.
> > > > > > > > > > > > >> >> >
> > > > > > > > > > > > >> >> > On Tue, Jan 19, 2021 at 10:16 AM Boyang Chen
> <
> > > > > > > > > > > > >> >> reluctanthero...@gmail.com>
> > > > > > > > > > > > >> >> > wrote:
> > > > > > > > > > > > >> >> >
> > > > > > > > > > > > >> >> > > Thanks for the input Guozhang, replied
> > inline.
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > > On Mon, Jan 18, 2021 at 8:57 PM Guozhang
> > Wang <
> > > > > > > > > > > > wangg...@gmail.com>
> > > > > > > > > > > > >> >> > wrote:
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > > > Hello Boyang,
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > Thanks for the updated KIP. I read it
> again
> > > and
> > > > > > have
> > > > > > > > the
> > > > > > > > > > > > >> following
> > > > > > > > > > > > >> >> > > > thoughts:
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > 0. I'm a bit concerned that if commitTxn
> > does
> > > > not
> > > > > > > throw
> > > > > > > > > any
> > > > > > > > > > > > >> >> non-fatal
> > > > > > > > > > > > >> >> > > > exception, and instead we rely on the
> > > > subsequent
> > > > > > > > beginTxn
> > > > > > > > > > call
> > > > > > > > > > > > to
> > > > > > > > > > > > >> >> > throw,
> > > > > > > > > > > > >> >> > > it
> > > > > > > > > > > > >> >> > > > may violate some callers with a pattern
> > that
> > > > > > relying
> > > > > > > on
> > > > > > > > > > > > >> commitTxn to
> > > > > > > > > > > > >> >> > > > succeed to make some non-rollback
> > operations.
> > > > For
> > > > > > > > > example:
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > beginTxn()
> > > > > > > > > > > > >> >> > > > // do some read-write on my local DB
> > > > > > > > > > > > >> >> > > > commitTxn()
> > > > > > > > > > > > >> >> > > > // if commitTxn succeeds, then commit the
> > DB
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > -------------
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > The issue is that, committing DB is a
> > > > > non-rollback
> > > > > > > > > > operation,
> > > > > > > > > > > > and
> > > > > > > > > > > > >> >> users
> > > > > > > > > > > > >> >> > > may
> > > > > > > > > > > > >> >> > > > just rely on commitTxn to return without
> > > error
> > > > to
> > > > > > > make
> > > > > > > > > this
> > > > > > > > > > > > >> >> > non-rollback
> > > > > > > > > > > > >> >> > > > call. Of course we can just claim this
> > > pattern
> > > > is
> > > > > > not
> > > > > > > > > > > > legitimate
> > > > > > > > > > > > >> >> and is
> > > > > > > > > > > > >> >> > > not
> > > > > > > > > > > > >> >> > > > the right way of doing things, but many
> > users
> > > > may
> > > > > > > > > naturally
> > > > > > > > > > > > adopt
> > > > > > > > > > > > >> >> this
> > > > > > > > > > > > >> >> > > > pattern.
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > So maybe we should still let commitTxn
> also
> > > > throw
> > > > > > > > > non-fatal
> > > > > > > > > > > > >> >> exceptions,
> > > > > > > > > > > > >> >> > > in
> > > > > > > > > > > > >> >> > > > which case we would then call abortTxn
> > again.
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > But if we do this, the pattern becomes:
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > try {
> > > > > > > > > > > > >> >> > > >    beginTxn()
> > > > > > > > > > > > >> >> > > >    // do something
> > > > > > > > > > > > >> >> > > > } catch (Exception) {
> > > > > > > > > > > > >> >> > > >    shouldCommit = false;
> > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > if (shouldCommit) {
> > > > > > > > > > > > >> >> > > >     try {
> > > > > > > > > > > > >> >> > > >         commitTxn()
> > > > > > > > > > > > >> >> > > >     } catch (...) {        // enumerate
> all
> > > > fatal
> > > > > > > > > > exceptions
> > > > > > > > > > > > >> >> > > >         shutdown()
> > > > > > > > > > > > >> >> > > >     } catch (KafkaException) {
> > > > > > > > > > > > >> >> > > >         // non-fatal
> > > > > > > > > > > > >> >> > > >         shouldCommit = false;
> > > > > > > > > > > > >> >> > > >     }
> > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > if (!shouldCommit && running()) {
> > > > > > > > > > > > >> >> > > > try {
> > > > > > > > > > > > >> >> > > >         abortTxn()
> > > > > > > > > > > > >> >> > > >     } catch (KafkaException) {
> > > > > > > > > > > > >> >> > > >         // only throw fatal
> > > > > > > > > > > > >> >> > > >         shutdown()
> > > > > > > > > > > > >> >> > > >     }
> > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > ---------------------
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > Which is much more complicated.
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > Thank makes me think, the alternative we
> > have
> > > > > > > discussed
> > > > > > > > > > offline
> > > > > > > > > > > > >> may
> > > > > > > > > > > > >> >> be
> > > > > > > > > > > > >> >> > > > better: let commitTxn() to return a
> boolean
> > > > flag.
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > * If it returns true, it means the commit
> > > > > > succeeded.
> > > > > > > > > Users
> > > > > > > > > > can
> > > > > > > > > > > > >> >> > > comfortably
> > > > > > > > > > > > >> >> > > > continue and do any external non-rollback
> > > > > > operations
> > > > > > > if
> > > > > > > > > > they
> > > > > > > > > > > > >> like.
> > > > > > > > > > > > >> >> > > > * If it returns false, it means the
> commit
> > > > failed
> > > > > > > with
> > > > > > > > > > > > non-fatal
> > > > > > > > > > > > >> >> error
> > > > > > > > > > > > >> >> > > *AND
> > > > > > > > > > > > >> >> > > > the txn has been aborted*. Users do not
> > need
> > > to
> > > > > > call
> > > > > > > > > > abortTxn
> > > > > > > > > > > > >> again.
> > > > > > > > > > > > >> >> > > > * If it throws, then it means fatal
> errors.
> > > > Users
> > > > > > > > should
> > > > > > > > > > shut
> > > > > > > > > > > > >> down
> > > > > > > > > > > > >> >> the
> > > > > > > > > > > > >> >> > > > client.
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > In this case, the pattern becomes:
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > try {
> > > > > > > > > > > > >> >> > > >    beginTxn()
> > > > > > > > > > > > >> >> > > >    // do something
> > > > > > > > > > > > >> >> > > > } catch (Exception) {
> > > > > > > > > > > > >> >> > > >    shouldCommit = false;
> > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > try {
> > > > > > > > > > > > >> >> > > >     if (shouldCommit) {
> > > > > > > > > > > > >> >> > > >         commitSucceeded = commitTxn()
> > > > > > > > > > > > >> >> > > >     } else {
> > > > > > > > > > > > >> >> > > >         // reset offsets, rollback
> > > operations,
> > > > > etc
> > > > > > > > > > > > >> >> > > >         abortTxn()
> > > > > > > > > > > > >> >> > > >     }
> > > > > > > > > > > > >> >> > > > } catch (KafkaException) {
> > > > > > > > > > > > >> >> > > >     // only throw fatal
> > > > > > > > > > > > >> >> > > >     shutdown()
> > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > if (commitSucceeded)
> > > > > > > > > > > > >> >> > > >    // do other non-rollbackable things
> > > > > > > > > > > > >> >> > > > else
> > > > > > > > > > > > >> >> > > >    // reset offsets, rollback operations,
> > etc
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > -------------------------
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > Of course, if we want to have better
> > > visibility
> > > > > > into
> > > > > > > > what
> > > > > > > > > > > > caused
> > > > > > > > > > > > >> the
> > > > > > > > > > > > >> >> > > commit
> > > > > > > > > > > > >> >> > > > to fail and txn to abort. We can let the
> > > return
> > > > > > type
> > > > > > > be
> > > > > > > > > an
> > > > > > > > > > enum
> > > > > > > > > > > > >> >> instead
> > > > > > > > > > > > >> >> > > of
> > > > > > > > > > > > >> >> > > > a flag. But my main idea is to still let
> > the
> > > > > > > commitTxn
> > > > > > > > be
> > > > > > > > > > the
> > > > > > > > > > > > >> final
> > > > > > > > > > > > >> >> > point
> > > > > > > > > > > > >> >> > > > users can tell whether this txn succeeded
> > or
> > > > not,
> > > > > > > > instead
> > > > > > > > > > of
> > > > > > > > > > > > >> >> relying on
> > > > > > > > > > > > >> >> > > the
> > > > > > > > > > > > >> >> > > > next beginTxn() call.
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > I agree that adding a boolean flag is
> > indeed
> > > > > useful
> > > > > > > in
> > > > > > > > > this
> > > > > > > > > > > > case.
> > > > > > > > > > > > >> >> Will
> > > > > > > > > > > > >> >> > > update the KIP.
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > > 1. Re: "while maintaining the behavior to
> > throw
> > > > > fatal
> > > > > > > > > > exception
> > > > > > > > > > > > in
> > > > > > > > > > > > >> >> raw"
> > > > > > > > > > > > >> >> > not
> > > > > > > > > > > > >> >> > > > sure what do you mean by "throw" here.
> Are
> > > you
> > > > > > > > proposing
> > > > > > > > > > the
> > > > > > > > > > > > >> >> callback
> > > > > > > > > > > > >> >> > > would
> > > > > > > > > > > > >> >> > > > *pass* (not throw) in any fatal
> exceptions
> > > when
> > > > > > > > triggered
> > > > > > > > > > > > without
> > > > > > > > > > > > >> >> > > wrapping?
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > Yes, I want to say *pass*, the benefit is
> > to
> > > > make
> > > > > > the
> > > > > > > > end
> > > > > > > > > > > > user's
> > > > > > > > > > > > >> >> > > expectation consistent
> > > > > > > > > > > > >> >> > > regarding exception handling.
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > > > 2. I'm not sure I understand the claim
> > > > regarding
> > > > > > the
> > > > > > > > > > callback
> > > > > > > > > > > > >> that
> > > > > > > > > > > > >> >> "In
> > > > > > > > > > > > >> >> > > EOS
> > > > > > > > > > > > >> >> > > > setup, it is not required to handle the
> > > > > exception".
> > > > > > > Are
> > > > > > > > > you
> > > > > > > > > > > > >> >> proposing
> > > > > > > > > > > > >> >> > > that,
> > > > > > > > > > > > >> >> > > > e.g. in Streams, we do not try to handle
> > any
> > > > > > > exceptions
> > > > > > > > > if
> > > > > > > > > > EOS
> > > > > > > > > > > > is
> > > > > > > > > > > > >> >> > enabled
> > > > > > > > > > > > >> >> > > > in the callback anymore, but just let
> > > > commitTxn()
> > > > > > > > itself
> > > > > > > > > to
> > > > > > > > > > > > fail
> > > > > > > > > > > > >> to
> > > > > > > > > > > > >> >> be
> > > > > > > > > > > > >> >> > > > notified about the problem? Personally I
> > > think
> > > > in
> > > > > > > > Streams
> > > > > > > > > > we
> > > > > > > > > > > > >> should
> > > > > > > > > > > > >> >> > just
> > > > > > > > > > > > >> >> > > > make the handling logic of the callback
> to
> > be
> > > > > > > > consistent
> > > > > > > > > > > > >> regardless
> > > > > > > > > > > > >> >> of
> > > > > > > > > > > > >> >> > > the
> > > > > > > > > > > > >> >> > > > EOS settings (today we have different
> logic
> > > > > > depending
> > > > > > > > on
> > > > > > > > > > this
> > > > > > > > > > > > >> logic,
> > > > > > > > > > > > >> >> > > which
> > > > > > > > > > > > >> >> > > > I think could be unified as well).
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > My idea originates from the claim on send
> > > API:
> > > > > > > > > > > > >> >> > > "When used as part of a transaction, it is
> > not
> > > > > > > necessary
> > > > > > > > to
> > > > > > > > > > > > define
> > > > > > > > > > > > >> a
> > > > > > > > > > > > >> >> > > callback or check the result of the future
> > in
> > > > > order
> > > > > > to
> > > > > > > > > > detect
> > > > > > > > > > > > >> errors
> > > > > > > > > > > > >> >> > from
> > > > > > > > > > > > >> >> > > <code>send</code>. "
> > > > > > > > > > > > >> >> > > My understanding is that for EOS, the
> > exception
> > > > > will
> > > > > > be
> > > > > > > > > > detected
> > > > > > > > > > > > by
> > > > > > > > > > > > >> >> > calling
> > > > > > > > > > > > >> >> > > transactional APIs either way, so it is a
> > > > duplicate
> > > > > > > > > handling
> > > > > > > > > > to
> > > > > > > > > > > > >> track
> > > > > > > > > > > > >> >> > > all the sendExceptions in RecordCollector.
> > > > > However, I
> > > > > > > > > looked
> > > > > > > > > > up
> > > > > > > > > > > > >> >> > > sendException is being used today as
> follow:
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > > 1. Pass to "ProductionExceptionHandler" for
> > any
> > > > > > default
> > > > > > > > or
> > > > > > > > > > > > >> customized
> > > > > > > > > > > > >> >> > > exception handler to handle
> > > > > > > > > > > > >> >> > > 2. Stop collecting offset info or new
> > > exceptions
> > > > > > > > > > > > >> >> > > 3. Check and rethrow exceptions in close(),
> > > > flush()
> > > > > > or
> > > > > > > > new
> > > > > > > > > > send()
> > > > > > > > > > > > >> >> calls
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > > To my understanding, we should still honor
> > the
> > > > > > > commitment
> > > > > > > > > to
> > > > > > > > > > #1
> > > > > > > > > > > > for
> > > > > > > > > > > > >> >> any
> > > > > > > > > > > > >> >> > > user customized implementation. The #2 does
> > not
> > > > > > really
> > > > > > > > > > affect our
> > > > > > > > > > > > >> >> > decision
> > > > > > > > > > > > >> >> > > upon EOS. The #3 here is still valuable as
> it
> > > > could
> > > > > > > help
> > > > > > > > us
> > > > > > > > > > fail
> > > > > > > > > > > > >> fast
> > > > > > > > > > > > >> >> in
> > > > > > > > > > > > >> >> > > new send() instead of waiting to later
> stage
> > of
> > > > > > > > processing.
> > > > > > > > > > In
> > > > > > > > > > > > that
> > > > > > > > > > > > >> >> > sense,
> > > > > > > > > > > > >> >> > > I agree we should continue to record send
> > > > > exceptions
> > > > > > > even
> > > > > > > > > > under
> > > > > > > > > > > > EOS
> > > > > > > > > > > > >> >> case
> > > > > > > > > > > > >> >> > to
> > > > > > > > > > > > >> >> > > ensure the strength of stream side Producer
> > > > logic.
> > > > > On
> > > > > > > the
> > > > > > > > > > safer
> > > > > > > > > > > > >> side,
> > > > > > > > > > > > >> >> we
> > > > > > > > > > > > >> >> > no
> > > > > > > > > > > > >> >> > > longer need to wrap certain fatal
> exceptions
> > > like
> > > > > > > > > > ProducerFenced
> > > > > > > > > > > > as
> > > > > > > > > > > > >> >> > > TaskMigrated, since they should not crash
> the
> > > > > stream
> > > > > > > > thread
> > > > > > > > > > if
> > > > > > > > > > > > >> thrown
> > > > > > > > > > > > >> >> in
> > > > > > > > > > > > >> >> > > raw format, once we adopt the new
> processing
> > > > model
> > > > > in
> > > > > > > the
> > > > > > > > > > send
> > > > > > > > > > > > >> phase.
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > Guozhang
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > On Thu, Dec 17, 2020 at 8:42 PM Boyang
> > Chen <
> > > > > > > > > > > > >> >> > reluctanthero...@gmail.com>
> > > > > > > > > > > > >> >> > > > wrote:
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > > Thanks for everyone's feedback so far.
> I
> > > have
> > > > > > > > polished
> > > > > > > > > > the
> > > > > > > > > > > > KIP
> > > > > > > > > > > > >> >> after
> > > > > > > > > > > > >> >> > > > > offline discussion with some folks
> > working
> > > on
> > > > > EOS
> > > > > > > to
> > > > > > > > > > make the
> > > > > > > > > > > > >> >> > exception
> > > > > > > > > > > > >> >> > > > > handling more lightweight. The
> essential
> > > > change
> > > > > > is
> > > > > > > > that
> > > > > > > > > > we
> > > > > > > > > > > > are
> > > > > > > > > > > > >> not
> > > > > > > > > > > > >> >> > > > > inventing a new intermediate exception
> > > type,
> > > > > but
> > > > > > > > > instead
> > > > > > > > > > > > >> >> separating a
> > > > > > > > > > > > >> >> > > > full
> > > > > > > > > > > > >> >> > > > > transaction into two phases:
> > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > >> >> > > > > 1. The data transmission phase
> > > > > > > > > > > > >> >> > > > > 2. The commit phase
> > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > >> >> > > > > This way for any exception thrown from
> > > phase
> > > > 1,
> > > > > > > will
> > > > > > > > be
> > > > > > > > > > an
> > > > > > > > > > > > >> >> indicator
> > > > > > > > > > > > >> >> > in
> > > > > > > > > > > > >> >> > > > > phase 2 whether we should do commit or
> > > abort,
> > > > > and
> > > > > > > > from
> > > > > > > > > > now on
> > > > > > > > > > > > >> >> > > > > `commitTransaction` should only throw
> > fatal
> > > > > > > > exceptions,
> > > > > > > > > > > > >> similar to
> > > > > > > > > > > > >> >> > > > > `abortTransaction`, so that any
> > > > KafkaException
> > > > > > > caught
> > > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > >> >> commit
> > > > > > > > > > > > >> >> > > phase
> > > > > > > > > > > > >> >> > > > > will be definitely fatal to crash the
> > app.
> > > > For
> > > > > > more
> > > > > > > > > > advanced
> > > > > > > > > > > > >> users
> > > > > > > > > > > > >> >> > such
> > > > > > > > > > > > >> >> > > > as
> > > > > > > > > > > > >> >> > > > > Streams, we have the ability to further
> > > wrap
> > > > > > > selected
> > > > > > > > > > types
> > > > > > > > > > > > of
> > > > > > > > > > > > >> >> fatal
> > > > > > > > > > > > >> >> > > > > exceptions to trigger task migration if
> > > > > > necessary.
> > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > >> >> > > > > More details in the KIP, feel free to
> > take
> > > > > > another
> > > > > > > > > look,
> > > > > > > > > > > > >> thanks!
> > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > >> >> > > > > On Thu, Dec 17, 2020 at 8:36 PM Boyang
> > > Chen <
> > > > > > > > > > > > >> >> > > reluctanthero...@gmail.com>
> > > > > > > > > > > > >> >> > > > > wrote:
> > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > >> >> > > > > > Thanks Bruno for the feedback.
> > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > >> >> > > > > > On Mon, Dec 7, 2020 at 5:26 AM Bruno
> > > > Cadonna
> > > > > <
> > > > > > > > > > > > >> >> br...@confluent.io>
> > > > > > > > > > > > >> >> > > > wrote:
> > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > >> >> > > > > >> Thanks Boyang for the KIP!
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> Like Matthias, I do also not know
> the
> > > > > producer
> > > > > > > > > > internal
> > > > > > > > > > > > well
> > > > > > > > > > > > >> >> > enough
> > > > > > > > > > > > >> >> > > to
> > > > > > > > > > > > >> >> > > > > >> comment on the categorization.
> > However,
> > > I
> > > > > > think
> > > > > > > > > > having a
> > > > > > > > > > > > >> super
> > > > > > > > > > > > >> >> > > > exception
> > > > > > > > > > > > >> >> > > > > >> (e.g. RetriableException) that
> > encodes
> > > if
> > > > > an
> > > > > > > > > > exception is
> > > > > > > > > > > > >> >> fatal
> > > > > > > > > > > > >> >> > or
> > > > > > > > > > > > >> >> > > > not
> > > > > > > > > > > > >> >> > > > > >> is cleaner, better understandable
> and
> > > less
> > > > > > > > > > error-prone,
> > > > > > > > > > > > >> because
> > > > > > > > > > > > >> >> > > > ideally
> > > > > > > > > > > > >> >> > > > > >> when you add a new non-fatal
> exception
> > > in
> > > > > > future
> > > > > > > > you
> > > > > > > > > > just
> > > > > > > > > > > > >> need
> > > > > > > > > > > > >> >> to
> > > > > > > > > > > > >> >> > > > think
> > > > > > > > > > > > >> >> > > > > >> about letting it inherit from the
> > super
> > > > > > > exception
> > > > > > > > > and
> > > > > > > > > > all
> > > > > > > > > > > > >> the
> > > > > > > > > > > > >> >> rest
> > > > > > > > > > > > >> >> > > of
> > > > > > > > > > > > >> >> > > > > >> the code will just behave correctly
> > > > without
> > > > > > the
> > > > > > > > need
> > > > > > > > > > to
> > > > > > > > > > > > wrap
> > > > > > > > > > > > >> >> the
> > > > > > > > > > > > >> >> > new
> > > > > > > > > > > > >> >> > > > > >> exception into another exception
> each
> > > time
> > > > > it
> > > > > > is
> > > > > > > > > > thrown
> > > > > > > > > > > > >> (maybe
> > > > > > > > > > > > >> >> it
> > > > > > > > > > > > >> >> > is
> > > > > > > > > > > > >> >> > > > > >> thrown at different location in the
> > > code).
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> As far as I understand the following
> > > > > statement
> > > > > > > > from
> > > > > > > > > > your
> > > > > > > > > > > > >> >> previous
> > > > > > > > > > > > >> >> > > > e-mail
> > > > > > > > > > > > >> >> > > > > >> is the reason that currently such a
> > > super
> > > > > > > > exception
> > > > > > > > > > is not
> > > > > > > > > > > > >> >> > possible:
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> "Right now we have
> RetriableException
> > > > type,
> > > > > if
> > > > > > > we
> > > > > > > > > are
> > > > > > > > > > > > going
> > > > > > > > > > > > >> to
> > > > > > > > > > > > >> >> > add a
> > > > > > > > > > > > >> >> > > > > >> `ProducerRetriableException` type,
> we
> > > have
> > > > > to
> > > > > > > put
> > > > > > > > > > this new
> > > > > > > > > > > > >> >> > interface
> > > > > > > > > > > > >> >> > > > as
> > > > > > > > > > > > >> >> > > > > >> the parent of the
> RetriableException,
> > > > > because
> > > > > > > not
> > > > > > > > > all
> > > > > > > > > > > > thrown
> > > > > > > > > > > > >> >> > > non-fatal
> > > > > > > > > > > > >> >> > > > > >> exceptions are `retriable` in
> general
> > > for
> > > > > > > > producer"
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> In the list of exceptions in your
> > KIP, I
> > > > > found
> > > > > > > > > > non-fatal
> > > > > > > > > > > > >> >> > exceptions
> > > > > > > > > > > > >> >> > > > that
> > > > > > > > > > > > >> >> > > > > >> do not inherit from
> > RetriableException.
> > > I
> > > > > > guess
> > > > > > > > > those
> > > > > > > > > > are
> > > > > > > > > > > > >> the
> > > > > > > > > > > > >> >> ones
> > > > > > > > > > > > >> >> > > you
> > > > > > > > > > > > >> >> > > > > >> are referring to in your statement:
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> InvalidProducerEpochException
> > > > > > > > > > > > >> >> > > > > >> InvalidPidMappingException
> > > > > > > > > > > > >> >> > > > > >> TransactionAbortedException
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> All of those exceptions are
> non-fatal
> > > and
> > > > do
> > > > > > not
> > > > > > > > > > inherit
> > > > > > > > > > > > >> from
> > > > > > > > > > > > >> >> > > > > >> RetriableException. Is there a
> reason
> > > for
> > > > > > that?
> > > > > > > If
> > > > > > > > > > they
> > > > > > > > > > > > >> >> depended
> > > > > > > > > > > > >> >> > > from
> > > > > > > > > > > > >> >> > > > > >> RetriableException we would be a bit
> > > > closer
> > > > > > to a
> > > > > > > > > super
> > > > > > > > > > > > >> >> exception I
> > > > > > > > > > > > >> >> > > > > >> mention above.
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> The reason is that sender may catch
> > > those
> > > > > > > > exceptions
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > >> >> > > > > > ProduceResponse, and it currently
> does
> > > > > infinite
> > > > > > > > > > > > >> >> > > > > > retries on RetriableException. To
> make
> > > sure
> > > > > we
> > > > > > > > could
> > > > > > > > > > > > trigger
> > > > > > > > > > > > >> the
> > > > > > > > > > > > >> >> > > > > > abortTransaction() by catching
> > non-fatal
> > > > > thrown
> > > > > > > > > > > > >> >> > > > > > exceptions and reinitialize the txn
> > > state,
> > > > we
> > > > > > > chose
> > > > > > > > > > not to
> > > > > > > > > > > > >> let
> > > > > > > > > > > > >> >> > those
> > > > > > > > > > > > >> >> > > > > > exceptions inherit RetriableException
> > so
> > > > that
> > > > > > > > > > > > >> >> > > > > > they won't cause infinite retry on
> > > sender.
> > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > >> >> > > > > >> With OutOfOrderSequenceException and
> > > > > > > > > > > > >> >> UnknownProducerIdException, I
> > > > > > > > > > > > >> >> > > > think
> > > > > > > > > > > > >> >> > > > > >> to understand that their fatality
> > > depends
> > > > on
> > > > > > the
> > > > > > > > > type
> > > > > > > > > > > > (i.e.
> > > > > > > > > > > > >> >> > > > > >> configuration) of the producer. That
> > > makes
> > > > > it
> > > > > > > > > > difficult to
> > > > > > > > > > > > >> >> have a
> > > > > > > > > > > > >> >> > > > super
> > > > > > > > > > > > >> >> > > > > >> exception that encodes the
> > retriability
> > > as
> > > > > > > > mentioned
> > > > > > > > > > > > above.
> > > > > > > > > > > > >> >> Would
> > > > > > > > > > > > >> >> > it
> > > > > > > > > > > > >> >> > > > be
> > > > > > > > > > > > >> >> > > > > >> possible to introduce exceptions
> that
> > > > > inherit
> > > > > > > from
> > > > > > > > > > > > >> >> > > RetriableException
> > > > > > > > > > > > >> >> > > > > >> and that are thrown when those
> > > exceptions
> > > > > are
> > > > > > > > caught
> > > > > > > > > > from
> > > > > > > > > > > > >> the
> > > > > > > > > > > > >> >> > > brokers
> > > > > > > > > > > > >> >> > > > > >> and the type of the producer is such
> > > that
> > > > > the
> > > > > > > > > > exceptions
> > > > > > > > > > > > are
> > > > > > > > > > > > >> >> > > > retriable?
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> Yea, I think in general the
> exception
> > > type
> > > > > > > mixing
> > > > > > > > is
> > > > > > > > > > > > >> difficult
> > > > > > > > > > > > >> >> to
> > > > > > > > > > > > >> >> > > get
> > > > > > > > > > > > >> >> > > > > > them all right. I have already
> proposed
> > > > > another
> > > > > > > > > > solution
> > > > > > > > > > > > >> based
> > > > > > > > > > > > >> >> on
> > > > > > > > > > > > >> >> > my
> > > > > > > > > > > > >> >> > > > > > offline discussion with some folks
> > > working
> > > > on
> > > > > > EOS
> > > > > > > > > > > > >> >> > > > > > to make the handling more
> > straightforward
> > > > for
> > > > > > end
> > > > > > > > > users
> > > > > > > > > > > > >> without
> > > > > > > > > > > > >> >> the
> > > > > > > > > > > > >> >> > > > need
> > > > > > > > > > > > >> >> > > > > > to distinguish exception fatality.
> > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > >> >> > > > > >> Best,
> > > > > > > > > > > > >> >> > > > > >> Bruno
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >> On 04.12.20 19:34, Guozhang Wang
> > wrote:
> > > > > > > > > > > > >> >> > > > > >> > Thanks Boyang for the proposal! I
> > > made a
> > > > > > pass
> > > > > > > > over
> > > > > > > > > > the
> > > > > > > > > > > > >> list
> > > > > > > > > > > > >> >> and
> > > > > > > > > > > > >> >> > > here
> > > > > > > > > > > > >> >> > > > > are
> > > > > > > > > > > > >> >> > > > > >> > some thoughts:
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> > 0) Although this is not part of
> the
> > > > public
> > > > > > > API,
> > > > > > > > I
> > > > > > > > > > think
> > > > > > > > > > > > we
> > > > > > > > > > > > >> >> > should
> > > > > > > > > > > > >> >> > > > make
> > > > > > > > > > > > >> >> > > > > >> sure
> > > > > > > > > > > > >> >> > > > > >> > that the suggested pattern (i.e.
> > user
> > > > can
> > > > > > > always
> > > > > > > > > > call
> > > > > > > > > > > > >> >> abortTxn()
> > > > > > > > > > > > >> >> > > > when
> > > > > > > > > > > > >> >> > > > > >> > handling non-fatal errors) are
> > indeed
> > > > > > > supported.
> > > > > > > > > > E.g. if
> > > > > > > > > > > > >> the
> > > > > > > > > > > > >> >> txn
> > > > > > > > > > > > >> >> > > is
> > > > > > > > > > > > >> >> > > > > >> already
> > > > > > > > > > > > >> >> > > > > >> > aborted by the broker side, then
> > users
> > > > can
> > > > > > > still
> > > > > > > > > > call
> > > > > > > > > > > > >> >> abortTxn
> > > > > > > > > > > > >> >> > > which
> > > > > > > > > > > > >> >> > > > > >> would
> > > > > > > > > > > > >> >> > > > > >> > not throw another exception but
> just
> > > be
> > > > > > > treated
> > > > > > > > > as a
> > > > > > > > > > > > >> no-op.
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> > 1)
> > *ConcurrentTransactionsException*:
> > > I
> > > > > > think
> > > > > > > > this
> > > > > > > > > > error
> > > > > > > > > > > > >> can
> > > > > > > > > > > > >> >> > also
> > > > > > > > > > > > >> >> > > be
> > > > > > > > > > > > >> >> > > > > >> > returned but not documented yet.
> > This
> > > > > should
> > > > > > > be
> > > > > > > > a
> > > > > > > > > > > > >> non-fatal
> > > > > > > > > > > > >> >> > error.
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> > 2) *InvalidTxnStateException*:
> this
> > > > error
> > > > > is
> > > > > > > > > > returned
> > > > > > > > > > > > from
> > > > > > > > > > > > >> >> > broker
> > > > > > > > > > > > >> >> > > > when
> > > > > > > > > > > > >> >> > > > > >> txn
> > > > > > > > > > > > >> >> > > > > >> > state transition failed (e.g. it
> is
> > > > trying
> > > > > > to
> > > > > > > > > > transit to
> > > > > > > > > > > > >> >> > > > > complete-commit
> > > > > > > > > > > > >> >> > > > > >> > while the current state is not
> > > > > > > prepare-commit).
> > > > > > > > > This
> > > > > > > > > > > > error
> > > > > > > > > > > > >> >> could
> > > > > > > > > > > > >> >> > > > > >> indicates
> > > > > > > > > > > > >> >> > > > > >> > a bug on the client internal code
> or
> > > the
> > > > > > > broker
> > > > > > > > > > code,
> > > > > > > > > > > > OR a
> > > > > > > > > > > > >> >> user
> > > > > > > > > > > > >> >> > > > error
> > > > > > > > > > > > >> >> > > > > >> --- a
> > > > > > > > > > > > >> >> > > > > >> > similar error is
> > > > > > > > ConcurrentTransactionsException,
> > > > > > > > > > i.e.
> > > > > > > > > > > > if
> > > > > > > > > > > > >> >> Kafka
> > > > > > > > > > > > >> >> > is
> > > > > > > > > > > > >> >> > > > > >> bug-free
> > > > > > > > > > > > >> >> > > > > >> > these exceptions would only be
> > > returned
> > > > if
> > > > > > > users
> > > > > > > > > > try to
> > > > > > > > > > > > do
> > > > > > > > > > > > >> >> > > something
> > > > > > > > > > > > >> >> > > > > >> wrong,
> > > > > > > > > > > > >> >> > > > > >> > e.g. calling abortTxn right after
> a
> > > > > > commitTxn,
> > > > > > > > > etc.
> > > > > > > > > > So
> > > > > > > > > > > > I'm
> > > > > > > > > > > > >> >> > > thinking
> > > > > > > > > > > > >> >> > > > it
> > > > > > > > > > > > >> >> > > > > >> > should be a non-fatal error
> instead
> > > of a
> > > > > > fatal
> > > > > > > > > > error,
> > > > > > > > > > > > >> wdyt?
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> > 3) *KafkaException*: case i
> > "indicates
> > > > > fatal
> > > > > > > > > > > > transactional
> > > > > > > > > > > > >> >> > > sequence
> > > > > > > > > > > > >> >> > > > > >> > (Fatal)", this is a bit
> conflicting
> > > with
> > > > > the
> > > > > > > > > > > > >> >> > > > *OutOfSequenceException*
> > > > > > > > > > > > >> >> > > > > >> that
> > > > > > > > > > > > >> >> > > > > >> > is treated as non-fatal. I guess
> > your
> > > > > > proposal
> > > > > > > > is
> > > > > > > > > > that
> > > > > > > > > > > > >> >> > > > > >> > OutOfOrderSequenceException would
> be
> > > > > treated
> > > > > > > > > either
> > > > > > > > > > as
> > > > > > > > > > > > >> fatal
> > > > > > > > > > > > >> >> > with
> > > > > > > > > > > > >> >> > > > > >> > transactional producer, or
> non-fatal
> > > > with
> > > > > > > > > idempotent
> > > > > > > > > > > > >> >> producer,
> > > > > > > > > > > > >> >> > is
> > > > > > > > > > > > >> >> > > > that
> > > > > > > > > > > > >> >> > > > > >> > right? If the producer is only
> > > > configured
> > > > > > with
> > > > > > > > > > > > idempotency
> > > > > > > > > > > > >> >> but
> > > > > > > > > > > > >> >> > not
> > > > > > > > > > > > >> >> > > > > >> > transaction, then throwing a
> > > > > > > > > > > > >> >> TransactionStateCorruptedException
> > > > > > > > > > > > >> >> > > for
> > > > > > > > > > > > >> >> > > > > >> > non-fatal errors would be
> confusing
> > > > since
> > > > > > > users
> > > > > > > > > are
> > > > > > > > > > not
> > > > > > > > > > > > >> using
> > > > > > > > > > > > >> >> > > > > >> transactions
> > > > > > > > > > > > >> >> > > > > >> > at all.. So I suggest we always
> > throw
> > > > > > > > > > > > >> OutOfSequenceException
> > > > > > > > > > > > >> >> > as-is
> > > > > > > > > > > > >> >> > > > > (i.e.
> > > > > > > > > > > > >> >> > > > > >> > not wrapped) no matter how the
> > > producer
> > > > is
> > > > > > > > > > configured,
> > > > > > > > > > > > and
> > > > > > > > > > > > >> >> let
> > > > > > > > > > > > >> >> > the
> > > > > > > > > > > > >> >> > > > > >> caller
> > > > > > > > > > > > >> >> > > > > >> > decide how to handle it based on
> > > whether
> > > > > it
> > > > > > is
> > > > > > > > > only
> > > > > > > > > > > > >> >> idempotent
> > > > > > > > > > > > >> >> > or
> > > > > > > > > > > > >> >> > > > > >> > transactional itself.
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> > 4) Besides all the txn APIs, the
> > > > `send()`
> > > > > > > > > callback /
> > > > > > > > > > > > >> future
> > > > > > > > > > > > >> >> can
> > > > > > > > > > > > >> >> > > also
> > > > > > > > > > > > >> >> > > > > >> throw
> > > > > > > > > > > > >> >> > > > > >> > txn-related exceptions, I think
> this
> > > KIP
> > > > > > > should
> > > > > > > > > also
> > > > > > > > > > > > cover
> > > > > > > > > > > > >> >> this
> > > > > > > > > > > > >> >> > > API
> > > > > > > > > > > > >> >> > > > as
> > > > > > > > > > > > >> >> > > > > >> well?
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> > 5) This is related to 1/2) above:
> > > > > sometimes
> > > > > > > > those
> > > > > > > > > > > > >> non-fatal
> > > > > > > > > > > > >> >> > errors
> > > > > > > > > > > > >> >> > > > > like
> > > > > > > > > > > > >> >> > > > > >> > ConcurrentTxn or InvalidTxnState
> are
> > > not
> > > > > due
> > > > > > > to
> > > > > > > > > the
> > > > > > > > > > > > state
> > > > > > > > > > > > >> >> being
> > > > > > > > > > > > >> >> > > > > >> corrupted
> > > > > > > > > > > > >> >> > > > > >> > at the broker side, but maybe
> users
> > > are
> > > > > > doing
> > > > > > > > > > something
> > > > > > > > > > > > >> >> wrong.
> > > > > > > > > > > > >> >> > So
> > > > > > > > > > > > >> >> > > > I'm
> > > > > > > > > > > > >> >> > > > > >> > wondering if we should further
> > > > distinguish
> > > > > > > those
> > > > > > > > > > > > non-fatal
> > > > > > > > > > > > >> >> > errors
> > > > > > > > > > > > >> >> > > > > >> between
> > > > > > > > > > > > >> >> > > > > >> > a) those that are caused by Kafka
> > > > itself,
> > > > > > > e.g. a
> > > > > > > > > > broker
> > > > > > > > > > > > >> timed
> > > > > > > > > > > > >> >> > out
> > > > > > > > > > > > >> >> > > > and
> > > > > > > > > > > > >> >> > > > > >> > aborted a txn and later an endTxn
> > > > request
> > > > > is
> > > > > > > > > > received,
> > > > > > > > > > > > >> and b)
> > > > > > > > > > > > >> >> > the
> > > > > > > > > > > > >> >> > > > > user's
> > > > > > > > > > > > >> >> > > > > >> > API call pattern is incorrect,
> > causing
> > > > the
> > > > > > > > request
> > > > > > > > > > to be
> > > > > > > > > > > > >> >> > rejected
> > > > > > > > > > > > >> >> > > > with
> > > > > > > > > > > > >> >> > > > > >> an
> > > > > > > > > > > > >> >> > > > > >> > error code from the broker.
> > > > > > > > > > > > >> >> *TransactionStateCorruptedException*
> > > > > > > > > > > > >> >> > > > feels
> > > > > > > > > > > > >> >> > > > > >> to
> > > > > > > > > > > > >> >> > > > > >> > me more like for case a), but not
> > case
> > > > b).
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> > Guozhang
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> > On Wed, Dec 2, 2020 at 4:50 PM
> > Boyang
> > > > > Chen <
> > > > > > > > > > > > >> >> > > > > reluctanthero...@gmail.com>
> > > > > > > > > > > > >> >> > > > > >> > wrote:
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> >> Thanks Matthias, I think your
> > > proposal
> > > > > > makes
> > > > > > > > > sense
> > > > > > > > > > as
> > > > > > > > > > > > >> well,
> > > > > > > > > > > > >> >> on
> > > > > > > > > > > > >> >> > > the
> > > > > > > > > > > > >> >> > > > > pro
> > > > > > > > > > > > >> >> > > > > >> side
> > > > > > > > > > > > >> >> > > > > >> >> we could have a universally
> agreed
> > > > > > exception
> > > > > > > > type
> > > > > > > > > > to be
> > > > > > > > > > > > >> >> caught
> > > > > > > > > > > > >> >> > by
> > > > > > > > > > > > >> >> > > > the
> > > > > > > > > > > > >> >> > > > > >> user,
> > > > > > > > > > > > >> >> > > > > >> >> without having an extra layer on
> > top
> > > of
> > > > > the
> > > > > > > > > actual
> > > > > > > > > > > > >> >> exceptions.
> > > > > > > > > > > > >> >> > I
> > > > > > > > > > > > >> >> > > > > could
> > > > > > > > > > > > >> >> > > > > >> see
> > > > > > > > > > > > >> >> > > > > >> >> some issue on downsides:
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >> >> 1. The exception hierarchy will
> be
> > > more
> > > > > > > > complex.
> > > > > > > > > > Right
> > > > > > > > > > > > >> now
> > > > > > > > > > > > >> >> we
> > > > > > > > > > > > >> >> > > have
> > > > > > > > > > > > >> >> > > > > >> >> RetriableException type, if we
> are
> > > > going
> > > > > to
> > > > > > > > add a
> > > > > > > > > > > > >> >> > > > > >> >> `ProducerRetriableException`
> type,
> > we
> > > > > have
> > > > > > to
> > > > > > > > put
> > > > > > > > > > this
> > > > > > > > > > > > >> new
> > > > > > > > > > > > >> >> > > > interface
> > > > > > > > > > > > >> >> > > > > >> as the
> > > > > > > > > > > > >> >> > > > > >> >> parent of the RetriableException,
> > > > because
> > > > > > not
> > > > > > > > all
> > > > > > > > > > > > thrown
> > > > > > > > > > > > >> >> > > non-fatal
> > > > > > > > > > > > >> >> > > > > >> >> exceptions are `retriable` in
> > general
> > > > for
> > > > > > > > > > producer, for
> > > > > > > > > > > > >> >> example
> > > > > > > > > > > > >> >> > > > > >> >> <
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> >
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/e275742f850af4a1b79b0d1bd1ac9a1d2e89c64e/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L745
> > > > > > > > > > > > >> >> > > > > >> >>> .
> > > > > > > > > > > > >> >> > > > > >> >> We could have an empty interface
> > > > > > > > > > > > >> >> `ProducerRetriableException`
> > > > > > > > > > > > >> >> > to
> > > > > > > > > > > > >> >> > > > let
> > > > > > > > > > > > >> >> > > > > >> all
> > > > > > > > > > > > >> >> > > > > >> >> the thrown exceptions implement
> for
> > > > sure,
> > > > > > but
> > > > > > > > > it's
> > > > > > > > > > a
> > > > > > > > > > > > bit
> > > > > > > > > > > > >> >> > > unorthodox
> > > > > > > > > > > > >> >> > > > > in
> > > > > > > > > > > > >> >> > > > > >> the
> > > > > > > > > > > > >> >> > > > > >> >> way we deal with exceptions in
> > > general.
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >> >> 2. There are cases where we
> throw a
> > > > > > > > > KafkaException
> > > > > > > > > > > > >> wrapping
> > > > > > > > > > > > >> >> > > another
> > > > > > > > > > > > >> >> > > > > >> >> KafkaException as either fatal or
> > > > > > non-fatal.
> > > > > > > If
> > > > > > > > > we
> > > > > > > > > > use
> > > > > > > > > > > > an
> > > > > > > > > > > > >> >> > > interface
> > > > > > > > > > > > >> >> > > > > to
> > > > > > > > > > > > >> >> > > > > >> >> solve #1, it is also required to
> > > > > implement
> > > > > > > > > another
> > > > > > > > > > > > >> bloated
> > > > > > > > > > > > >> >> > > > exception
> > > > > > > > > > > > >> >> > > > > >> class
> > > > > > > > > > > > >> >> > > > > >> >> which could replace
> KafkaException
> > > > type,
> > > > > as
> > > > > > > we
> > > > > > > > > > couldn't
> > > > > > > > > > > > >> mark
> > > > > > > > > > > > >> >> > > > > >> KafkaException
> > > > > > > > > > > > >> >> > > > > >> >> as retriable for sure.
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >> >> 3. In terms of the encapsulation,
> > > > > wrapping
> > > > > > > > means
> > > > > > > > > we
> > > > > > > > > > > > could
> > > > > > > > > > > > >> >> limit
> > > > > > > > > > > > >> >> > > the
> > > > > > > > > > > > >> >> > > > > >> scope
> > > > > > > > > > > > >> >> > > > > >> >> of affection to the producer
> only,
> > > > which
> > > > > is
> > > > > > > > > > important
> > > > > > > > > > > > >> since
> > > > > > > > > > > > >> >> we
> > > > > > > > > > > > >> >> > > > don't
> > > > > > > > > > > > >> >> > > > > >> want
> > > > > > > > > > > > >> >> > > > > >> >> shared exception types to
> > implement a
> > > > > > > > > > producer-related
> > > > > > > > > > > > >> >> > interface,
> > > > > > > > > > > > >> >> > > > > such
> > > > > > > > > > > > >> >> > > > > >> >> as
> > UnknownTopicOrPartitionException.
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >> >> Best,
> > > > > > > > > > > > >> >> > > > > >> >> Boyang
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >> >> On Wed, Dec 2, 2020 at 3:38 PM
> > > Matthias
> > > > > J.
> > > > > > > Sax
> > > > > > > > <
> > > > > > > > > > > > >> >> > mj...@apache.org
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > > >> wrote:
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >> >>> Thanks for the KIP Boyang!
> > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > >> >> > > > > >> >>> Overall, categorizing exceptions
> > > > makes a
> > > > > > lot
> > > > > > > > of
> > > > > > > > > > sense.
> > > > > > > > > > > > >> As I
> > > > > > > > > > > > >> >> > > don't
> > > > > > > > > > > > >> >> > > > > know
> > > > > > > > > > > > >> >> > > > > >> >>> the producer internals well
> > enough,
> > > I
> > > > > > cannot
> > > > > > > > > > comment
> > > > > > > > > > > > on
> > > > > > > > > > > > >> the
> > > > > > > > > > > > >> >> > > > > >> >>> categorization in detail though.
> > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > >> >> > > > > >> >>> What I am wondering is, if we
> > should
> > > > > > > introduce
> > > > > > > > > an
> > > > > > > > > > > > >> exception
> > > > > > > > > > > > >> >> > > > > interface
> > > > > > > > > > > > >> >> > > > > >> >>> that non-fatal exception would
> > > > implement
> > > > > > > > instead
> > > > > > > > > > of
> > > > > > > > > > > > >> >> creating a
> > > > > > > > > > > > >> >> > > new
> > > > > > > > > > > > >> >> > > > > >> class
> > > > > > > > > > > > >> >> > > > > >> >>> that will wrap non-fatal
> > exceptions?
> > > > > What
> > > > > > > > would
> > > > > > > > > > be the
> > > > > > > > > > > > >> >> > pros/cons
> > > > > > > > > > > > >> >> > > > for
> > > > > > > > > > > > >> >> > > > > >> >>> both designs?
> > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > >> >> > > > > >> >>> -Matthias
> > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > >> >> > > > > >> >>> On 12/2/20 11:35 AM, Boyang Chen
> > > > wrote:
> > > > > > > > > > > > >> >> > > > > >> >>>> Hey there,
> > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > >> >> > > > > >> >>>> I would like to start a
> > discussion
> > > > > thread
> > > > > > > for
> > > > > > > > > > > > KIP-691:
> > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> >
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-691%3A+Enhance+Transactional+Producer+Exception+Handling
> > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > >> >> > > > > >> >>>> The KIP is aiming to simplify
> the
> > > > > > exception
> > > > > > > > > > handling
> > > > > > > > > > > > >> logic
> > > > > > > > > > > > >> >> > for
> > > > > > > > > > > > >> >> > > > > >> >>>> transactional Producer users by
> > > > > > classifying
> > > > > > > > > > fatal and
> > > > > > > > > > > > >> >> > non-fatal
> > > > > > > > > > > > >> >> > > > > >> >>> exceptions
> > > > > > > > > > > > >> >> > > > > >> >>>> and throw them correspondingly
> > for
> > > > > easier
> > > > > > > > catch
> > > > > > > > > > and
> > > > > > > > > > > > >> retry.
> > > > > > > > > > > > >> >> > Let
> > > > > > > > > > > > >> >> > > me
> > > > > > > > > > > > >> >> > > > > >> know
> > > > > > > > > > > > >> >> > > > > >> >>> what
> > > > > > > > > > > > >> >> > > > > >> >>>> you think.
> > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > >> >> > > > > >> >>>> Best,
> > > > > > > > > > > > >> >> > > > > >> >>>> Boyang
> > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > > > --
> > > > > > > > > > > > >> >> > > > -- Guozhang
> > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > >> >> >
> > > > > > > > > > > > >> >> >
> > > > > > > > > > > > >> >> > --
> > > > > > > > > > > > >> >> > -- Guozhang
> > > > > > > > > > > > >> >> >
> > > > > > > > > > > > >> >>
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > --
> > > > > > > > > > > > >> > -- Guozhang
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> --
> > > > > > > > > > > > >> -- Guozhang
> > > > > > > > > > > > >>
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > -- Guozhang
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>

Reply via email to