Thanks Jason!

On Tue, Feb 2, 2021 at 11:10 PM Jason Gustafson <ja...@confluent.io> wrote:

> Hi Boyang, sounds good to me.
>
> -Jason
>
> On Mon, Feb 1, 2021 at 9:32 AM Boyang Chen <reluctanthero...@gmail.com>
> wrote:
>
> > Hey Jason and Chia-Ping,
> >
> > looking at the discussion thread for KIP-706, we are still in discussion
> > with the final API version, so I made the decision to decouple the two
> KIPs
> > so that this one could move forward. To be specific, we are still
> > introducing ProduceFailed error but as a `near-future` improvement, which
> > should be implemented right after KIP-706 is finalized.
> >
> > Let me know your thoughts.
> >
> > Best,
> > Boyang
> >
> > On Sat, Jan 30, 2021 at 11:57 AM Chia-Ping Tsai <chia7...@apache.org>
> > wrote:
> >
> > > > a richer return type. Let me know if you are good with this, and
> > whether
> > > > Chia-Ping is also happy here :)
> > >
> > > sure. I'd like to join this bigger party :)
> > >
> > > On 2021/01/30 01:03:57, Boyang Chen <reluctanthero...@gmail.com>
> wrote:
> > > > This is a great proposal Jason, I already integrated KIP-691 with
> > KIP-706
> > > > template to provide a new ProduceFailedException as well as a new
> > > > Producer#produce API with CompletionStage as old send API replacement
> > > with
> > > > a richer return type. Let me know if you are good with this, and
> > whether
> > > > Chia-Ping is also happy here :)
> > > >
> > > > Best,
> > > > Boyang
> > > >
> > > > On Fri, Jan 29, 2021 at 4:23 PM Jason Gustafson <ja...@confluent.io>
> > > wrote:
> > > >
> > > > > Hi Boyang,
> > > > >
> > > > > Ok, if we are going to ask all users to update their code anyway,
> > > maybe we
> > > > > can get more bang out of this.
> > > > >
> > > > > One of the remaining problems is that a user will only get the
> richer
> > > > > information if they are supplying a Callback. If they wait on the
> > > future,
> > > > > then they will get the usual exception. Our hands are somewhat tied
> > > here
> > > > > because of compatibility. One thing we have wanted to do for a long
> > > time is
> > > > > introduce a new send method which returns
> > > > > `CompletableFuture<RecordMetadata>` instead. If we did so, then we
> > > wouldn't
> > > > > need to change `Callback` at all. Instead we could make the
> contract
> > > of the
> > > > > new api that it will always return `SendFailedException`, which
> could
> > > > > provide a method to get the failure type.
> > > > >
> > > > > I think Chia-Ping has been thinking about this:
> > > > > https://issues.apache.org/jira/browse/KAFKA-12227. Is there a way
> to
> > > > > consolidate our efforts in order to avoid unnecessary noise for
> > users?
> > > It
> > > > > would be awesome to see a more modern API which also addresses the
> > > > > confusing error handling.
> > > > >
> > > > > -Jason
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Jan 29, 2021 at 3:48 PM Boyang Chen <
> > > reluctanthero...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks Jason, my hope is that we could streamline the callback
> > > function
> > > > > > signature, so that in the next major release (3.0),
> > > > > > onCompletion(RecordMetadata metadata, SendFailure sendFailure)
> will
> > > > > become
> > > > > > the only API to be implemented.
> > > > > >
> > > > > > On Fri, Jan 29, 2021 at 3:42 PM Jason Gustafson <
> > ja...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Boyang,
> > > > > > >
> > > > > > > Do you think it's necessary to deprecate the old `onCompletion`
> > > > > > callback? I
> > > > > > > was thinking there's probably no harm leaving it around. Users
> > > might
> > > > > not
> > > > > > > care about the failure type. Other than that, it looks good to
> > me.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Jason
> > > > > > >
> > > > > > > On Thu, Jan 28, 2021 at 6:58 PM Boyang Chen <
> > > > > reluctanthero...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hey Guozhang,
> > > > > > > >
> > > > > > > > the CommitFailedException wrapping would cover all the
> > non-fatal
> > > > > > > exceptions
> > > > > > > > as we listed out in the KIP, generally speaking any exception
> > > that
> > > > > > could
> > > > > > > > recover safely by calling abortTxn should be wrapped.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Boyang
> > > > > > > >
> > > > > > > > On Thu, Jan 28, 2021 at 5:22 PM Guozhang Wang <
> > > wangg...@gmail.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey Boyang,
> > > > > > > > >
> > > > > > > > > I think maybe there's something cracking here :) I'm just
> > > asking
> > > > > for
> > > > > > > > > clarifications that as of today, which non-fatal exceptions
> > the
> > > > > newly
> > > > > > > > > introduced CommitFailedException would cover, and it seems
> to
> > > be
> > > > > only
> > > > > > > 1)
> > > > > > > > > unknown pid, 2) invalid pid mapping, and 3) concurrent
> > > > > transactions.
> > > > > > Is
> > > > > > > > > that correct?
> > > > > > > > >
> > > > > > > > > On Thu, Jan 28, 2021 at 5:06 PM Boyang Chen <
> > > > > > > reluctanthero...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hey Guozhang, I think TimeoutException would not be
> covered
> > > here
> > > > > as
> > > > > > > it
> > > > > > > > > > potentially has a risk of hitting an illegal state on the
> > > broker
> > > > > > side
> > > > > > > > > when
> > > > > > > > > > the previous commit was actually successful. Users should
> > > try to
> > > > > > > > increase
> > > > > > > > > > their max.block.ms to avoid hitting the timeout as a
> base
> > > > > > > suggestion,
> > > > > > > > > > which
> > > > > > > > > > is discussed in the KIP.
> > > > > > > > > >
> > > > > > > > > > As for the umbrella exception, I agree it has values to
> > some
> > > > > > extent,
> > > > > > > > but
> > > > > > > > > in
> > > > > > > > > > terms of adoption difficulty, additional exception types
> > > usually
> > > > > > make
> > > > > > > > the
> > > > > > > > > > error handling more complex than simplified, and we are
> > > doing our
> > > > > > > best
> > > > > > > > to
> > > > > > > > > > avoid compatibility issues caused by wrapping previously
> > > thrown
> > > > > raw
> > > > > > > > > > exceptions that invalidates user error handling. And I'm
> > > also not
> > > > > > in
> > > > > > > > > favor
> > > > > > > > > > of our exception hierarchy today where all exceptions are
> > > > > > subclasses
> > > > > > > of
> > > > > > > > > > KafkaException, and we already mixed its handling with
> > other
> > > > > > specific
> > > > > > > > > > exception types that could be caught as superclass
> > > > > KafkaException.
> > > > > > > That
> > > > > > > > > > could be addressed in a separate proposal if we see
> > > necessary.
> > > > > > > > > >
> > > > > > > > > > Boyang
> > > > > > > > > >
> > > > > > > > > > On Thu, Jan 28, 2021 at 3:28 PM Guozhang Wang <
> > > > > wangg...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks Jason for the suggestion, that looks good to me
> > too.
> > > > > > > > > > >
> > > > > > > > > > > Regarding the non-fatal exceptions wrapped as
> > > CommitFailed, I
> > > > > > would
> > > > > > > > > like
> > > > > > > > > > to
> > > > > > > > > > > clarify if we would cover all the following cases: 1)
> > > timeout,
> > > > > 2)
> > > > > > > > > unknown
> > > > > > > > > > > pid, 3) invalid pid mapping, 4) concurrent
> transactions?
> > > > > > > > > > >
> > > > > > > > > > > BTW I think it still makes sense to use an umbrella
> > > exception
> > > > > in
> > > > > > > case
> > > > > > > > > in
> > > > > > > > > > > the future we add more non-fatal cases even though
> today
> > we
> > > > > only
> > > > > > > > have a
> > > > > > > > > > > few.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Guozhang
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jan 28, 2021 at 1:08 PM Boyang Chen <
> > > > > > > > > reluctanthero...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks Jason, I agree with the proposed solution
> here,
> > > will
> > > > > > > update
> > > > > > > > > the
> > > > > > > > > > > KIP.
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Jan 28, 2021 at 10:52 AM Jason Gustafson <
> > > > > > > > ja...@confluent.io
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Boyang,
> > > > > > > > > > > > >
> > > > > > > > > > > > > It seems like a reasonable suggestion. I wonder if
> a
> > > flag
> > > > > is
> > > > > > > > > > sufficient
> > > > > > > > > > > > > though. The current `Callback` documentation treats
> > > "fatal"
> > > > > > > > errors
> > > > > > > > > > from
> > > > > > > > > > > > the
> > > > > > > > > > > > > perspective of the individual message that was
> sent.
> > > > > > > > > > > > >
> > > > > > > > > > > > > ```
> > > > > > > > > > > > >     *                  Non-Retriable exceptions
> > > (fatal, the
> > > > > > > > message
> > > > > > > > > > > will
> > > > > > > > > > > > > never be sent):
> > > > > > > > > > > > > ```
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, we also have fatal errors from the
> > > perspective of
> > > > > > the
> > > > > > > > > > > > transaction
> > > > > > > > > > > > > (e.g. when the producer gets fenced). Perhaps that
> > > suggests
> > > > > > we
> > > > > > > > need
> > > > > > > > > > > > > something richer than a boolean flag:
> > > > > > > > > > > > >
> > > > > > > > > > > > > At a high level, I think the following cases are
> > > possible:
> > > > > > > > > > > > >
> > > > > > > > > > > > > - message rejected (e.g. message too large, invalid
> > > topic)
> > > > > > > > > > > > > - delivery failed after retries/delivery timeout
> > (e.g.
> > > > > > timeout,
> > > > > > > > crc
> > > > > > > > > > > > error,
> > > > > > > > > > > > > not enough replicas)
> > > > > > > > > > > > > - transaction failed (e.g. producer fenced, invalid
> > > > > > transaction
> > > > > > > > > > state)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Perhaps instead we can have a type like the
> > following:
> > > > > > > > > > > > >
> > > > > > > > > > > > > class SendFailure {
> > > > > > > > > > > > >   FailureType failureType;
> > > > > > > > > > > > >   Exception cause;
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > enum FailureType {
> > > > > > > > > > > > >   MESSSAGE_REJECTED, DELIVERY_FAILED,
> > > TRANSACTION_FAILED
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > (Not married to any of these names, just a starting
> > > point.)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Then we add a new `onCompletion` as you've
> suggested:
> > > > > > > > > > > > >
> > > > > > > > > > > > > default void onCompletion(RecordMetadata metadata,
> > > > > > SendFailure
> > > > > > > > > > > failure) {
> > > > > > > > > > > > >   this.onCompletion(metadata, failure.cause());
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > This would give streams and other applications
> enough
> > > > > > > information
> > > > > > > > > to
> > > > > > > > > > > know
> > > > > > > > > > > > > whether the message can be retried and whether the
> > > > > > transaction
> > > > > > > > can
> > > > > > > > > be
> > > > > > > > > > > > > aborted.
> > > > > > > > > > > > >
> > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > >
> > > > > > > > > > > > > -Jason
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Jan 27, 2021 at 9:51 PM Boyang Chen <
> > > > > > > > > > > reluctanthero...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks Jason for the thoughts.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Jan 27, 2021 at 11:52 AM Jason Gustafson
> <
> > > > > > > > > > ja...@confluent.io
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Boyang,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the iterations here. I think this is
> > > > > something
> > > > > > > we
> > > > > > > > > > should
> > > > > > > > > > > > > have
> > > > > > > > > > > > > > > done a long time ago. It sounds like there are
> > two
> > > API
> > > > > > > > changes
> > > > > > > > > > > here:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. We are introducing the
> `CommitFailedException`
> > > to
> > > > > wrap
> > > > > > > > > > abortable
> > > > > > > > > > > > > > > errors that are raised from
> `commitTransaction`.
> > > This
> > > > > > > sounds
> > > > > > > > > fine
> > > > > > > > > > > to
> > > > > > > > > > > > > me.
> > > > > > > > > > > > > > As
> > > > > > > > > > > > > > > far as I know, the only case we might need this
> > is
> > > when
> > > > > > we
> > > > > > > > add
> > > > > > > > > > > > support
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > let producers recover from coordinator
> timeouts.
> > > Are
> > > > > > there
> > > > > > > > any
> > > > > > > > > > > > others?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think the purpose here is to ensure non-fatal
> > > > > > exceptions
> > > > > > > > are
> > > > > > > > > > > > unified
> > > > > > > > > > > > > > under the same
> > > > > > > > > > > > > > exception umbrella, to make the proceeding to
> abort
> > > any
> > > > > > > ongoing
> > > > > > > > > > > > > transaction
> > > > > > > > > > > > > > much easier.
> > > > > > > > > > > > > > I don't think `coordinator timeouts` is the only
> > > case to
> > > > > > > > recover
> > > > > > > > > > > here,
> > > > > > > > > > > > > > since we have other
> > > > > > > > > > > > > > non-fatal exceptions such as UnknownPid.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. We are wrapping non-fatal errors raised from
> > > `send` as
> > > > > > > > > > > > > `KafkaException`.
> > > > > > > > > > > > > > > The motivation for this is less clear to me and
> > it
> > > > > > doesn't
> > > > > > > > look
> > > > > > > > > > > like
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > example from the KIP depends on it. My concern
> > > here is
> > > > > > > > > > > compatibility.
> > > > > > > > > > > > > > > Currently we have the following documentation
> for
> > > the
> > > > > > > > > `Callback`
> > > > > > > > > > > API:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > >      *                  Non-Retriable
> exceptions
> > > > > (fatal,
> > > > > > > the
> > > > > > > > > > > message
> > > > > > > > > > > > > will
> > > > > > > > > > > > > > > never be sent):
> > > > > > > > > > > > > > >      *
> > > > > > > > > > > > > > >      *                  InvalidTopicException
> > > > > > > > > > > > > > >      *
> > > OffsetMetadataTooLargeException
> > > > > > > > > > > > > > >      *
> > > RecordBatchTooLargeException
> > > > > > > > > > > > > > >      *                  RecordTooLargeException
> > > > > > > > > > > > > > >      *                  UnknownServerException
> > > > > > > > > > > > > > >      *
> > UnknownProducerIdException
> > > > > > > > > > > > > > >      *
> > > InvalidProducerEpochException
> > > > > > > > > > > > > > >      *
> > > > > > > > > > > > > > >      *                  Retriable exceptions
> > > > > (transient,
> > > > > > > may
> > > > > > > > be
> > > > > > > > > > > > covered
> > > > > > > > > > > > > > by
> > > > > > > > > > > > > > > increasing #.retries):
> > > > > > > > > > > > > > >      *
> > > > > > > > > > > > > > >      *                  CorruptRecordException
> > > > > > > > > > > > > > >      *
> InvalidMetadataException
> > > > > > > > > > > > > > >      *
> > > > > > > NotEnoughReplicasAfterAppendException
> > > > > > > > > > > > > > >      *
> > NotEnoughReplicasException
> > > > > > > > > > > > > > >      *
> OffsetOutOfRangeException
> > > > > > > > > > > > > > >      *                  TimeoutException
> > > > > > > > > > > > > > >      *
> > > > > UnknownTopicOrPartitionException
> > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If we wrap all the retriable exceptions
> > documented
> > > here
> > > > > > as
> > > > > > > > > > > > > > > `KafkaException`, wouldn't that break any error
> > > > > handling
> > > > > > > that
> > > > > > > > > > users
> > > > > > > > > > > > > might
> > > > > > > > > > > > > > > already have? it's gonna introduce a
> > compatibility
> > > > > issue.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The original intention was to simplify `send`
> > > callback
> > > > > > > error
> > > > > > > > > > > handling
> > > > > > > > > > > > > by
> > > > > > > > > > > > > > doing exception wrapping, as on Streams level
> > > > > > > > > > > > > > we have to prepare an exhausting list of
> exceptions
> > > to
> > > > > > catch
> > > > > > > as
> > > > > > > > > > > fatal,
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > the same lengthy list to catch as
> > > > > > > > > > > > > > non-fatal. It would be much easier if we got
> > `hints`
> > > from
> > > > > > the
> > > > > > > > > > > callback.
> > > > > > > > > > > > > > However,
> > > > > > > > > > > > > > I agree there is a compatibility concern, what
> > about
> > > > > > > > deprecating
> > > > > > > > > > the
> > > > > > > > > > > > > > existing:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > void onCompletion(RecordMetadata metadata,
> > Exception
> > > > > > > exception)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > and replace it with:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > default void onCompletion(RecordMetadata
> metadata,
> > > > > > Exception
> > > > > > > > > > > exception,
> > > > > > > > > > > > > > boolean isFatal) {
> > > > > > > > > > > > > >   this.onCompletion(metadata, exception);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > to make sure new users get the benefit of
> > > understanding
> > > > > the
> > > > > > > > > > fatality
> > > > > > > > > > > > > based
> > > > > > > > > > > > > > on the info presented by the producer?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Jason
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Sat, Jan 23, 2021 at 3:31 AM Hiringuru <
> > > > > > > > i...@hiringuru.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Why  we are receiving all emails kindly
> remove
> > us
> > > > > from
> > > > > > > > > > > > > > > > dev@kafka.apache.org we don't want to
> receive
> > > emails
> > > > > > > > > anymore.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > > > > > On 01/23/2021 4:14 AM Guozhang Wang <
> > > > > > > wangg...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks Boyang, yes I think I was confused
> > > about the
> > > > > > > > > different
> > > > > > > > > > > > > > handling
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > two abortTxn calls, and now I get it was
> not
> > > > > > > > intentional. I
> > > > > > > > > > > > think I
> > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > have more concerns.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Jan 22, 2021 at 1:12 PM Boyang
> Chen <
> > > > > > > > > > > > > > > reluctanthero...@gmail.com>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for the clarification Guozhang, I
> > got
> > > your
> > > > > > > point
> > > > > > > > > > that
> > > > > > > > > > > we
> > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > have a consistent handling of fatal
> > > exceptions
> > > > > > being
> > > > > > > > > thrown
> > > > > > > > > > > > from
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > abortTxn. I modified the current template
> > to
> > > move
> > > > > > the
> > > > > > > > > fatal
> > > > > > > > > > > > > > exception
> > > > > > > > > > > > > > > > > > try-catch outside of the processing loop
> to
> > > make
> > > > > > sure
> > > > > > > > we
> > > > > > > > > > > could
> > > > > > > > > > > > > get
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > chance
> > > > > > > > > > > > > > > > > > to close consumer/producer modules. Let
> me
> > > know
> > > > > > what
> > > > > > > > you
> > > > > > > > > > > think.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Boyang
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Fri, Jan 22, 2021 at 11:05 AM Boyang
> > Chen
> > > <
> > > > > > > > > > > > > > > > reluctanthero...@gmail.com>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > My understanding is that
> abortTransaction
> > > would
> > > > > > > only
> > > > > > > > > > throw
> > > > > > > > > > > > when
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > producer is in fatal state. For
> > > CommitFailed,
> > > > > the
> > > > > > > > > > producer
> > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > still
> > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > in the abortable error state, so that
> > > > > > > > abortTransaction
> > > > > > > > > > call
> > > > > > > > > > > > > would
> > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > > throw.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Fri, Jan 22, 2021 at 11:02 AM
> Guozhang
> > > Wang
> > > > > <
> > > > > > > > > > > > > > wangg...@gmail.com
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> Or are you going to maintain some
> > internal
> > > > > state
> > > > > > > > such
> > > > > > > > > > > that,
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> `abortTransaction` in the catch block
> > > would
> > > > > > never
> > > > > > > > > throw
> > > > > > > > > > > > again?
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> On Fri, Jan 22, 2021 at 11:01 AM
> > Guozhang
> > > > > Wang <
> > > > > > > > > > > > > > > wangg...@gmail.com>
> > > > > > > > > > > > > > > > > > >> wrote:
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> > Hi Boyang/Jason,
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > I've also thought about this (i.e.
> > using
> > > > > > > > > CommitFailed
> > > > > > > > > > > for
> > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > >> non-fatal),
> > > > > > > > > > > > > > > > > > >> > but what I'm pondering is that, in
> the
> > > catch
> > > > > > > > > > > > (CommitFailed)
> > > > > > > > > > > > > > > block,
> > > > > > > > > > > > > > > > > > what
> > > > > > > > > > > > > > > > > > >> > would happen if the
> > > > > > > `producer.abortTransaction();`
> > > > > > > > > > > throws
> > > > > > > > > > > > > > again?
> > > > > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > >> > that be captured as a fatal and
> cause
> > > the
> > > > > > client
> > > > > > > > to
> > > > > > > > > > > close
> > > > > > > > > > > > > > again.
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > If yes, then naively the pattern
> would
> > > be:
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > ...
> > > > > > > > > > > > > > > > > > >> > catch (CommitFailedException e) {
> > > > > > > > > > > > > > > > > > >> >         // Transaction commit failed
> > > with
> > > > > > > > abortable
> > > > > > > > > > > error,
> > > > > > > > > > > > > > user
> > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > >> reset
> > > > > > > > > > > > > > > > > > >> >         // the application state and
> > > resume
> > > > > > > with a
> > > > > > > > > new
> > > > > > > > > > > > > > > > transaction.
> > > > > > > > > > > > > > > > > > The
> > > > > > > > > > > > > > > > > > >> > root
> > > > > > > > > > > > > > > > > > >> >         // cause was wrapped in the
> > > thrown
> > > > > > > > > exception.
> > > > > > > > > > > > > > > > > > >> >
> > > > > >  resetToLastCommittedPositions(consumer);
> > > > > > > > > > > > > > > > > > >> >         try {
> > > > > > > > > > > > > > > > > > >> >
> >  producer.abortTransaction();
> > > > > > > > > > > > > > > > > > >> >         } catch (KafkaException e) {
> > > > > > > > > > > > > > > > > > >> >             producer.close();
> > > > > > > > > > > > > > > > > > >> >             consumer.close();
> > > > > > > > > > > > > > > > > > >> >             throw e;
> > > > > > > > > > > > > > > > > > >> >         }
> > > > > > > > > > > > > > > > > > >> >     } catch (KafkaException e) {
> > > > > > > > > > > > > > > > > > >> >         producer.close();
> > > > > > > > > > > > > > > > > > >> >         consumer.close();
> > > > > > > > > > > > > > > > > > >> >         throw e;
> > > > > > > > > > > > > > > > > > >> >     }
> > > > > > > > > > > > > > > > > > >> > ...
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > Guozhang
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > On Fri, Jan 22, 2021 at 10:47 AM
> > Boyang
> > > > > Chen <
> > > > > > > > > > > > > > > > > > >> reluctanthero...@gmail.com>
> > > > > > > > > > > > > > > > > > >> > wrote:
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> Hey Guozhang,
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> Jason and I were discussing the new
> > API
> > > > > > offline
> > > > > > > > and
> > > > > > > > > > > > decided
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > take
> > > > > > > > > > > > > > > > > > >> >> another
> > > > > > > > > > > > > > > > > > >> >> approach. Firstly, the reason not
> to
> > > > > invent a
> > > > > > > new
> > > > > > > > > API
> > > > > > > > > > > > with
> > > > > > > > > > > > > > > > returned
> > > > > > > > > > > > > > > > > > >> >> boolean
> > > > > > > > > > > > > > > > > > >> >> flag is for compatibility
> > > consideration,
> > > > > > since
> > > > > > > > old
> > > > > > > > > > EOS
> > > > > > > > > > > > code
> > > > > > > > > > > > > > > > would not
> > > > > > > > > > > > > > > > > > >> know
> > > > > > > > > > > > > > > > > > >> >> that a given transaction commit was
> > > failed
> > > > > > > > > internally
> > > > > > > > > > > as
> > > > > > > > > > > > > they
> > > > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > > > >> listen
> > > > > > > > > > > > > > > > > > >> >> to the output flag. Our proposed
> > > > > alternative
> > > > > > > > > solution
> > > > > > > > > > > is
> > > > > > > > > > > > to
> > > > > > > > > > > > > > let
> > > > > > > > > > > > > > > > > > >> >> *commitTransaction
> > > > > > > > > > > > > > > > > > >> >> throw CommitFailedException
> whenever
> > > the
> > > > > > commit
> > > > > > > > > > failed
> > > > > > > > > > > > with
> > > > > > > > > > > > > > > > non-fatal
> > > > > > > > > > > > > > > > > > >> >> exception*, so that on the caller
> > side
> > > the
> > > > > > > > handling
> > > > > > > > > > > logic
> > > > > > > > > > > > > > > > becomes:
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> try {
> > > > > > > > > > > > > > > > > > >> >>         if (shouldCommit) {
> > > > > > > > > > > > > > > > > > >> >>
> > >  producer.commitTransaction();
> > > > > > > > > > > > > > > > > > >> >>         } else {
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > >  resetToLastCommittedPositions(consumer);
> > > > > > > > > > > > > > > > > > >> >>
> > >  producer.abortTransaction();
> > > > > > > > > > > > > > > > > > >> >>         }
> > > > > > > > > > > > > > > > > > >> >>     } catch (CommitFailedException
> > e) {
> > > > > > > > > > > > > > > > > > >> >>         // Transaction commit
> failed
> > > with
> > > > > > > > abortable
> > > > > > > > > > > > error,
> > > > > > > > > > > > > > user
> > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > >> >> reset
> > > > > > > > > > > > > > > > > > >> >>         // the application state
> and
> > > resume
> > > > > > > with
> > > > > > > > a
> > > > > > > > > > new
> > > > > > > > > > > > > > > > transaction.
> > > > > > > > > > > > > > > > > > The
> > > > > > > > > > > > > > > > > > >> >> root
> > > > > > > > > > > > > > > > > > >> >>         // cause was wrapped in the
> > > thrown
> > > > > > > > > exception.
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > >  resetToLastCommittedPositions(consumer);
> > > > > > > > > > > > > > > > > > >> >>
>  producer.abortTransaction();
> > > > > > > > > > > > > > > > > > >> >>     } catch (KafkaException e) {
> > > > > > > > > > > > > > > > > > >> >>         producer.close();
> > > > > > > > > > > > > > > > > > >> >>         consumer.close();
> > > > > > > > > > > > > > > > > > >> >>         throw e;
> > > > > > > > > > > > > > > > > > >> >>     }
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> This approach looks cleaner as all
> > > > > exception
> > > > > > > > types
> > > > > > > > > > > other
> > > > > > > > > > > > > than
> > > > > > > > > > > > > > > > > > >> CommitFailed
> > > > > > > > > > > > > > > > > > >> >> will doom to be fatal, which is
> very
> > > easy
> > > > > to
> > > > > > > > adopt
> > > > > > > > > > for
> > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > In
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> >> meantime, we still maintain the
> > > commitTxn
> > > > > > > > behavior
> > > > > > > > > to
> > > > > > > > > > > > throw
> > > > > > > > > > > > > > > > instead
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > >> >> silently failing.
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> In addition, we decided to drop the
> > > > > > > > recommendation
> > > > > > > > > to
> > > > > > > > > > > > > handle
> > > > > > > > > > > > > > > > > > >> >> TimeoutException and leave it to
> the
> > > users
> > > > > to
> > > > > > > > make
> > > > > > > > > > the
> > > > > > > > > > > > > call.
> > > > > > > > > > > > > > > The
> > > > > > > > > > > > > > > > > > >> downside
> > > > > > > > > > > > > > > > > > >> >> for blindly calling abortTxn upon
> > > timeout
> > > > > is
> > > > > > > that
> > > > > > > > > we
> > > > > > > > > > > > could
> > > > > > > > > > > > > > > > result in
> > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > >> >> illegal state when the commit was
> > > already
> > > > > > > > > successful
> > > > > > > > > > on
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > >> >> side. Without a good guarantee on
> the
> > > > > > outcome,
> > > > > > > > > > > > > > overcomplicating
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> >> template should not be encouraged
> > IMHO.
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> Let me know your thoughts on the
> new
> > > > > approach
> > > > > > > > here,
> > > > > > > > > > > thank
> > > > > > > > > > > > > > you!
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> Best,
> > > > > > > > > > > > > > > > > > >> >> Boyang
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> On Tue, Jan 19, 2021 at 11:11 AM
> > > Guozhang
> > > > > > Wang
> > > > > > > <
> > > > > > > > > > > > > > > > wangg...@gmail.com>
> > > > > > > > > > > > > > > > > > >> >> wrote:
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > Thanks for your clarification on
> > > 2)/3),
> > > > > > that
> > > > > > > > > makes
> > > > > > > > > > > > sense.
> > > > > > > > > > > > > > > > > > >> >> >
> > > > > > > > > > > > > > > > > > >> >> > On Tue, Jan 19, 2021 at 10:16 AM
> > > Boyang
> > > > > > Chen
> > > > > > > <
> > > > > > > > > > > > > > > > > > >> >> reluctanthero...@gmail.com>
> > > > > > > > > > > > > > > > > > >> >> > wrote:
> > > > > > > > > > > > > > > > > > >> >> >
> > > > > > > > > > > > > > > > > > >> >> > > Thanks for the input Guozhang,
> > > replied
> > > > > > > > inline.
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > > On Mon, Jan 18, 2021 at 8:57 PM
> > > > > Guozhang
> > > > > > > > Wang <
> > > > > > > > > > > > > > > > > > wangg...@gmail.com>
> > > > > > > > > > > > > > > > > > >> >> > wrote:
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > > > Hello Boyang,
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > Thanks for the updated KIP. I
> > > read it
> > > > > > > again
> > > > > > > > > and
> > > > > > > > > > > > have
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> following
> > > > > > > > > > > > > > > > > > >> >> > > > thoughts:
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > 0. I'm a bit concerned that
> if
> > > > > > commitTxn
> > > > > > > > does
> > > > > > > > > > not
> > > > > > > > > > > > > throw
> > > > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > >> >> non-fatal
> > > > > > > > > > > > > > > > > > >> >> > > > exception, and instead we
> rely
> > > on the
> > > > > > > > > > subsequent
> > > > > > > > > > > > > > beginTxn
> > > > > > > > > > > > > > > > call
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > >> >> > throw,
> > > > > > > > > > > > > > > > > > >> >> > > it
> > > > > > > > > > > > > > > > > > >> >> > > > may violate some callers
> with a
> > > > > pattern
> > > > > > > > that
> > > > > > > > > > > > relying
> > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > >> commitTxn to
> > > > > > > > > > > > > > > > > > >> >> > > > succeed to make some
> > non-rollback
> > > > > > > > operations.
> > > > > > > > > > For
> > > > > > > > > > > > > > > example:
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > beginTxn()
> > > > > > > > > > > > > > > > > > >> >> > > > // do some read-write on my
> > > local DB
> > > > > > > > > > > > > > > > > > >> >> > > > commitTxn()
> > > > > > > > > > > > > > > > > > >> >> > > > // if commitTxn succeeds,
> then
> > > commit
> > > > > > the
> > > > > > > > DB
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > -------------
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > The issue is that, committing
> > DB
> > > is a
> > > > > > > > > > > non-rollback
> > > > > > > > > > > > > > > > operation,
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > >> >> users
> > > > > > > > > > > > > > > > > > >> >> > > may
> > > > > > > > > > > > > > > > > > >> >> > > > just rely on commitTxn to
> > return
> > > > > > without
> > > > > > > > > error
> > > > > > > > > > to
> > > > > > > > > > > > > make
> > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > >> >> > non-rollback
> > > > > > > > > > > > > > > > > > >> >> > > > call. Of course we can just
> > claim
> > > > > this
> > > > > > > > > pattern
> > > > > > > > > > is
> > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > > legitimate
> > > > > > > > > > > > > > > > > > >> >> and is
> > > > > > > > > > > > > > > > > > >> >> > > not
> > > > > > > > > > > > > > > > > > >> >> > > > the right way of doing
> things,
> > > but
> > > > > many
> > > > > > > > users
> > > > > > > > > > may
> > > > > > > > > > > > > > > naturally
> > > > > > > > > > > > > > > > > > adopt
> > > > > > > > > > > > > > > > > > >> >> this
> > > > > > > > > > > > > > > > > > >> >> > > > pattern.
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > So maybe we should still let
> > > > > commitTxn
> > > > > > > also
> > > > > > > > > > throw
> > > > > > > > > > > > > > > non-fatal
> > > > > > > > > > > > > > > > > > >> >> exceptions,
> > > > > > > > > > > > > > > > > > >> >> > > in
> > > > > > > > > > > > > > > > > > >> >> > > > which case we would then call
> > > > > abortTxn
> > > > > > > > again.
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > But if we do this, the
> pattern
> > > > > becomes:
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > try {
> > > > > > > > > > > > > > > > > > >> >> > > >    beginTxn()
> > > > > > > > > > > > > > > > > > >> >> > > >    // do something
> > > > > > > > > > > > > > > > > > >> >> > > > } catch (Exception) {
> > > > > > > > > > > > > > > > > > >> >> > > >    shouldCommit = false;
> > > > > > > > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > if (shouldCommit) {
> > > > > > > > > > > > > > > > > > >> >> > > >     try {
> > > > > > > > > > > > > > > > > > >> >> > > >         commitTxn()
> > > > > > > > > > > > > > > > > > >> >> > > >     } catch (...) {        //
> > > > > enumerate
> > > > > > > all
> > > > > > > > > > fatal
> > > > > > > > > > > > > > > > exceptions
> > > > > > > > > > > > > > > > > > >> >> > > >         shutdown()
> > > > > > > > > > > > > > > > > > >> >> > > >     } catch (KafkaException)
> {
> > > > > > > > > > > > > > > > > > >> >> > > >         // non-fatal
> > > > > > > > > > > > > > > > > > >> >> > > >         shouldCommit = false;
> > > > > > > > > > > > > > > > > > >> >> > > >     }
> > > > > > > > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > if (!shouldCommit &&
> > running()) {
> > > > > > > > > > > > > > > > > > >> >> > > > try {
> > > > > > > > > > > > > > > > > > >> >> > > >         abortTxn()
> > > > > > > > > > > > > > > > > > >> >> > > >     } catch (KafkaException)
> {
> > > > > > > > > > > > > > > > > > >> >> > > >         // only throw fatal
> > > > > > > > > > > > > > > > > > >> >> > > >         shutdown()
> > > > > > > > > > > > > > > > > > >> >> > > >     }
> > > > > > > > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > ---------------------
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > Which is much more
> complicated.
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > Thank makes me think, the
> > > alternative
> > > > > > we
> > > > > > > > have
> > > > > > > > > > > > > discussed
> > > > > > > > > > > > > > > > offline
> > > > > > > > > > > > > > > > > > >> may
> > > > > > > > > > > > > > > > > > >> >> be
> > > > > > > > > > > > > > > > > > >> >> > > > better: let commitTxn() to
> > > return a
> > > > > > > boolean
> > > > > > > > > > flag.
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > * If it returns true, it
> means
> > > the
> > > > > > commit
> > > > > > > > > > > > succeeded.
> > > > > > > > > > > > > > > Users
> > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > >> >> > > comfortably
> > > > > > > > > > > > > > > > > > >> >> > > > continue and do any external
> > > > > > non-rollback
> > > > > > > > > > > > operations
> > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > they
> > > > > > > > > > > > > > > > > > >> like.
> > > > > > > > > > > > > > > > > > >> >> > > > * If it returns false, it
> means
> > > the
> > > > > > > commit
> > > > > > > > > > failed
> > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > non-fatal
> > > > > > > > > > > > > > > > > > >> >> error
> > > > > > > > > > > > > > > > > > >> >> > > *AND
> > > > > > > > > > > > > > > > > > >> >> > > > the txn has been aborted*.
> > Users
> > > do
> > > > > not
> > > > > > > > need
> > > > > > > > > to
> > > > > > > > > > > > call
> > > > > > > > > > > > > > > > abortTxn
> > > > > > > > > > > > > > > > > > >> again.
> > > > > > > > > > > > > > > > > > >> >> > > > * If it throws, then it means
> > > fatal
> > > > > > > errors.
> > > > > > > > > > Users
> > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > shut
> > > > > > > > > > > > > > > > > > >> down
> > > > > > > > > > > > > > > > > > >> >> the
> > > > > > > > > > > > > > > > > > >> >> > > > client.
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > In this case, the pattern
> > > becomes:
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > try {
> > > > > > > > > > > > > > > > > > >> >> > > >    beginTxn()
> > > > > > > > > > > > > > > > > > >> >> > > >    // do something
> > > > > > > > > > > > > > > > > > >> >> > > > } catch (Exception) {
> > > > > > > > > > > > > > > > > > >> >> > > >    shouldCommit = false;
> > > > > > > > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > try {
> > > > > > > > > > > > > > > > > > >> >> > > >     if (shouldCommit) {
> > > > > > > > > > > > > > > > > > >> >> > > >         commitSucceeded =
> > > commitTxn()
> > > > > > > > > > > > > > > > > > >> >> > > >     } else {
> > > > > > > > > > > > > > > > > > >> >> > > >         // reset offsets,
> > > rollback
> > > > > > > > > operations,
> > > > > > > > > > > etc
> > > > > > > > > > > > > > > > > > >> >> > > >         abortTxn()
> > > > > > > > > > > > > > > > > > >> >> > > >     }
> > > > > > > > > > > > > > > > > > >> >> > > > } catch (KafkaException) {
> > > > > > > > > > > > > > > > > > >> >> > > >     // only throw fatal
> > > > > > > > > > > > > > > > > > >> >> > > >     shutdown()
> > > > > > > > > > > > > > > > > > >> >> > > > }
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > if (commitSucceeded)
> > > > > > > > > > > > > > > > > > >> >> > > >    // do other
> non-rollbackable
> > > > > things
> > > > > > > > > > > > > > > > > > >> >> > > > else
> > > > > > > > > > > > > > > > > > >> >> > > >    // reset offsets, rollback
> > > > > > operations,
> > > > > > > > etc
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > -------------------------
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > Of course, if we want to have
> > > better
> > > > > > > > > visibility
> > > > > > > > > > > > into
> > > > > > > > > > > > > > what
> > > > > > > > > > > > > > > > > > caused
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> >> > > commit
> > > > > > > > > > > > > > > > > > >> >> > > > to fail and txn to abort. We
> > can
> > > let
> > > > > > the
> > > > > > > > > return
> > > > > > > > > > > > type
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > enum
> > > > > > > > > > > > > > > > > > >> >> instead
> > > > > > > > > > > > > > > > > > >> >> > > of
> > > > > > > > > > > > > > > > > > >> >> > > > a flag. But my main idea is
> to
> > > still
> > > > > > let
> > > > > > > > the
> > > > > > > > > > > > > commitTxn
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> final
> > > > > > > > > > > > > > > > > > >> >> > point
> > > > > > > > > > > > > > > > > > >> >> > > > users can tell whether this
> txn
> > > > > > succeeded
> > > > > > > > or
> > > > > > > > > > not,
> > > > > > > > > > > > > > instead
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > >> >> relying on
> > > > > > > > > > > > > > > > > > >> >> > > the
> > > > > > > > > > > > > > > > > > >> >> > > > next beginTxn() call.
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > I agree that adding a boolean
> > > flag is
> > > > > > > > indeed
> > > > > > > > > > > useful
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > case.
> > > > > > > > > > > > > > > > > > >> >> Will
> > > > > > > > > > > > > > > > > > >> >> > > update the KIP.
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > > 1. Re: "while maintaining the
> > > behavior
> > > > > to
> > > > > > > > throw
> > > > > > > > > > > fatal
> > > > > > > > > > > > > > > > exception
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > >> >> raw"
> > > > > > > > > > > > > > > > > > >> >> > not
> > > > > > > > > > > > > > > > > > >> >> > > > sure what do you mean by
> > "throw"
> > > > > here.
> > > > > > > Are
> > > > > > > > > you
> > > > > > > > > > > > > > proposing
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> >> callback
> > > > > > > > > > > > > > > > > > >> >> > > would
> > > > > > > > > > > > > > > > > > >> >> > > > *pass* (not throw) in any
> fatal
> > > > > > > exceptions
> > > > > > > > > when
> > > > > > > > > > > > > > triggered
> > > > > > > > > > > > > > > > > > without
> > > > > > > > > > > > > > > > > > >> >> > > wrapping?
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > Yes, I want to say *pass*,
> the
> > > > > benefit
> > > > > > is
> > > > > > > > to
> > > > > > > > > > make
> > > > > > > > > > > > the
> > > > > > > > > > > > > > end
> > > > > > > > > > > > > > > > > > user's
> > > > > > > > > > > > > > > > > > >> >> > > expectation consistent
> > > > > > > > > > > > > > > > > > >> >> > > regarding exception handling.
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > > > 2. I'm not sure I understand
> > the
> > > > > claim
> > > > > > > > > > regarding
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > callback
> > > > > > > > > > > > > > > > > > >> that
> > > > > > > > > > > > > > > > > > >> >> "In
> > > > > > > > > > > > > > > > > > >> >> > > EOS
> > > > > > > > > > > > > > > > > > >> >> > > > setup, it is not required to
> > > handle
> > > > > the
> > > > > > > > > > > exception".
> > > > > > > > > > > > > Are
> > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > >> >> proposing
> > > > > > > > > > > > > > > > > > >> >> > > that,
> > > > > > > > > > > > > > > > > > >> >> > > > e.g. in Streams, we do not
> try
> > to
> > > > > > handle
> > > > > > > > any
> > > > > > > > > > > > > exceptions
> > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > EOS
> > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > >> >> > enabled
> > > > > > > > > > > > > > > > > > >> >> > > > in the callback anymore, but
> > > just let
> > > > > > > > > > commitTxn()
> > > > > > > > > > > > > > itself
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > fail
> > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > >> >> be
> > > > > > > > > > > > > > > > > > >> >> > > > notified about the problem?
> > > > > Personally
> > > > > > I
> > > > > > > > > think
> > > > > > > > > > in
> > > > > > > > > > > > > > Streams
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > >> should
> > > > > > > > > > > > > > > > > > >> >> > just
> > > > > > > > > > > > > > > > > > >> >> > > > make the handling logic of
> the
> > > > > callback
> > > > > > > to
> > > > > > > > be
> > > > > > > > > > > > > > consistent
> > > > > > > > > > > > > > > > > > >> regardless
> > > > > > > > > > > > > > > > > > >> >> of
> > > > > > > > > > > > > > > > > > >> >> > > the
> > > > > > > > > > > > > > > > > > >> >> > > > EOS settings (today we have
> > > different
> > > > > > > logic
> > > > > > > > > > > > depending
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > >> logic,
> > > > > > > > > > > > > > > > > > >> >> > > which
> > > > > > > > > > > > > > > > > > >> >> > > > I think could be unified as
> > > well).
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > My idea originates from the
> > > claim on
> > > > > > send
> > > > > > > > > API:
> > > > > > > > > > > > > > > > > > >> >> > > "When used as part of a
> > > transaction, it
> > > > > > is
> > > > > > > > not
> > > > > > > > > > > > > necessary
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > define
> > > > > > > > > > > > > > > > > > >> a
> > > > > > > > > > > > > > > > > > >> >> > > callback or check the result of
> > the
> > > > > > future
> > > > > > > > in
> > > > > > > > > > > order
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > detect
> > > > > > > > > > > > > > > > > > >> errors
> > > > > > > > > > > > > > > > > > >> >> > from
> > > > > > > > > > > > > > > > > > >> >> > > <code>send</code>. "
> > > > > > > > > > > > > > > > > > >> >> > > My understanding is that for
> EOS,
> > > the
> > > > > > > > exception
> > > > > > > > > > > will
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > > detected
> > > > > > > > > > > > > > > > > > by
> > > > > > > > > > > > > > > > > > >> >> > calling
> > > > > > > > > > > > > > > > > > >> >> > > transactional APIs either way,
> so
> > > it
> > > > > is a
> > > > > > > > > > duplicate
> > > > > > > > > > > > > > > handling
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > >> track
> > > > > > > > > > > > > > > > > > >> >> > > all the sendExceptions in
> > > > > > RecordCollector.
> > > > > > > > > > > However, I
> > > > > > > > > > > > > > > looked
> > > > > > > > > > > > > > > > up
> > > > > > > > > > > > > > > > > > >> >> > > sendException is being used
> today
> > > as
> > > > > > > follow:
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > > 1. Pass to
> > > "ProductionExceptionHandler"
> > > > > > for
> > > > > > > > any
> > > > > > > > > > > > default
> > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > > >> customized
> > > > > > > > > > > > > > > > > > >> >> > > exception handler to handle
> > > > > > > > > > > > > > > > > > >> >> > > 2. Stop collecting offset info
> or
> > > new
> > > > > > > > > exceptions
> > > > > > > > > > > > > > > > > > >> >> > > 3. Check and rethrow exceptions
> > in
> > > > > > close(),
> > > > > > > > > > flush()
> > > > > > > > > > > > or
> > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > send()
> > > > > > > > > > > > > > > > > > >> >> calls
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > > To my understanding, we should
> > > still
> > > > > > honor
> > > > > > > > the
> > > > > > > > > > > > > commitment
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > #1
> > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > >> >> any
> > > > > > > > > > > > > > > > > > >> >> > > user customized implementation.
> > > The #2
> > > > > > does
> > > > > > > > not
> > > > > > > > > > > > really
> > > > > > > > > > > > > > > > affect our
> > > > > > > > > > > > > > > > > > >> >> > decision
> > > > > > > > > > > > > > > > > > >> >> > > upon EOS. The #3 here is still
> > > valuable
> > > > > > as
> > > > > > > it
> > > > > > > > > > could
> > > > > > > > > > > > > help
> > > > > > > > > > > > > > us
> > > > > > > > > > > > > > > > fail
> > > > > > > > > > > > > > > > > > >> fast
> > > > > > > > > > > > > > > > > > >> >> in
> > > > > > > > > > > > > > > > > > >> >> > > new send() instead of waiting
> to
> > > later
> > > > > > > stage
> > > > > > > > of
> > > > > > > > > > > > > > processing.
> > > > > > > > > > > > > > > > In
> > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > >> >> > sense,
> > > > > > > > > > > > > > > > > > >> >> > > I agree we should continue to
> > > record
> > > > > send
> > > > > > > > > > > exceptions
> > > > > > > > > > > > > even
> > > > > > > > > > > > > > > > under
> > > > > > > > > > > > > > > > > > EOS
> > > > > > > > > > > > > > > > > > >> >> case
> > > > > > > > > > > > > > > > > > >> >> > to
> > > > > > > > > > > > > > > > > > >> >> > > ensure the strength of stream
> > side
> > > > > > Producer
> > > > > > > > > > logic.
> > > > > > > > > > > On
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > safer
> > > > > > > > > > > > > > > > > > >> side,
> > > > > > > > > > > > > > > > > > >> >> we
> > > > > > > > > > > > > > > > > > >> >> > no
> > > > > > > > > > > > > > > > > > >> >> > > longer need to wrap certain
> fatal
> > > > > > > exceptions
> > > > > > > > > like
> > > > > > > > > > > > > > > > ProducerFenced
> > > > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > >> >> > > TaskMigrated, since they should
> > not
> > > > > crash
> > > > > > > the
> > > > > > > > > > > stream
> > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > >> thrown
> > > > > > > > > > > > > > > > > > >> >> in
> > > > > > > > > > > > > > > > > > >> >> > > raw format, once we adopt the
> new
> > > > > > > processing
> > > > > > > > > > model
> > > > > > > > > > > in
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > send
> > > > > > > > > > > > > > > > > > >> phase.
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > Guozhang
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > On Thu, Dec 17, 2020 at 8:42
> PM
> > > > > Boyang
> > > > > > > > Chen <
> > > > > > > > > > > > > > > > > > >> >> > reluctanthero...@gmail.com>
> > > > > > > > > > > > > > > > > > >> >> > > > wrote:
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > Thanks for everyone's
> > feedback
> > > so
> > > > > > far.
> > > > > > > I
> > > > > > > > > have
> > > > > > > > > > > > > > polished
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > KIP
> > > > > > > > > > > > > > > > > > >> >> after
> > > > > > > > > > > > > > > > > > >> >> > > > > offline discussion with
> some
> > > folks
> > > > > > > > working
> > > > > > > > > on
> > > > > > > > > > > EOS
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > make the
> > > > > > > > > > > > > > > > > > >> >> > exception
> > > > > > > > > > > > > > > > > > >> >> > > > > handling more lightweight.
> > The
> > > > > > > essential
> > > > > > > > > > change
> > > > > > > > > > > > is
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > >> not
> > > > > > > > > > > > > > > > > > >> >> > > > > inventing a new
> intermediate
> > > > > > exception
> > > > > > > > > type,
> > > > > > > > > > > but
> > > > > > > > > > > > > > > instead
> > > > > > > > > > > > > > > > > > >> >> separating a
> > > > > > > > > > > > > > > > > > >> >> > > > full
> > > > > > > > > > > > > > > > > > >> >> > > > > transaction into two
> phases:
> > > > > > > > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > 1. The data transmission
> > phase
> > > > > > > > > > > > > > > > > > >> >> > > > > 2. The commit phase
> > > > > > > > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > This way for any exception
> > > thrown
> > > > > > from
> > > > > > > > > phase
> > > > > > > > > > 1,
> > > > > > > > > > > > > will
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > >> >> indicator
> > > > > > > > > > > > > > > > > > >> >> > in
> > > > > > > > > > > > > > > > > > >> >> > > > > phase 2 whether we should
> do
> > > commit
> > > > > > or
> > > > > > > > > abort,
> > > > > > > > > > > and
> > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > now on
> > > > > > > > > > > > > > > > > > >> >> > > > > `commitTransaction` should
> > only
> > > > > throw
> > > > > > > > fatal
> > > > > > > > > > > > > > exceptions,
> > > > > > > > > > > > > > > > > > >> similar to
> > > > > > > > > > > > > > > > > > >> >> > > > > `abortTransaction`, so that
> > any
> > > > > > > > > > KafkaException
> > > > > > > > > > > > > caught
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> >> commit
> > > > > > > > > > > > > > > > > > >> >> > > phase
> > > > > > > > > > > > > > > > > > >> >> > > > > will be definitely fatal to
> > > crash
> > > > > the
> > > > > > > > app.
> > > > > > > > > > For
> > > > > > > > > > > > more
> > > > > > > > > > > > > > > > advanced
> > > > > > > > > > > > > > > > > > >> users
> > > > > > > > > > > > > > > > > > >> >> > such
> > > > > > > > > > > > > > > > > > >> >> > > > as
> > > > > > > > > > > > > > > > > > >> >> > > > > Streams, we have the
> ability
> > to
> > > > > > further
> > > > > > > > > wrap
> > > > > > > > > > > > > selected
> > > > > > > > > > > > > > > > types
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > >> >> fatal
> > > > > > > > > > > > > > > > > > >> >> > > > > exceptions to trigger task
> > > > > migration
> > > > > > if
> > > > > > > > > > > > necessary.
> > > > > > > > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > More details in the KIP,
> feel
> > > free
> > > > > to
> > > > > > > > take
> > > > > > > > > > > > another
> > > > > > > > > > > > > > > look,
> > > > > > > > > > > > > > > > > > >> thanks!
> > > > > > > > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > On Thu, Dec 17, 2020 at
> 8:36
> > PM
> > > > > > Boyang
> > > > > > > > > Chen <
> > > > > > > > > > > > > > > > > > >> >> > > reluctanthero...@gmail.com>
> > > > > > > > > > > > > > > > > > >> >> > > > > wrote:
> > > > > > > > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > > Thanks Bruno for the
> > > feedback.
> > > > > > > > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > > On Mon, Dec 7, 2020 at
> 5:26
> > > AM
> > > > > > Bruno
> > > > > > > > > > Cadonna
> > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > >> >> br...@confluent.io>
> > > > > > > > > > > > > > > > > > >> >> > > > wrote:
> > > > > > > > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> Thanks Boyang for the
> KIP!
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> Like Matthias, I do also
> > not
> > > > > know
> > > > > > > the
> > > > > > > > > > > producer
> > > > > > > > > > > > > > > > internal
> > > > > > > > > > > > > > > > > > well
> > > > > > > > > > > > > > > > > > >> >> > enough
> > > > > > > > > > > > > > > > > > >> >> > > to
> > > > > > > > > > > > > > > > > > >> >> > > > > >> comment on the
> > > categorization.
> > > > > > > > However,
> > > > > > > > > I
> > > > > > > > > > > > think
> > > > > > > > > > > > > > > > having a
> > > > > > > > > > > > > > > > > > >> super
> > > > > > > > > > > > > > > > > > >> >> > > > exception
> > > > > > > > > > > > > > > > > > >> >> > > > > >> (e.g.
> RetriableException)
> > > that
> > > > > > > > encodes
> > > > > > > > > if
> > > > > > > > > > > an
> > > > > > > > > > > > > > > > exception is
> > > > > > > > > > > > > > > > > > >> >> fatal
> > > > > > > > > > > > > > > > > > >> >> > or
> > > > > > > > > > > > > > > > > > >> >> > > > not
> > > > > > > > > > > > > > > > > > >> >> > > > > >> is cleaner, better
> > > > > understandable
> > > > > > > and
> > > > > > > > > less
> > > > > > > > > > > > > > > > error-prone,
> > > > > > > > > > > > > > > > > > >> because
> > > > > > > > > > > > > > > > > > >> >> > > > ideally
> > > > > > > > > > > > > > > > > > >> >> > > > > >> when you add a new
> > non-fatal
> > > > > > > exception
> > > > > > > > > in
> > > > > > > > > > > > future
> > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > just
> > > > > > > > > > > > > > > > > > >> need
> > > > > > > > > > > > > > > > > > >> >> to
> > > > > > > > > > > > > > > > > > >> >> > > > think
> > > > > > > > > > > > > > > > > > >> >> > > > > >> about letting it inherit
> > > from
> > > > > the
> > > > > > > > super
> > > > > > > > > > > > > exception
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> >> rest
> > > > > > > > > > > > > > > > > > >> >> > > of
> > > > > > > > > > > > > > > > > > >> >> > > > > >> the code will just
> behave
> > > > > > correctly
> > > > > > > > > > without
> > > > > > > > > > > > the
> > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > wrap
> > > > > > > > > > > > > > > > > > >> >> the
> > > > > > > > > > > > > > > > > > >> >> > new
> > > > > > > > > > > > > > > > > > >> >> > > > > >> exception into another
> > > exception
> > > > > > > each
> > > > > > > > > time
> > > > > > > > > > > it
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > thrown
> > > > > > > > > > > > > > > > > > >> (maybe
> > > > > > > > > > > > > > > > > > >> >> it
> > > > > > > > > > > > > > > > > > >> >> > is
> > > > > > > > > > > > > > > > > > >> >> > > > > >> thrown at different
> > > location in
> > > > > > the
> > > > > > > > > code).
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> As far as I understand
> the
> > > > > > following
> > > > > > > > > > > statement
> > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > your
> > > > > > > > > > > > > > > > > > >> >> previous
> > > > > > > > > > > > > > > > > > >> >> > > > e-mail
> > > > > > > > > > > > > > > > > > >> >> > > > > >> is the reason that
> > currently
> > > > > such
> > > > > > a
> > > > > > > > > super
> > > > > > > > > > > > > > exception
> > > > > > > > > > > > > > > > is not
> > > > > > > > > > > > > > > > > > >> >> > possible:
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> "Right now we have
> > > > > > > RetriableException
> > > > > > > > > > type,
> > > > > > > > > > > if
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > going
> > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > >> >> > add a
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > `ProducerRetriableException`
> > > > > type,
> > > > > > > we
> > > > > > > > > have
> > > > > > > > > > > to
> > > > > > > > > > > > > put
> > > > > > > > > > > > > > > > this new
> > > > > > > > > > > > > > > > > > >> >> > interface
> > > > > > > > > > > > > > > > > > >> >> > > > as
> > > > > > > > > > > > > > > > > > >> >> > > > > >> the parent of the
> > > > > > > RetriableException,
> > > > > > > > > > > because
> > > > > > > > > > > > > not
> > > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > thrown
> > > > > > > > > > > > > > > > > > >> >> > > non-fatal
> > > > > > > > > > > > > > > > > > >> >> > > > > >> exceptions are
> `retriable`
> > > in
> > > > > > > general
> > > > > > > > > for
> > > > > > > > > > > > > > producer"
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> In the list of
> exceptions
> > in
> > > > > your
> > > > > > > > KIP, I
> > > > > > > > > > > found
> > > > > > > > > > > > > > > > non-fatal
> > > > > > > > > > > > > > > > > > >> >> > exceptions
> > > > > > > > > > > > > > > > > > >> >> > > > that
> > > > > > > > > > > > > > > > > > >> >> > > > > >> do not inherit from
> > > > > > > > RetriableException.
> > > > > > > > > I
> > > > > > > > > > > > guess
> > > > > > > > > > > > > > > those
> > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> >> ones
> > > > > > > > > > > > > > > > > > >> >> > > you
> > > > > > > > > > > > > > > > > > >> >> > > > > >> are referring to in your
> > > > > > statement:
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > InvalidProducerEpochException
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> InvalidPidMappingException
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > TransactionAbortedException
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> All of those exceptions
> > are
> > > > > > > non-fatal
> > > > > > > > > and
> > > > > > > > > > do
> > > > > > > > > > > > not
> > > > > > > > > > > > > > > > inherit
> > > > > > > > > > > > > > > > > > >> from
> > > > > > > > > > > > > > > > > > >> >> > > > > >> RetriableException. Is
> > > there a
> > > > > > > reason
> > > > > > > > > for
> > > > > > > > > > > > that?
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > > > they
> > > > > > > > > > > > > > > > > > >> >> depended
> > > > > > > > > > > > > > > > > > >> >> > > from
> > > > > > > > > > > > > > > > > > >> >> > > > > >> RetriableException we
> > would
> > > be a
> > > > > > bit
> > > > > > > > > > closer
> > > > > > > > > > > > to a
> > > > > > > > > > > > > > > super
> > > > > > > > > > > > > > > > > > >> >> exception I
> > > > > > > > > > > > > > > > > > >> >> > > > > >> mention above.
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> The reason is that
> sender
> > > may
> > > > > > catch
> > > > > > > > > those
> > > > > > > > > > > > > > exceptions
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> >> > > > > > ProduceResponse, and it
> > > currently
> > > > > > > does
> > > > > > > > > > > infinite
> > > > > > > > > > > > > > > > > > >> >> > > > > > retries on
> > > RetriableException. To
> > > > > > > make
> > > > > > > > > sure
> > > > > > > > > > > we
> > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > trigger
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> >> > > > > > abortTransaction() by
> > > catching
> > > > > > > > non-fatal
> > > > > > > > > > > thrown
> > > > > > > > > > > > > > > > > > >> >> > > > > > exceptions and
> reinitialize
> > > the
> > > > > txn
> > > > > > > > > state,
> > > > > > > > > > we
> > > > > > > > > > > > > chose
> > > > > > > > > > > > > > > > not to
> > > > > > > > > > > > > > > > > > >> let
> > > > > > > > > > > > > > > > > > >> >> > those
> > > > > > > > > > > > > > > > > > >> >> > > > > > exceptions inherit
> > > > > > RetriableException
> > > > > > > > so
> > > > > > > > > > that
> > > > > > > > > > > > > > > > > > >> >> > > > > > they won't cause infinite
> > > retry
> > > > > on
> > > > > > > > > sender.
> > > > > > > > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> With
> > > OutOfOrderSequenceException
> > > > > > and
> > > > > > > > > > > > > > > > > > >> >> UnknownProducerIdException, I
> > > > > > > > > > > > > > > > > > >> >> > > > think
> > > > > > > > > > > > > > > > > > >> >> > > > > >> to understand that their
> > > > > fatality
> > > > > > > > > depends
> > > > > > > > > > on
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > type
> > > > > > > > > > > > > > > > > > (i.e.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> configuration) of the
> > > producer.
> > > > > > That
> > > > > > > > > makes
> > > > > > > > > > > it
> > > > > > > > > > > > > > > > difficult to
> > > > > > > > > > > > > > > > > > >> >> have a
> > > > > > > > > > > > > > > > > > >> >> > > > super
> > > > > > > > > > > > > > > > > > >> >> > > > > >> exception that encodes
> the
> > > > > > > > retriability
> > > > > > > > > as
> > > > > > > > > > > > > > mentioned
> > > > > > > > > > > > > > > > > > above.
> > > > > > > > > > > > > > > > > > >> >> Would
> > > > > > > > > > > > > > > > > > >> >> > it
> > > > > > > > > > > > > > > > > > >> >> > > > be
> > > > > > > > > > > > > > > > > > >> >> > > > > >> possible to introduce
> > > exceptions
> > > > > > > that
> > > > > > > > > > > inherit
> > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > >> >> > > RetriableException
> > > > > > > > > > > > > > > > > > >> >> > > > > >> and that are thrown when
> > > those
> > > > > > > > > exceptions
> > > > > > > > > > > are
> > > > > > > > > > > > > > caught
> > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> >> > > brokers
> > > > > > > > > > > > > > > > > > >> >> > > > > >> and the type of the
> > > producer is
> > > > > > such
> > > > > > > > > that
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > exceptions
> > > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > >> >> > > > retriable?
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> Yea, I think in general
> > the
> > > > > > > exception
> > > > > > > > > type
> > > > > > > > > > > > > mixing
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > >> difficult
> > > > > > > > > > > > > > > > > > >> >> to
> > > > > > > > > > > > > > > > > > >> >> > > get
> > > > > > > > > > > > > > > > > > >> >> > > > > > them all right. I have
> > > already
> > > > > > > proposed
> > > > > > > > > > > another
> > > > > > > > > > > > > > > > solution
> > > > > > > > > > > > > > > > > > >> based
> > > > > > > > > > > > > > > > > > >> >> on
> > > > > > > > > > > > > > > > > > >> >> > my
> > > > > > > > > > > > > > > > > > >> >> > > > > > offline discussion with
> > some
> > > > > folks
> > > > > > > > > working
> > > > > > > > > > on
> > > > > > > > > > > > EOS
> > > > > > > > > > > > > > > > > > >> >> > > > > > to make the handling more
> > > > > > > > straightforward
> > > > > > > > > > for
> > > > > > > > > > > > end
> > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > > >> without
> > > > > > > > > > > > > > > > > > >> >> the
> > > > > > > > > > > > > > > > > > >> >> > > > need
> > > > > > > > > > > > > > > > > > >> >> > > > > > to distinguish exception
> > > > > fatality.
> > > > > > > > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> Best,
> > > > > > > > > > > > > > > > > > >> >> > > > > >> Bruno
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> On 04.12.20 19:34,
> > Guozhang
> > > Wang
> > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > Thanks Boyang for the
> > > > > proposal!
> > > > > > I
> > > > > > > > > made a
> > > > > > > > > > > > pass
> > > > > > > > > > > > > > over
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> list
> > > > > > > > > > > > > > > > > > >> >> and
> > > > > > > > > > > > > > > > > > >> >> > > here
> > > > > > > > > > > > > > > > > > >> >> > > > > are
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > some thoughts:
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > 0) Although this is
> not
> > > part
> > > > > of
> > > > > > > the
> > > > > > > > > > public
> > > > > > > > > > > > > API,
> > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > >> >> > should
> > > > > > > > > > > > > > > > > > >> >> > > > make
> > > > > > > > > > > > > > > > > > >> >> > > > > >> sure
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > that the suggested
> > pattern
> > > > > (i.e.
> > > > > > > > user
> > > > > > > > > > can
> > > > > > > > > > > > > always
> > > > > > > > > > > > > > > > call
> > > > > > > > > > > > > > > > > > >> >> abortTxn()
> > > > > > > > > > > > > > > > > > >> >> > > > when
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > handling non-fatal
> > > errors) are
> > > > > > > > indeed
> > > > > > > > > > > > > supported.
> > > > > > > > > > > > > > > > E.g. if
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> >> txn
> > > > > > > > > > > > > > > > > > >> >> > > is
> > > > > > > > > > > > > > > > > > >> >> > > > > >> already
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > aborted by the broker
> > > side,
> > > > > then
> > > > > > > > users
> > > > > > > > > > can
> > > > > > > > > > > > > still
> > > > > > > > > > > > > > > > call
> > > > > > > > > > > > > > > > > > >> >> abortTxn
> > > > > > > > > > > > > > > > > > >> >> > > which
> > > > > > > > > > > > > > > > > > >> >> > > > > >> would
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > not throw another
> > > exception
> > > > > but
> > > > > > > just
> > > > > > > > > be
> > > > > > > > > > > > > treated
> > > > > > > > > > > > > > > as a
> > > > > > > > > > > > > > > > > > >> no-op.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > 1)
> > > > > > > > *ConcurrentTransactionsException*:
> > > > > > > > > I
> > > > > > > > > > > > think
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > error
> > > > > > > > > > > > > > > > > > >> can
> > > > > > > > > > > > > > > > > > >> >> > also
> > > > > > > > > > > > > > > > > > >> >> > > be
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > returned but not
> > > documented
> > > > > yet.
> > > > > > > > This
> > > > > > > > > > > should
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > >> non-fatal
> > > > > > > > > > > > > > > > > > >> >> > error.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > 2)
> > > *InvalidTxnStateException*:
> > > > > > > this
> > > > > > > > > > error
> > > > > > > > > > > is
> > > > > > > > > > > > > > > > returned
> > > > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > >> >> > broker
> > > > > > > > > > > > > > > > > > >> >> > > > when
> > > > > > > > > > > > > > > > > > >> >> > > > > >> txn
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > state transition
> failed
> > > (e.g.
> > > > > it
> > > > > > > is
> > > > > > > > > > trying
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > transit to
> > > > > > > > > > > > > > > > > > >> >> > > > > complete-commit
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > while the current
> state
> > > is not
> > > > > > > > > > > > > prepare-commit).
> > > > > > > > > > > > > > > This
> > > > > > > > > > > > > > > > > > error
> > > > > > > > > > > > > > > > > > >> >> could
> > > > > > > > > > > > > > > > > > >> >> > > > > >> indicates
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > a bug on the client
> > > internal
> > > > > > code
> > > > > > > or
> > > > > > > > > the
> > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > code,
> > > > > > > > > > > > > > > > > > OR a
> > > > > > > > > > > > > > > > > > >> >> user
> > > > > > > > > > > > > > > > > > >> >> > > > error
> > > > > > > > > > > > > > > > > > >> >> > > > > >> --- a
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > similar error is
> > > > > > > > > > > > > > ConcurrentTransactionsException,
> > > > > > > > > > > > > > > > i.e.
> > > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > >> >> Kafka
> > > > > > > > > > > > > > > > > > >> >> > is
> > > > > > > > > > > > > > > > > > >> >> > > > > >> bug-free
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > these exceptions would
> > > only be
> > > > > > > > > returned
> > > > > > > > > > if
> > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > try to
> > > > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > >> >> > > something
> > > > > > > > > > > > > > > > > > >> >> > > > > >> wrong,
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > e.g. calling abortTxn
> > > right
> > > > > > after
> > > > > > > a
> > > > > > > > > > > > commitTxn,
> > > > > > > > > > > > > > > etc.
> > > > > > > > > > > > > > > > So
> > > > > > > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > > >> >> > > thinking
> > > > > > > > > > > > > > > > > > >> >> > > > it
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > should be a non-fatal
> > > error
> > > > > > > instead
> > > > > > > > > of a
> > > > > > > > > > > > fatal
> > > > > > > > > > > > > > > > error,
> > > > > > > > > > > > > > > > > > >> wdyt?
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > 3) *KafkaException*:
> > case
> > > i
> > > > > > > > "indicates
> > > > > > > > > > > fatal
> > > > > > > > > > > > > > > > > > transactional
> > > > > > > > > > > > > > > > > > >> >> > > sequence
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > (Fatal)", this is a
> bit
> > > > > > > conflicting
> > > > > > > > > with
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> >> > > > *OutOfSequenceException*
> > > > > > > > > > > > > > > > > > >> >> > > > > >> that
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > is treated as
> > non-fatal. I
> > > > > guess
> > > > > > > > your
> > > > > > > > > > > > proposal
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > OutOfOrderSequenceException
> > > > > > would
> > > > > > > be
> > > > > > > > > > > treated
> > > > > > > > > > > > > > > either
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > >> fatal
> > > > > > > > > > > > > > > > > > >> >> > with
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > transactional
> producer,
> > or
> > > > > > > non-fatal
> > > > > > > > > > with
> > > > > > > > > > > > > > > idempotent
> > > > > > > > > > > > > > > > > > >> >> producer,
> > > > > > > > > > > > > > > > > > >> >> > is
> > > > > > > > > > > > > > > > > > >> >> > > > that
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > right? If the producer
> > is
> > > only
> > > > > > > > > > configured
> > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > idempotency
> > > > > > > > > > > > > > > > > > >> >> but
> > > > > > > > > > > > > > > > > > >> >> > not
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > transaction, then
> > > throwing a
> > > > > > > > > > > > > > > > > > >> >> TransactionStateCorruptedException
> > > > > > > > > > > > > > > > > > >> >> > > for
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > non-fatal errors would
> > be
> > > > > > > confusing
> > > > > > > > > > since
> > > > > > > > > > > > > users
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > > >> using
> > > > > > > > > > > > > > > > > > >> >> > > > > >> transactions
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > at all.. So I suggest
> we
> > > > > always
> > > > > > > > throw
> > > > > > > > > > > > > > > > > > >> OutOfSequenceException
> > > > > > > > > > > > > > > > > > >> >> > as-is
> > > > > > > > > > > > > > > > > > >> >> > > > > (i.e.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > not wrapped) no matter
> > > how the
> > > > > > > > > producer
> > > > > > > > > > is
> > > > > > > > > > > > > > > > configured,
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > >> >> let
> > > > > > > > > > > > > > > > > > >> >> > the
> > > > > > > > > > > > > > > > > > >> >> > > > > >> caller
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > decide how to handle
> it
> > > based
> > > > > on
> > > > > > > > > whether
> > > > > > > > > > > it
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > only
> > > > > > > > > > > > > > > > > > >> >> idempotent
> > > > > > > > > > > > > > > > > > >> >> > or
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > transactional itself.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > 4) Besides all the txn
> > > APIs,
> > > > > the
> > > > > > > > > > `send()`
> > > > > > > > > > > > > > > callback /
> > > > > > > > > > > > > > > > > > >> future
> > > > > > > > > > > > > > > > > > >> >> can
> > > > > > > > > > > > > > > > > > >> >> > > also
> > > > > > > > > > > > > > > > > > >> >> > > > > >> throw
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > txn-related
> exceptions,
> > I
> > > > > think
> > > > > > > this
> > > > > > > > > KIP
> > > > > > > > > > > > > should
> > > > > > > > > > > > > > > also
> > > > > > > > > > > > > > > > > > cover
> > > > > > > > > > > > > > > > > > >> >> this
> > > > > > > > > > > > > > > > > > >> >> > > API
> > > > > > > > > > > > > > > > > > >> >> > > > as
> > > > > > > > > > > > > > > > > > >> >> > > > > >> well?
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > 5) This is related to
> > 1/2)
> > > > > > above:
> > > > > > > > > > > sometimes
> > > > > > > > > > > > > > those
> > > > > > > > > > > > > > > > > > >> non-fatal
> > > > > > > > > > > > > > > > > > >> >> > errors
> > > > > > > > > > > > > > > > > > >> >> > > > > like
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > ConcurrentTxn or
> > > > > InvalidTxnState
> > > > > > > are
> > > > > > > > > not
> > > > > > > > > > > due
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > state
> > > > > > > > > > > > > > > > > > >> >> being
> > > > > > > > > > > > > > > > > > >> >> > > > > >> corrupted
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > at the broker side,
> but
> > > maybe
> > > > > > > users
> > > > > > > > > are
> > > > > > > > > > > > doing
> > > > > > > > > > > > > > > > something
> > > > > > > > > > > > > > > > > > >> >> wrong.
> > > > > > > > > > > > > > > > > > >> >> > So
> > > > > > > > > > > > > > > > > > >> >> > > > I'm
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > wondering if we should
> > > further
> > > > > > > > > > distinguish
> > > > > > > > > > > > > those
> > > > > > > > > > > > > > > > > > non-fatal
> > > > > > > > > > > > > > > > > > >> >> > errors
> > > > > > > > > > > > > > > > > > >> >> > > > > >> between
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > a) those that are
> caused
> > > by
> > > > > > Kafka
> > > > > > > > > > itself,
> > > > > > > > > > > > > e.g. a
> > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > >> timed
> > > > > > > > > > > > > > > > > > >> >> > out
> > > > > > > > > > > > > > > > > > >> >> > > > and
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > aborted a txn and
> later
> > an
> > > > > > endTxn
> > > > > > > > > > request
> > > > > > > > > > > is
> > > > > > > > > > > > > > > > received,
> > > > > > > > > > > > > > > > > > >> and b)
> > > > > > > > > > > > > > > > > > >> >> > the
> > > > > > > > > > > > > > > > > > >> >> > > > > user's
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > API call pattern is
> > > incorrect,
> > > > > > > > causing
> > > > > > > > > > the
> > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > to be
> > > > > > > > > > > > > > > > > > >> >> > rejected
> > > > > > > > > > > > > > > > > > >> >> > > > with
> > > > > > > > > > > > > > > > > > >> >> > > > > >> an
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > error code from the
> > > broker.
> > > > > > > > > > > > > > > > > > >> >>
> *TransactionStateCorruptedException*
> > > > > > > > > > > > > > > > > > >> >> > > > feels
> > > > > > > > > > > > > > > > > > >> >> > > > > >> to
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > me more like for case
> > a),
> > > but
> > > > > > not
> > > > > > > > case
> > > > > > > > > > b).
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > Guozhang
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > On Wed, Dec 2, 2020 at
> > > 4:50 PM
> > > > > > > > Boyang
> > > > > > > > > > > Chen <
> > > > > > > > > > > > > > > > > > >> >> > > > > reluctanthero...@gmail.com
> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> > wrote:
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> Thanks Matthias, I
> > think
> > > your
> > > > > > > > > proposal
> > > > > > > > > > > > makes
> > > > > > > > > > > > > > > sense
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > >> well,
> > > > > > > > > > > > > > > > > > >> >> on
> > > > > > > > > > > > > > > > > > >> >> > > the
> > > > > > > > > > > > > > > > > > >> >> > > > > pro
> > > > > > > > > > > > > > > > > > >> >> > > > > >> side
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> we could have a
> > > universally
> > > > > > > agreed
> > > > > > > > > > > > exception
> > > > > > > > > > > > > > type
> > > > > > > > > > > > > > > > to be
> > > > > > > > > > > > > > > > > > >> >> caught
> > > > > > > > > > > > > > > > > > >> >> > by
> > > > > > > > > > > > > > > > > > >> >> > > > the
> > > > > > > > > > > > > > > > > > >> >> > > > > >> user,
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> without having an
> extra
> > > layer
> > > > > > on
> > > > > > > > top
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > > > > > actual
> > > > > > > > > > > > > > > > > > >> >> exceptions.
> > > > > > > > > > > > > > > > > > >> >> > I
> > > > > > > > > > > > > > > > > > >> >> > > > > could
> > > > > > > > > > > > > > > > > > >> >> > > > > >> see
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> some issue on
> > downsides:
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> 1. The exception
> > > hierarchy
> > > > > will
> > > > > > > be
> > > > > > > > > more
> > > > > > > > > > > > > > complex.
> > > > > > > > > > > > > > > > Right
> > > > > > > > > > > > > > > > > > >> now
> > > > > > > > > > > > > > > > > > >> >> we
> > > > > > > > > > > > > > > > > > >> >> > > have
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> RetriableException
> > type,
> > > if
> > > > > we
> > > > > > > are
> > > > > > > > > > going
> > > > > > > > > > > to
> > > > > > > > > > > > > > add a
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > `ProducerRetriableException`
> > > > > > > type,
> > > > > > > > we
> > > > > > > > > > > have
> > > > > > > > > > > > to
> > > > > > > > > > > > > > put
> > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > >> new
> > > > > > > > > > > > > > > > > > >> >> > > > interface
> > > > > > > > > > > > > > > > > > >> >> > > > > >> as the
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> parent of the
> > > > > > RetriableException,
> > > > > > > > > > because
> > > > > > > > > > > > not
> > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > thrown
> > > > > > > > > > > > > > > > > > >> >> > > non-fatal
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> exceptions are
> > > `retriable` in
> > > > > > > > general
> > > > > > > > > > for
> > > > > > > > > > > > > > > > producer, for
> > > > > > > > > > > > > > > > > > >> >> example
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> <
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> >
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
> https://github.com/apache/kafka/blob/e275742f850af4a1b79b0d1bd1ac9a1d2e89c64e/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L745
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> .
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> We could have an
> empty
> > > > > > interface
> > > > > > > > > > > > > > > > > > >> >> `ProducerRetriableException`
> > > > > > > > > > > > > > > > > > >> >> > to
> > > > > > > > > > > > > > > > > > >> >> > > > let
> > > > > > > > > > > > > > > > > > >> >> > > > > >> all
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> the thrown exceptions
> > > > > implement
> > > > > > > for
> > > > > > > > > > sure,
> > > > > > > > > > > > but
> > > > > > > > > > > > > > > it's
> > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > bit
> > > > > > > > > > > > > > > > > > >> >> > > unorthodox
> > > > > > > > > > > > > > > > > > >> >> > > > > in
> > > > > > > > > > > > > > > > > > >> >> > > > > >> the
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> way we deal with
> > > exceptions
> > > > > in
> > > > > > > > > general.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> 2. There are cases
> > where
> > > we
> > > > > > > throw a
> > > > > > > > > > > > > > > KafkaException
> > > > > > > > > > > > > > > > > > >> wrapping
> > > > > > > > > > > > > > > > > > >> >> > > another
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> KafkaException as
> > either
> > > > > fatal
> > > > > > or
> > > > > > > > > > > > non-fatal.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > >> >> > > interface
> > > > > > > > > > > > > > > > > > >> >> > > > > to
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> solve #1, it is also
> > > required
> > > > > > to
> > > > > > > > > > > implement
> > > > > > > > > > > > > > > another
> > > > > > > > > > > > > > > > > > >> bloated
> > > > > > > > > > > > > > > > > > >> >> > > > exception
> > > > > > > > > > > > > > > > > > >> >> > > > > >> class
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> which could replace
> > > > > > > KafkaException
> > > > > > > > > > type,
> > > > > > > > > > > as
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > couldn't
> > > > > > > > > > > > > > > > > > >> mark
> > > > > > > > > > > > > > > > > > >> >> > > > > >> KafkaException
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> as retriable for
> sure.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> 3. In terms of the
> > > > > > encapsulation,
> > > > > > > > > > > wrapping
> > > > > > > > > > > > > > means
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > >> >> limit
> > > > > > > > > > > > > > > > > > >> >> > > the
> > > > > > > > > > > > > > > > > > >> >> > > > > >> scope
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> of affection to the
> > > producer
> > > > > > > only,
> > > > > > > > > > which
> > > > > > > > > > > is
> > > > > > > > > > > > > > > > important
> > > > > > > > > > > > > > > > > > >> since
> > > > > > > > > > > > > > > > > > >> >> we
> > > > > > > > > > > > > > > > > > >> >> > > > don't
> > > > > > > > > > > > > > > > > > >> >> > > > > >> want
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> shared exception
> types
> > to
> > > > > > > > implement a
> > > > > > > > > > > > > > > > producer-related
> > > > > > > > > > > > > > > > > > >> >> > interface,
> > > > > > > > > > > > > > > > > > >> >> > > > > such
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> as
> > > > > > > > UnknownTopicOrPartitionException.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> Best,
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> Boyang
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >> On Wed, Dec 2, 2020
> at
> > > 3:38
> > > > > PM
> > > > > > > > > Matthias
> > > > > > > > > > > J.
> > > > > > > > > > > > > Sax
> > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > >> >> > mj...@apache.org
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> wrote:
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> Thanks for the KIP
> > > Boyang!
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> Overall,
> categorizing
> > > > > > exceptions
> > > > > > > > > > makes a
> > > > > > > > > > > > lot
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > sense.
> > > > > > > > > > > > > > > > > > >> As I
> > > > > > > > > > > > > > > > > > >> >> > > don't
> > > > > > > > > > > > > > > > > > >> >> > > > > know
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> the producer
> internals
> > > well
> > > > > > > > enough,
> > > > > > > > > I
> > > > > > > > > > > > cannot
> > > > > > > > > > > > > > > > comment
> > > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> categorization in
> > detail
> > > > > > though.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> What I am wondering
> > is,
> > > if
> > > > > we
> > > > > > > > should
> > > > > > > > > > > > > introduce
> > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > >> exception
> > > > > > > > > > > > > > > > > > >> >> > > > > interface
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> that non-fatal
> > exception
> > > > > would
> > > > > > > > > > implement
> > > > > > > > > > > > > > instead
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > >> >> creating a
> > > > > > > > > > > > > > > > > > >> >> > > new
> > > > > > > > > > > > > > > > > > >> >> > > > > >> class
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> that will wrap
> > non-fatal
> > > > > > > > exceptions?
> > > > > > > > > > > What
> > > > > > > > > > > > > > would
> > > > > > > > > > > > > > > > be the
> > > > > > > > > > > > > > > > > > >> >> > pros/cons
> > > > > > > > > > > > > > > > > > >> >> > > > for
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> both designs?
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> -Matthias
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> On 12/2/20 11:35 AM,
> > > Boyang
> > > > > > Chen
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>> Hey there,
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>> I would like to
> > start a
> > > > > > > > discussion
> > > > > > > > > > > thread
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > KIP-691:
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> >
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-691%3A+Enhance+Transactional+Producer+Exception+Handling
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>> The KIP is aiming
> to
> > > > > simplify
> > > > > > > the
> > > > > > > > > > > > exception
> > > > > > > > > > > > > > > > handling
> > > > > > > > > > > > > > > > > > >> logic
> > > > > > > > > > > > > > > > > > >> >> > for
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>> transactional
> > Producer
> > > > > users
> > > > > > by
> > > > > > > > > > > > classifying
> > > > > > > > > > > > > > > > fatal and
> > > > > > > > > > > > > > > > > > >> >> > non-fatal
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> exceptions
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>> and throw them
> > > > > > correspondingly
> > > > > > > > for
> > > > > > > > > > > easier
> > > > > > > > > > > > > > catch
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > >> retry.
> > > > > > > > > > > > > > > > > > >> >> > Let
> > > > > > > > > > > > > > > > > > >> >> > > me
> > > > > > > > > > > > > > > > > > >> >> > > > > >> know
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>> what
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>> you think.
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>> Best,
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>> Boyang
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >> >
> > > > > > > > > > > > > > > > > > >> >> > > > > >>
> > > > > > > > > > > > > > > > > > >> >> > > > > >
> > > > > > > > > > > > > > > > > > >> >> > > > >
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > > > --
> > > > > > > > > > > > > > > > > > >> >> > > > -- Guozhang
> > > > > > > > > > > > > > > > > > >> >> > > >
> > > > > > > > > > > > > > > > > > >> >> > >
> > > > > > > > > > > > > > > > > > >> >> >
> > > > > > > > > > > > > > > > > > >> >> >
> > > > > > > > > > > > > > > > > > >> >> > --
> > > > > > > > > > > > > > > > > > >> >> > -- Guozhang
> > > > > > > > > > > > > > > > > > >> >> >
> > > > > > > > > > > > > > > > > > >> >>
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > --
> > > > > > > > > > > > > > > > > > >> > -- Guozhang
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> --
> > > > > > > > > > > > > > > > > > >> -- Guozhang
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > -- Guozhang
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > -- Guozhang
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > -- Guozhang
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to