Hey Alieh,

I see there are two options now. So folks will be discussing the approaches
and deciding the best way forward before we vote?
I do think there could be a problem with the approach on commit if we get
stuck on an earlier error and have more records (potentially on new
partitions) to commit as the current PR is implemented.

I guess this takes us back to the question of whether the error should be
cleared on send.

(And I guess at the back of my mind, I'm wondering if there is a way we can
validate the "posion pill" records application side before we even try to
send them)

Justine

On Wed, Jun 26, 2024 at 4:38 PM Alieh Saeedi <asae...@confluent.io.invalid>
wrote:

> Hi Justine,
>
> I did not update the KIP with `TxnSendOption` since I thought it'd be
> better discussed here beforehand.
> right now, there are 2 PRs:
> - the PR that implements the current version of the KIP:
> https://github.com/apache/kafka/pull/16332
> - the POC PR that clarifies the `TxnSendOption`:
> https://github.com/apache/kafka/pull/16465
>
> Bests,
> Alieh
>
> On Thu, Jun 27, 2024 at 12:42 AM Justine Olshan
> <jols...@confluent.io.invalid> wrote:
>
> > Hey Alieh,
> >
> > I think I am a little confused. Are the 3 points above addressed by the
> KIP
> > or did something change? The PR seems to not include this change and
> still
> > has the CommitOption as well.
> >
> > Thanks,
> > Justine
> >
> > On Wed, Jun 26, 2024 at 2:15 PM Alieh Saeedi
> <asae...@confluent.io.invalid
> > >
> > wrote:
> >
> > > Hi all,
> > >
> > >
> > > Looking at the PR <https://github.com/apache/kafka/pull/16332>
> > > corresponding to the KIP, there are some points worthy of mention:
> > >
> > >
> > > 1) clearing the error ends up dirty/messy code in `TransactionManager`.
> > >
> > > 2) By clearing the error, we are actually doing an illegal transition
> > from
> > > `ABORTABLE_ERROR` to `IN_TRANSACTION` which is conceptually not
> > acceptable.
> > > This can be the root cause of some issues, with perhaps further future
> > > changes by others.
> > >
> > > 3) If the poison pill record `r1` causes a transition to the error
> state
> > > and then the next record `r2` requires adding a partition to the
> > > transaction, the action fails due to being in the error state. In this
> > > case, clearing errors during `commitTnx(CLEAR_SEND_ERROR)` is too late.
> > > However, this case can NOT be the main concern as soon as KIP-890 is
> > fully
> > > implemented.
> > >
> > >
> > > My suggestion is to solve the problem where it arises. If the
> transition
> > to
> > > the error state does not happen during `send()`, we do not need to
> clear
> > > the error later. Therefore, instead of `CommitOption`, we can define a
> > > `TxnSendOption` and prevent the `send()` method from going to the error
> > > state in case 1) we're in a transaction and 2) the user asked for
> > > IGONRE_SEND_ERRORS. For more clarity, you can take a look at the POC PR
> > > <https://github.com/apache/kafka/pull/16465>.
> > >
> > > Cheers,
> > > Alieh
> > >
> >
>

Reply via email to