Hi Rowland,

Thank you for your feedback.  Using an explicit prepare RPC was discussed
and is listed in the rejected alternatives:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-939%3A+Support+Participation+in+2PC#KIP939:SupportParticipationin2PC-Explicit%E2%80%9Cprepare%E2%80%9DRPC.
Basically, even if we had an explicit prepare RPC, it wouldn't avoid the
fact that a crashed client could cause a blocking transaction.  This,
btw, is not just a property of this concrete proposal; it's a fundamental
trade-off of any form of 2PC -- any 2PC implementation must allow for
indefinitely "in-doubt" transactions that cannot be unilaterally and
automatically resolved within one participant.
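
To make the trade-off concrete, here is a small, purely illustrative
Python sketch (not Kafka code; all names are invented) of why a prepared
participant cannot resolve its transaction on its own:

```python
# Illustrative model of a 2PC participant; not Kafka code.
# Possible participant-side transaction states.
ACTIVE, PREPARED, COMMITTED, ABORTED = "active", "prepared", "committed", "aborted"

class Participant:
    def __init__(self):
        self.txns = {}

    def begin(self, txn_id):
        self.txns[txn_id] = ACTIVE

    def prepare(self, txn_id):
        # After acknowledging "prepared", the participant has promised the
        # coordinator it can still commit; it may no longer decide alone.
        self.txns[txn_id] = PREPARED

    def coordinator_lost(self, txn_id):
        # An ACTIVE transaction can be safely aborted; a PREPARED one must
        # stay in-doubt until the coordinator (or an operator) decides.
        if self.txns[txn_id] == ACTIVE:
            self.txns[txn_id] = ABORTED
        # PREPARED: do nothing -- potentially forever.

p = Participant()
p.begin("t1"); p.prepare("t1")
p.begin("t2")
p.coordinator_lost("t1")
p.coordinator_lost("t2")
print(p.txns["t1"], p.txns["t2"])  # t1 stays in-doubt, t2 is aborted
```

The same asymmetry holds regardless of whether the prepare decision is
communicated by an explicit RPC or kept entirely in the external
coordinator, as KIP-939 proposes.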

To mitigate the issue, using 2PC requires a special permission, so that
the Kafka admin can ensure that only applications that follow proper
availability standards (i.e. will automatically restart and clean up
after a crash) are allowed to utilize 2PC.  It is also assumed
that any practical deployment utilizing 2PC would have monitoring set up,
so that an operator could be alerted to investigate and manually resolve
in-doubt transactions (the metric and tooling support for doing so are also
described in the KIP).
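
For reference, the client-side opt-in the KIP describes is a single
producer setting; a hedged sketch of the producer configuration
(transactional.id value is a placeholder, and the 2PC permission itself
is granted separately by the admin):

```properties
# Producer opting into 2PC per KIP-939; requires that this principal has
# been granted the corresponding 2PC permission by the Kafka admin.
transaction.two.phase.commit.enable=true
transactional.id=my-2pc-app
```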

For XA support, I wonder if we could take a layered approach and store XA
information in a separate store, say in a compacted topic.  This way, the
core Kafka protocol could be decoupled from specific implementations (and
extra performance requirements that a specific implementation may impose)
and serve as a foundation for multiple implementations.
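
Logically, a compacted topic is a durable map from key to latest value,
so a hypothetical XA layer could keep one record per XID there.  A rough
Python model of that layering (all names and record shapes are invented
for illustration):

```python
# Toy model of a log-compacted topic: the log records every append, but
# compaction retains only the latest record per key (here, per XID).
def compact(log):
    latest = {}
    for key, value in log:  # later records win
        latest[key] = value
    return latest

# Hypothetical XA-state records layered on top of core Kafka transactions.
xa_log = [
    ("xid-1", {"state": "prepared", "kafka_txn_id": "app-1"}),
    ("xid-2", {"state": "prepared", "kafka_txn_id": "app-2"}),
    ("xid-1", {"state": "committed", "kafka_txn_id": "app-1"}),
]

state = compact(xa_log)
# During XA recovery, only still-prepared XIDs need to be reported.
prepared = [xid for xid, rec in state.items() if rec["state"] == "prepared"]
print(prepared)  # ['xid-2']
```

The point of the layering is that the core protocol never needs to know
about XIDs; the XA implementation replays the compacted topic on recovery
to answer "which transactions are prepared".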

-Artem

On Sun, Feb 4, 2024 at 1:37 PM Rowland Smith <rowl...@gmail.com> wrote:

> Hi Artem,
>
> It has been a while, but I have gotten back to this. I understand that when
> 2PC is used, the transaction timeout will be effectively infinite. I don't
> think that this behavior is desirable. A long running transaction can be
> extremely disruptive since it blocks consumers on any partitions written to
> within the pending transaction. The primary reason for a long running
> transaction is a failure of the client, or the network connecting the
> client to the broker. If such a failure occurs before the client calls
> the new prepareTransaction() method, it should be OK to abort the
> transaction after a relatively short timeout period. This approach would
> minimize the inconvenience and disruption of a long running transaction
> blocking consumers, and provide higher availability for a system using
> Kafka.
>
> In order to achieve this behavior, I think we would need a 'prepare' RPC
> call so that the server knows that a transaction has been prepared, and
> does not time out and abort such transactions. There will be some cost to
> this extra RPC call, but there will also be a benefit of better system
> availability in case of failures.
>
> There is another reason why I would prefer this implementation. I am
> working on an XA KIP, and XA requires that Kafka brokers be able to provide
> a list of prepared transactions during recovery.  The broker can only know
> that a transaction has been prepared if an RPC call is made, so my KIP
> will need this functionality. In the XA KIP, I would like to use as much of
> the KIP-939 solution as possible, so it would be helpful if
> prepareTransactions() sent a 'prepare' RPC, and the broker recorded the
> prepared transaction state.
>
> This could be made configurable behavior if we are concerned that the cost
> of the extra RPC call is too much, and that some users would prefer to have
> speed in exchange for less system availability in some cases of client or
> network failure.
>
> Let me know what you think.
>
> -Rowland
>
> On Fri, Jan 5, 2024 at 8:03 PM Artem Livshits
> <alivsh...@confluent.io.invalid> wrote:
>
> > Hi Rowland,
> >
> > Thank you for the feedback.  For the 2PC cases, the expectation is that
> the
> > timeout on the client would be set to "effectively infinite", that would
> > exceed all practical 2PC delays.  Now I think that this flexibility is
> > confusing and can be misused, I have updated the KIP to just say that if
> > 2PC is used, the transaction never expires.
> >
> > -Artem
> >
> > On Thu, Jan 4, 2024 at 6:14 PM Rowland Smith <rowl...@gmail.com> wrote:
> >
> > > It is probably me. I copied the original message subject into a new
> > email.
> > > Perhaps that is not enough to link them.
> > >
> > > It was not my understanding from reading KIP-939 that we are doing away
> > > with any transactional timeout in the Kafka broker. As I understand it,
> > we
> > > are allowing the application to set the transaction timeout to a value
> > > that exceeds the *transaction.max.timeout.ms* setting on the broker, and
> > > having no timeout if the application does not set
> > > *transaction.timeout.ms* on the producer. The KIP says that the
> > > semantics of *transaction.timeout.ms* are not being changed, so I take
> > > that to mean that the broker will continue to enforce a timeout if
> > > provided, and abort transactions that exceed it. From the KIP:
> > >
> > > Client Configuration Changes
> > >
> > > *transaction.two.phase.commit.enable* The default would be ‘false’.  If
> > > set to ‘true’, then the broker is informed that the client is
> > > participating in two phase commit protocol and can set transaction
> > > timeout to values that exceed *transaction.max.timeout.ms* setting on
> > > the broker (if the timeout is not set explicitly on the client and the
> > > two phase commit is set to ‘true’ then the transaction never expires).
> > >
> > > *transaction.timeout.ms* The semantics is not changed, but it can be
> > > set to values that exceed *transaction.max.timeout.ms* if
> > > two.phase.commit.enable is set to ‘true’.
> > >
> > >
> > > Thinking about this more I believe we would also have a possible race
> > > condition if the broker is unaware that a transaction has been
> prepared.
> > > The application might call prepare and get a positive response, but the
> > > broker might have already aborted the transaction for exceeding the
> > > timeout. It is a general rule of 2PC that once a transaction has been
> > > prepared it must be possible for it to be committed or aborted. It
> seems
> > in
> > > this case a prepared transaction might already be aborted by the
> broker,
> > so
> > > it would be impossible to commit.
> > >
> > > I hope this is making sense and I am not misunderstanding the KIP.
> Please
> > > let me know if I am.
> > >
> > > - Rowland
> > >
> > >
> > > On Thu, Jan 4, 2024 at 12:56 PM Justine Olshan
> > > <jols...@confluent.io.invalid>
> > > wrote:
> > >
> > > > Hey Rowland,
> > > >
> > > > Not sure why this message showed up in a different thread from the
> > other
> > > > KIP-939 discussion (is it just me?)
> > > >
> > > > In KIP-939, we do away with having any transactional timeout on the
> > Kafka
> > > > side. The external coordinator is fully responsible for controlling
> > > whether
> > > > the transaction completes.
> > > >
> > > > While I think there is some use in having a prepare stage, I just
> > wanted
> > > to
> > > > clarify what the current KIP is proposing.
> > > >
> > > > Thanks,
> > > > Justine
> > > >
> > > > On Wed, Jan 3, 2024 at 7:49 PM Rowland Smith <rowl...@gmail.com>
> > wrote:
> > > >
> > > > > Hi Artem,
> > > > >
> > > > > I saw your response in the thread I started discussing Kafka
> > > distributed
> > > > > transaction support and the XA interface. I would like to work with
> > you
> > > > to
> > > > > add XA support to Kafka on top of the excellent foundational work
> > that
> > > > you
> > > > > have started with KIP-939. I agree that explicit XA support should
> > not
> > > be
> > > > > included in the Kafka codebase as long as the right set of basic
> > > > operations
> > > > > are provided. I will begin pulling together a KIP to follow
> KIP-939.
> > > > >
> > > > > I did have one comment on KIP-939 itself. I see that you considered
> > an
> > > > > explicit "prepare" RPC, but decided not to add it. If I understand
> > your
> > > > > design correctly, that would mean that a 2PC transaction would
> have a
> > > > > single timeout that would need to be long enough to ensure that
> > > prepared
> > > > > transactions are not aborted when an external coordinator fails.
> > > However,
> > > > > this also means that an unprepared transaction would not be aborted
> > > > without
> > > > > waiting for the same timeout. Since long running transactions block
> > > > > transactional consumers, having a long timeout for all transactions
> > > could
> > > > > be disruptive. An explicit "prepare" RPC would allow the server to
> > > abort
> > > > > unprepared transactions after a relatively short timeout, and
> apply a
> > > > much
> > > > > longer timeout only to prepared transactions. The explicit
> "prepare"
> > > RPC
> > > > > would make Kafka server more resilient to client failure at the
> cost
> > of
> > > > an
> > > > > extra synchronous RPC call. I think it's worth reconsidering this.
> > > > >
> > > > > With an XA implementation this might become a more significant
> issue
> > > > since
> > > > > the transaction coordinator has no memory of unprepared
> transactions
> > > > across
> > > > > restarts. Such transactions would need to be cleared by hand
> through
> > > the
> > > > > admin client even when the transaction coordinator restarts
> > > successfully.
> > > > >
> > > > > - Rowland
> > > > >
> > > >
> > >
> >
>
>
> --
> *Rowland E. Smith*
> P: (862) 260-4163
> M: (201) 396-3842
>
