Hi folks,

Awesome work you have been doing on this project!

I was hoping to get some help with an issue we are having in one of our
Kafka clusters. Most of the clients on this cluster use
exactly-once semantics. The cluster currently runs version 3.5.2 and
we were attempting an upgrade to 3.6.2. After replacing one of the brokers
with the new version, we saw many of the following errors on the older
brokers:

```
Received request api key ADD_PARTITIONS_TO_TXN with version 4 which is not
enabled
```

This manifested as 'NETWORK_EXCEPTION' errors on the clients and downtime
for those clients. On the new broker we saw:

```
[AddPartitionsToTxnSenderThread-1063]: AddPartitionsToTxnRequest failed for
node 1069 with a network exception.
```

Digging through the changes in 3.6, we came across some changes introduced
as part of KAFKA-14402 <https://issues.apache.org/jira/browse/KAFKA-14402> that
we thought might lead to this behaviour and wanted to confirm.

First, we could see that transaction.partition.verification.enable
is enabled by default and enables a new code path that culminates in the
upgraded broker sending version 4 ADD_PARTITIONS_TO_TXN requests to other
brokers here
<https://github.com/apache/kafka/blob/cb35ddc5ca233d5cca6f51c1c41b952a7e9fe1a0/core/src/main/scala/kafka/server/AddPartitionsToTxnManager.scala#L269>.

However, as far as we can tell, Kafka 3.5.2 does not support version 4 of
ADD_PARTITIONS_TO_TXN requests. If these assumptions are correct, does this
mean that an upgrade to 3.6+ requires
transaction.partition.verification.enable
to be set to false on the new brokers for the upgrade to succeed?
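
To be concrete, the workaround we have in mind is the broker setting below,
applied to the upgraded 3.6.x brokers for the duration of the rolling
upgrade. This is only our assumption of what would avoid the version 4
requests; we have not confirmed that it is the intended upgrade path:

```
# server.properties on the upgraded (3.6.x) brokers.
# Assumed workaround, not confirmed: disable the new verification path
# so that version 4 ADD_PARTITIONS_TO_TXN requests are not sent to 3.5.2 brokers.
transaction.partition.verification.enable=false
```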

Regards,
Johnson
