Hi Jun,

In the case you describe, we would need to have a delayed EndTxn request,
then a successful EndTxn and a successful AddPartitionsToTxn, and then have
the delayed EndTxn request go through for a given producer.
I'm trying to figure out whether it is possible for the client to
transition when a previous request is delayed somewhere. But yes, in this
case I think we would fence the client.
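
To spell out the ordering in that scenario:
1. The producer's EndTxn request is delayed somewhere.
2. A subsequent EndTxn succeeds and the transaction completes.
3. An AddPartitionsToTxn succeeds and the next transaction becomes ongoing.
4. The delayed EndTxn from step 1 finally reaches the coordinator, and we
return the fenced error for it.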

Not for the overflow case. In the overflow case, the producer ID and the
epoch are different on the marker and on the new transaction. So we want
the marker to use the max epoch, but the new transaction should start with
the new producer ID and epoch 0 in the transactional state.
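
To make that concrete, here is a rough sketch with made-up IDs (a
standalone illustration, not the actual coordinator code) of how the
marker and the new transactional state diverge in the overflow case:

// Illustrative sketch only: made-up IDs and simplified logic.
// On epoch overflow the marker keeps the old producer ID with the max
// (bumped) epoch, while the transactional state moves to a newly
// generated producer ID at epoch 0.
public class EpochOverflowSketch {
    public static void main(String[] args) {
        long producerId = 1000L;              // current producer ID (illustrative)
        short epoch = Short.MAX_VALUE - 1;    // 32766 is the overflow trigger
        long newProducerId = 1001L;           // hypothetical newly generated ID

        boolean overflow = (epoch == Short.MAX_VALUE - 1);
        short markerEpoch = (short) (epoch + 1); // the marker always uses epoch + 1 (here, max)

        // Marker written to the data partitions: old producer ID, max epoch.
        System.out.printf("marker: producerId=%d epoch=%d%n", producerId, markerEpoch);

        // Transactional state once the commit/abort completes: new ID, epoch 0.
        long stateProducerId = overflow ? newProducerId : producerId;
        short stateEpoch = overflow ? (short) 0 : markerEpoch;
        System.out.printf("state:  producerId=%d epoch=%d%n", stateProducerId, stateEpoch);
    }
}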

In the server downgrade case, we want the coordinator to see the new
producer ID, as that is what the client will have. If we complete the
commit and the transaction state is then reloaded, we need the new
producer ID in the state so there isn't an invalid producer ID mapping.
The server downgrade cases are about preserving transactional correctness
and not regressing from previous behavior -- they are not concerned with
the extra safety against fencing retries (since we have downgraded, we
don't need to support that). Perhaps this is a trade-off, but I think it
is the right one.
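
Roughly, for the overflow case with made-up IDs (the KIP has the
authoritative field layout):
- Prepare state: ProducerId=1000 and ProducerEpoch=32767 (max) in the
non-tagged fields, plus NextProducerId=1001 in a tagged field.
- Complete state: ProducerId=1001 and ProducerEpoch=0 in the non-tagged
fields, plus PrevProducerId=1000 in a tagged field.
A downgraded coordinator only sees the non-tagged fields, but in the
complete state those already point to the producer ID the client is now
using, so the mapping stays valid.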

(If the client downgrades, it will have restarted and it is ok for it to
have a new producer ID too).

Justine

On Fri, Jan 12, 2024 at 11:42 AM Jun Rao <j...@confluent.io.invalid> wrote:

> Hi, Justine,
>
> Thanks for the reply.
>
> 101.4 "If the marker is written by the new client, we can as I mentioned in
> the last email guarantee that any EndTxn requests with the same epoch are
> from the same producer and the same transaction. Then we don't have to
> return a fenced error but can handle gracefully as described in the KIP."
> When a delayed EndTxn request is processed, the txn state could be ongoing
> for the next txn. I guess in this case we still return the fenced error for
> the delayed request?
>
> 102. Sorry, my question was inaccurate. What you described is accurate.
> "The downgrade compatibility I mention is that we keep the same producer ID
> and epoch in the main (non-tagged) fields as we did before the code on the
> server side." If we want to do this, it seems that we should use the
> current producer ID and max epoch in the existing producerId and
> producerEpoch fields for both the prepare and the complete marker, right?
> The downgrade can happen after the complete marker is written. With what
> you described, the downgraded coordinator will see the new producer ID
> instead of the old one.
>
> Jun
>
> On Fri, Jan 12, 2024 at 10:44 AM Justine Olshan
> <jols...@confluent.io.invalid> wrote:
>
> > Hi Jun,
> >
> > I can update the description.
> >
> > I believe your second point is mentioned in the KIP. I can add more text
> on
> > this if it is helpful.
> > > The delayed message case can also violate EOS if the delayed message
> > comes in after the next addPartitionsToTxn request comes in. Effectively
> we
> > may see a message from a previous (aborted) transaction become part of
> the
> > next transaction.
> >
> > If the marker is written by the new client, we can as I mentioned in the
> > last email guarantee that any EndTxn requests with the same epoch are
> from
> > the same producer and the same transaction. Then we don't have to return
> a
> > fenced error but can handle gracefully as described in the KIP.
> > I don't think a boolean is useful since it is directly encoded by
> > whether or not the tagged field is written.
> > In the prepare marker we will have the same producer ID in the non-tagged
> > field. In the Complete state we may not.
> > I'm not sure why the ongoing state matters for this KIP. It does matter
> for
> > KIP-939.
> >
> > I'm not sure what you are referring to about writing the previous
> producer
> > ID in the prepare marker. This is not in the KIP.
> > In the overflow case, we write the nextProducerId in the prepare state.
> > This is so we know what we assigned when we reload the transaction log.
> > Once we complete, we transition this ID to the main (non-tagged) field
> and
> > have the previous producer ID field filled in. This is so we can identify
> > in a retry case the operation completed successfully and we don't fence
> our
> > producer. The downgrade compatibility I mention is that we keep the same
> > producer ID and epoch in the main (non-tagged) fields as we did before
> the
> > code on the server side. If the server downgrades, we are still
> compatible.
> > This addresses both the prepare and complete state downgrades.
> >
> > Justine
> >
> > On Fri, Jan 12, 2024 at 10:21 AM Jun Rao <j...@confluent.io.invalid>
> wrote:
> >
> > > Hi, Justine,
> > >
> > > Thanks for the reply. Sorry for the delay. I have a few more comments.
> > >
> > > 110. I think the motivation section could be improved. One of the
> > > motivations listed by the KIP is "This can happen when a message gets
> > stuck
> > > or delayed due to networking issues or a network partition, the
> > transaction
> > > aborts, and then the delayed message finally comes in.". This seems not
> > > very accurate. Without KIP-890, currently, if the coordinator times out
> > and
> > > aborts an ongoing transaction, it already bumps up the epoch in the
> > marker,
> > > which prevents the delayed produce message from being added to the user
> > > partition. What can cause a hanging transaction is that the producer
> > > completes (either aborts or commits) a transaction before receiving a
> > > successful ack on messages published in the same txn. In this case,
> it's
> > > possible for the delayed message to be appended to the partition after
> > the
> > > marker, causing a transaction to hang.
> > >
> > > A similar issue (not mentioned in the motivation) could happen on the
> > > marker in the coordinator's log. For example, it's possible for an
> > > EndTxnRequest to be delayed on the coordinator. By the time the delayed
> > > EndTxnRequest is processed, it's possible that the previous txn has
> > already
> > > completed and a new txn has started. Currently, since the epoch is not
> > > bumped on every txn, the delayed EndTxnRequest will add an unexpected
> > > prepare marker (and eventually a complete marker) to the ongoing txn.
> > This
> > > won't cause the transaction to hang, but it will break the EoS
> semantic.
> > > The proposal in this KIP will address this issue too.
> > >
> > > 101. "However, I was writing it so that we can distinguish between
> > > old clients where we don't have the ability to do this operation and new
> > > clients that can. (Old clients don't bump the epoch on commit, so we
> > can't
> > > say for sure the write belongs to the given transaction)."
> > > 101.1 I am wondering why we need to distinguish whether the marker is
> > > written by the old or the new client. Could you describe what we do
> > > differently if we know the marker is written by the new client?
> > > 101.2 If we do need a way to distinguish whether the marker is written
> > > by the old or the new client, would it be simpler to just introduce a
> > > boolean field instead of indirectly through the previous producer ID
> > > field?
> > > 101.3 It's not clear to me why we only add the previous producer ID
> field
> > in
> > > the complete marker, but not in the prepare marker. If we want to know
> > > whether a marker is written by the new client or not, it seems that we
> > want
> > > to do this consistently for all markers.
> > > 101.4 What about the TransactionLogValue record representing the
> ongoing
> > > state? Should we also distinguish whether it's written by the old or
> the
> > > new client?
> > >
> > > 102. In the overflow case, it's still not clear to me why we write the
> > > previous producer ID in the prepare marker while writing the next
> > > producer ID in the complete marker. You mentioned that it's for
> > > downgrading. However, we could downgrade with either the prepare marker
> > > or the complete marker. In either case, the downgraded coordinator
> > > should see the same producer ID (probably the previous producer ID),
> > > right?
> > >
> > > Jun
> > >
> > > On Wed, Dec 20, 2023 at 6:00 PM Justine Olshan
> > > <jols...@confluent.io.invalid>
> > > wrote:
> > >
> > > > Hey Jun,
> > > >
> > > > Thanks for taking a look at the KIP again.
> > > >
> > > > 100. For the epoch overflow case, only the marker will have max
> epoch.
> > > This
> > > > keeps the behavior of the rest of the markers where the last marker
> is
> > > the
> > > > epoch of the transaction records + 1.
> > > >
> > > > 101. You are correct that we don't need to write the producer ID
> since
> > it
> > > > is the same. However, I was writing it so that we can distinguish
> > between
> > > > old clients where we don't have the ability to do this operation and new
> > > > clients that can. (Old clients don't bump the epoch on commit, so we
> > > can't
> > > > say for sure the write belongs to the given transaction). If we
> receive
> > > an
> > > > EndTxn request from a new client, we will fill this field. We can
> > > guarantee
> > > > that any EndTxn requests with the same epoch are from the same
> producer
> > > and
> > > > the same transaction.
> > > >
> > > > 102. In prepare phase, we have the same producer ID and epoch we
> always
> > > > had. It is the producer ID and epoch that are on the marker. In
> commit
> > > > phase, we stay the same unless it is the overflow case. In that case,
> > we
> > > > set the producer ID to the new one we generated and epoch to 0 after
> > > > complete. This is for downgrade compatibility. The tagged fields are
> > just
> > > > safety guards for retries and failovers.
> > > >
> > > > In prepare phase for epoch overflow case only we store the next
> > producer
> > > > ID. This is for the case where we reload the transaction coordinator
> in
> > > > prepare state. Once the transaction is committed, we can use the
> > producer
> > > > ID the client already is using.
> > > >
> > > > In commit phase, we store the previous producer ID in case of
> retries.
> > > >
> > > > I think it is easier to think of it as just how we were storing
> > producer
> > > ID
> > > > and epoch before, with some extra bookkeeping and edge case handling
> in
> > > the
> > > > tagged fields. We have to do it this way for compatibility with
> > > downgrades.
> > > >
> > > > 103. Next producer ID is for prepare status and previous producer ID
> is
> > > for
> > > > after complete. The reason why we need two separate (tagged) fields
> is
> > > for
> > > > backwards compatibility. We need to keep the same semantics for the
> > > > non-tagged field in case we downgrade.
> > > >
> > > > 104. We set the fields as we do in the transactional state (as we
> need
> > to
> > > > do this for compatibility -- if we downgrade, we will only have the
> > > > non-tagged fields). It will be the old producer ID and max epoch.
> > > >
> > > > Hope this helps. Let me know if you have further questions.
> > > >
> > > > Justine
> > > >
> > > > On Wed, Dec 20, 2023 at 3:33 PM Jun Rao <j...@confluent.io.invalid>
> > > wrote:
> > > >
> > > > > Hi, Justine,
> > > > >
> > > > > It seems that you have made some changes to KIP-890 since the vote.
> > In
> > > > > particular, we are changing the format of TransactionLogValue. A
> few
> > > > > comments related to that.
> > > > >
> > > > > 100. Just to be clear. The overflow case (i.e. when a new
> producerId
> > is
> > > > > generated) is when the current epoch equals max - 1 and not max?
> > > > >
> > > > > 101. For the "not epoch overflow" case, we write the previous ID in
> > the
> > > > > tagged field in the complete phase. Do we need to do that since the
> > > > > producer ID doesn't change in this case?
> > > > >
> > > > > 102. It seems that the meaning for the ProducerId/ProducerEpoch
> > fields
> > > in
> > > > > TransactionLogValue changes depending on the TransactionStatus.
> When
> > > > > the TransactionStatus is ongoing, they represent the current
> > ProducerId
> > > > and
> > > > > the current ProducerEpoch. When the TransactionStatus is
> > > > > PrepareCommit/PrepareAbort, they represent the current ProducerId
> and
> > > the
> > > > > next ProducerEpoch. When the TransactionStatus is Commit/Abort,
> they
> > > > > further depend on whether the epoch overflows or not. If there is
> no
> > > > > overflow, they represent the current ProducerId and the next
> > > > ProducerEpoch
> > > > > (max). Otherwise, they represent the newly generated ProducerId
> and a
> > > > > ProducerEpoch of 0. Is that right? This seems not easy to
> understand.
> > > > Could
> > > > > we provide some examples like what Artem has done in KIP-939? Have
> we
> > > > > considered a simpler design where ProducerId/ProducerEpoch always
> > > > represent
> > > > > the same value (e.g. for the current transaction) independent of
> the
> > > > > TransactionStatus and epoch overflow?
> > > > >
> > > > > 103. It's not clear to me why we need 3 fields: ProducerId,
> > > > PrevProducerId,
> > > > > NextProducerId. Could we just have ProducerId and NextProducerId?
> > > > >
> > > > > 104. For WriteTxnMarkerRequests, if the producer epoch overflows,
> > what
> > > do
> > > > > we set the producerId and the producerEpoch?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > >
> > > >
> > > >
> > >
> >
>
