Hi,
Rollback does result in gaps in the offsets that a read-committed consumer 
sees. Log compaction can also result in gaps in the offsets.
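
As a rough illustration of why offset arithmetic misleads here, this sketch (plain Java, invented offsets, no broker involved) shows a naive "remaining records" calculation overstating what a read-committed consumer can actually fetch:

```java
public class OffsetGapDemo {
    // Offsets delivered to a hypothetical read-committed consumer; the gaps
    // (2, 3, 6, 7) stand in for aborted messages and control records.
    static final long[] CONSUMED = {0, 1, 4, 5};

    // Naive lag estimate: assumes every offset below endOffset is a record.
    static long naiveRemaining(long endOffset, long lastConsumed) {
        return endOffset - lastConsumed - 1;
    }

    public static void main(String[] args) {
        long endOffset = 8;   // log end offset, including markers/aborted data
        long last = CONSUMED[CONSUMED.length - 1];
        // Reports 2 "remaining" records even if offsets 6 and 7 are markers
        // and nothing more is actually consumable.
        System.out.println(naiveRemaining(endOffset, last));
    }
}
```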

I am not aware of any way to force the “cleanup” of transactions.

Thanks,
Andrew

> On 29 Jun 2023, at 09:08, Henry GALVEZ <henry.gal...@intm.fr> wrote:
>
> Hi Andrew,
>
> Yes, I have been using the requireStable option in the consumer group, but I 
> consistently encounter the same issue.
>
> If I understand Kafka's logic correctly, it is essential for Kafka to retain
> those offsets even in the case of a rollback. This is why relying on offset
> arithmetic in application logic is not reliable.
>
> I need to explain this to my colleagues at work and would like to provide
> some context to the Spring Kafka developers.
>
> Do you think the scenario of a rollback is similar to the effects of Log 
> Compaction?
> https://kafka.apache.org/documentation/#design_compactionbasics
>
> Furthermore, is there a way to force the cleanup of transactions? Could this 
> potentially help address the issue?
>
> Cordially,
> Henry
>
> ________________________________
> From: Andrew Schofield <andrew_schofield_j...@outlook.com>
> Sent: Wednesday, 28 June 2023 14:54
> To: dev@kafka.apache.org <dev@kafka.apache.org>
> Subject: Re: Offsets: consumption and production in rollback
>
> Hi Henry,
> Consumers get to choose an isolation level. There’s one instance I can think 
> of where AdminClient also has
> some ability to let users choose how to deal with uncommitted data. If you’ve 
> not seen KIP-851 take a look:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-851%3A+Add+requireStable+flag+into+ListConsumerGroupOffsetsOptions
> By your question, I expect you have seen it.
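>
> For reference, a minimal sketch of using that flag (group id and bootstrap
> address are placeholders; error handling omitted):
>
> ```java
> import java.util.Map;
> import java.util.Properties;
> import org.apache.kafka.clients.admin.Admin;
> import org.apache.kafka.clients.admin.ListConsumerGroupOffsetsOptions;
> import org.apache.kafka.clients.consumer.OffsetAndMetadata;
> import org.apache.kafka.common.TopicPartition;
>
> public class RequireStableExample {
>     public static void main(String[] args) throws Exception {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "localhost:9092"); // placeholder
>         try (Admin admin = Admin.create(props)) {
>             // KIP-851: only return offsets that are not pending a transaction
>             ListConsumerGroupOffsetsOptions options =
>                 new ListConsumerGroupOffsetsOptions().requireStable(true);
>             Map<TopicPartition, OffsetAndMetadata> offsets =
>                 admin.listConsumerGroupOffsets("my-group", options)
>                      .partitionsToOffsetAndMetadata()
>                      .get();
>             offsets.forEach((tp, om) ->
>                 System.out.println(tp + " -> " + om.offset()));
>         }
>     }
> }
> ```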
>
> Control records are kind of invisibly lurking like dark matter. The approach 
> I’ve taken with this kind of thing
> is to cope with the fact that the offsets of the real records are increasing 
> but not necessarily consecutive.
> If I’m using READ_COMMITTED isolation level, there are also gaps when 
> transactions roll back.
> I design my consumers so they are not surprised by the gaps and they don’t 
> try to calculate the number
> of records. Instead, they just continually consume.
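>
> That pattern in code, roughly (topic, group, and servers are placeholders):
>
> ```java
> import java.time.Duration;
> import java.util.List;
> import java.util.Properties;
> import org.apache.kafka.clients.consumer.ConsumerRecord;
> import org.apache.kafka.clients.consumer.ConsumerRecords;
> import org.apache.kafka.clients.consumer.KafkaConsumer;
>
> public class GapTolerantConsumer {
>     public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "localhost:9092"); // placeholder
>         props.put("group.id", "my-group");                // placeholder
>         props.put("isolation.level", "read_committed");   // skip aborted data
>         props.put("key.deserializer",
>             "org.apache.kafka.common.serialization.StringDeserializer");
>         props.put("value.deserializer",
>             "org.apache.kafka.common.serialization.StringDeserializer");
>         try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
>             consumer.subscribe(List.of("my-topic")); // placeholder
>             while (true) {
>                 ConsumerRecords<String, String> records =
>                     consumer.poll(Duration.ofMillis(500));
>                 for (ConsumerRecord<String, String> r : records) {
>                     // Offsets increase but are not consecutive; never derive
>                     // record counts from offset differences.
>                     System.out.printf("offset=%d value=%s%n", r.offset(), r.value());
>                 }
>             }
>         }
>     }
> }
> ```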
>
> Hope this helps,
> Andrew
>
>> On 28 Jun 2023, at 09:28, Henry GALVEZ <henry.gal...@intm.fr> wrote:
>>
>> Hi Andrew,
>>
>> Thank you for your response.
>>
>> I understand your explanation, but in both cases I am using an "isolation
>> level" of READ_COMMITTED. I believe the issue lies in the
>> AdminClient.listOffsets method, as it may not be considering the isolation
>> level, whereas AdminClient.listConsumerGroupOffsets does consider it.
>>
>> What are your thoughts on this?
>>
>> Additionally, would it be more suitable to implement a solution that reverts 
>> the offsets in case of transaction rollbacks? It's possible that there's a 
>> logic aspect I'm not fully grasping.
>>
>> Perhaps I need to utilize the internal control records and their offsets. 
>> Could you point me in the right direction for their documentation?
>>
>> Thank you,
>> Henry
>>
>> ________________________________
>> From: Andrew Schofield <andrew_schofield_j...@outlook.com>
>> Sent: Tuesday, 27 June 2023 13:22
>> To: dev@kafka.apache.org <dev@kafka.apache.org>
>> Subject: Fwd: Offsets: consumption and production in rollback
>>
>> Hi Henry,
>> Thanks for your message.
>>
>> Kafka transactions are a bit unusual. If you produce a message inside a 
>> transaction, it is assigned an offset on a topic-partition before
>> the transaction even commits. That offset is not “revoked” if the 
>> transaction rolls back.
>>
>> This is why the consumer has the concept of “isolation level”. It 
>> essentially controls whether the consumer can “see” the
>> uncommitted or even rolled back messages.
>>
>> A consumer using the read_committed isolation level only consumes committed
>> messages, but the offsets that it observes do
>> reflect the uncommitted messages. So, if you observe the progress of the 
>> offsets of the records consumed, you see that they
>> skip the messages that were produced but then rolled back. There are also 
>> invisible control records that are used to achieve
>> transactional behaviour, and those also have offsets.
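>>
>> To make that concrete, here is a schematic (invented) log layout for one
>> committed and one rolled-back single-message transaction:
>>
>> ```java
>> public class OffsetAccounting {
>>     // true = visible to a read_committed consumer; false = control record
>>     // or aborted data. The layout is illustrative, not broker output.
>>     static final boolean[] LOG = {
>>         true,   // offset 0: data, txn 1 (committed)
>>         false,  // offset 1: commit marker for txn 1
>>         false,  // offset 2: data, txn 2 (rolled back)
>>         false,  // offset 3: abort marker for txn 2
>>     };
>>
>>     static long consumable(boolean[] log) {
>>         long n = 0;
>>         for (boolean visible : log) if (visible) n++;
>>         return n;
>>     }
>>
>>     public static void main(String[] args) {
>>         // Log end offset is 4, yet only 1 record is consumable.
>>         System.out.println(LOG.length + " offsets, " + consumable(LOG) + " record");
>>     }
>> }
>> ```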
>>
>> I’m not sure that this is really “bogus lag” but, when you’re using 
>> transactions, there’s not a one-to-one relationship
>> between offsets and consumable records.
>>
>> Hope this helps,
>> Andrew
>>
>> Begin forwarded message:
>>
>> From: Henry GALVEZ <henry.gal...@intm.fr>
>> Subject: Offsets: consumption and production in rollback
>> Date: 27 June 2023 at 10:48:31 BST
>> To: "us...@kafka.apache.org" <us...@kafka.apache.org>, 
>> "dev@kafka.apache.org" <dev@kafka.apache.org>
>> Reply-To: dev@kafka.apache.org
>>
>> I have some doubts regarding message consumption and production, as well as 
>> transactional capabilities. I am using a KafkaTemplate to produce a message 
>> within a transaction. After that, I execute another transaction that 
>> produces a message and intentionally throws a runtime exception to simulate 
>> a transaction rollback.
>>
>> Next, I use the Kafka AdminClient to retrieve the latest offset for the 
>> topic partition and the consumer group's offsets for the same topic 
>> partition. However, when I compare the offset numbers, I notice a 
>> difference. In this example, the consumer group's offset is 4, while the
>> topic reports only 2.
>>
>> I have come across references to this issue in a Spring Kafka report,
>> specifically Kafka-10683, where developers describe it as either "bogus" or
>> "pseudo" lag.
>>
>> I am keen on resolving this problem, and I would greatly appreciate hearing 
>> about your experiences and knowledge regarding this matter.
>>
>> Thank you very much
>> Henry
>>
>
