Hi,

The data we write to Kafka is transactional in nature, so I am trying to 
understand exactly-once semantics (EOS) better. My present understanding is as 
below:

It is mentioned that when a producer starts, it is assigned a new PID (producer 
ID), which is valid only for that session. Does that mean a single, 
uninterrupted producer session is a prerequisite for exactly-once guarantees? 
I presume it is not. As I understand it, this is where transactional.id comes 
into the picture: it is user-defined and can therefore survive producer 
restarts.

I have a few questions regarding the same:

1. If the above statement is correct, why do we need the PID in the first 
place, instead of using transactional.id everywhere?
2. I understand that the sequence number is generated by the producer and 
increases monotonically. Does that mean the sequence number also changes across 
producer restarts, along with the new PID?
3. Is the PID meant mainly for idempotence, whereas transactional.id is for 
transactional support? If so, will idempotence not be affected when the 
producer restarts?
4. On the consumer side, only one config parameter is defined, i.e. 
isolation.level. For EOS, I presume this must be set to ‘read_committed’ and 
never to ‘read_uncommitted’. Is that correct?
5. What is the impact of setting ‘enable.idempotence’ to true without setting 
‘transactional.id’ on the producer side? Does it have any side effects?
6. How does EOS work for compacted topics? Will the EOS behaviour be any 
different for compacted topics?
7. How does EOS work when the records of a single transaction are written 
across two different log segments?
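
To make questions 3 to 5 concrete, here is a minimal sketch of the settings I 
am referring to, written as plain Python dicts so that it does not depend on 
any particular client library. The keys (enable.idempotence, transactional.id, 
isolation.level, bootstrap.servers, group.id) are the config names from the 
Kafka documentation; the function names, the broker address, the 
transactional id value and the group id are placeholders I made up for 
illustration.

```python
# Hypothetical helpers: they only build config dicts and do not talk to Kafka.

def idempotent_producer_config(bootstrap_servers):
    """Producer with idempotence only, no transactional.id (question 5)."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "enable.idempotence": True,
    }


def transactional_producer_config(bootstrap_servers, transactional_id):
    """Producer with idempotence plus a user-defined transactional.id.

    The transactional.id is chosen by the application, so the same value
    can be supplied again after a producer restart.
    """
    config = idempotent_producer_config(bootstrap_servers)
    config["transactional.id"] = transactional_id
    return config


def consumer_config(bootstrap_servers, group_id,
                    isolation_level="read_committed"):
    """Consumer config with an explicit isolation.level (question 4)."""
    allowed = ("read_committed", "read_uncommitted")
    if isolation_level not in allowed:
        raise ValueError("isolation.level must be one of %s" % (allowed,))
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "isolation.level": isolation_level,
    }


if __name__ == "__main__":
    # Placeholder values, not taken from a real deployment.
    print(idempotent_producer_config("localhost:9092"))
    print(transactional_producer_config("localhost:9092", "orders-tx-1"))
    print(consumer_config("localhost:9092", "orders-readers"))
```

The first dict is the case in question 5, the second adds the transactional.id 
discussed above, and the third is the consumer setting from question 4.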

Can anyone please help me understand the nuances around EOS guarantees?

Thanks
Vijay
