Vincent Jiang created KAFKA-14347:
-------------------------------------

             Summary: deleted records may be kept unexpectedly when leader changes while adding a new replica
                 Key: KAFKA-14347
                 URL: https://issues.apache.org/jira/browse/KAFKA-14347
             Project: Kafka
          Issue Type: Improvement
            Reporter: Vincent Jiang

Consider a compacted topic in which a regular record k1=v1 is deleted by a later tombstone record k1=null. Suppose that log compaction has progressed differently on the three replicas r1, r2 and r3:
- On replica r1, log compaction has not yet cleaned k1=v1 or k1=null.
- On replica r2, log compaction has cleaned and removed both k1=v1 and k1=null.

In this case, the following sequence can cause record k1=v1 to be kept unexpectedly:
1. Replica r3 is reassigned to a different node and starts replicating data from the leader.
2. Initially, r1 is the leader, so r3 replicates record k1=v1 from r1.
3. Before k1=null is replicated from r1, leadership changes to r2.
4. r3 replicates data from r2. Because the k1=null record has already been cleaned on r2, it is not replicated.

As a result, r3 has record k1=v1 but not k1=null.
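
The sequence described in this issue can be reproduced in a small toy model. The sketch below is not Kafka code: the names Record, fetch, live_view and simulate are hypothetical, each replica log is modeled as a plain list of records, and an extra record k2=v2 at offset 2 is an assumption added only to show that replication otherwise continues normally after the leader change.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Record:
    offset: int
    key: str
    value: Optional[str]  # None models a tombstone (k1=null)


def fetch(leader_log: List[Record], from_offset: int,
          max_records: Optional[int] = None) -> List[Record]:
    """Return the leader's records at or after from_offset (toy follower fetch)."""
    batch = [r for r in leader_log if r.offset >= from_offset]
    return batch if max_records is None else batch[:max_records]


def live_view(log: List[Record]) -> Dict[str, str]:
    """Materialize key -> value in offset order; a tombstone removes the key."""
    view: Dict[str, str] = {}
    for r in sorted(log, key=lambda rec: rec.offset):
        if r.value is None:
            view.pop(r.key, None)
        else:
            view[r.key] = r.value
    return view


def simulate():
    # r1: compaction has not cleaned k1=v1 or k1=null yet.
    r1 = [Record(0, "k1", "v1"), Record(1, "k1", None), Record(2, "k2", "v2")]
    # r2: compaction already removed both k1=v1 and k1=null.
    r2 = [Record(2, "k2", "v2")]
    # r3: newly reassigned replica, starts empty (step 1).
    r3: List[Record] = []

    # Step 2: r1 is the leader; r3 fetches only k1=v1 before the leader changes.
    r3.extend(fetch(r1, from_offset=0, max_records=1))
    next_offset = r3[-1].offset + 1

    # Steps 3 and 4: leadership moves to r2; r3 resumes from its next offset.
    # r2 no longer holds the tombstone at offset 1, so it is never replicated.
    r3.extend(fetch(r2, from_offset=next_offset))
    return r1, r2, r3


if __name__ == "__main__":
    r1, r2, r3 = simulate()
    for name, log in (("r1", r1), ("r2", r2), ("r3", r3)):
        print(name, live_view(log))
```

In this model, r1 and r2 agree that k1 is deleted, while r3 still exposes k1=v1 because it never received the tombstone.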