void-ptr974 opened a new issue, #25861:
URL: https://github.com/apache/pulsar/issues/25861

   ### Problem
   
   Persistent geo-replication V2 deduplication uses the source topic position 
as the target-side dedup watermark. For each replicated message, the source 
replicator adds the message property 
`__MSG_PROP_REPL_SOURCE_POSITION=<source-ledger-id>:<source-entry-id>`, and the 
target broker stores the latest replicated source position under two keys:
   
   - `<replicator-producer>_LID`
   - `<replicator-producer>_EID`
   
   This state is not normal producer sequence state. It is the target-side 
checkpoint used to identify whether a replayed source entry has already been 
persisted.
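
As an illustration of the mapping described above, here is a minimal Python sketch (a model, not broker code). The helper names `parse_source_position` and `watermark_keys` and the producer name used in the test are hypothetical. Only the `<source-ledger-id>:<source-entry-id>` value format and the `_LID`/`_EID` key suffixes come from this report.

```python
from typing import Dict, Tuple


def parse_source_position(value: str) -> Tuple[int, int]:
    """Split '<source-ledger-id>:<source-entry-id>' into (ledger_id, entry_id)."""
    ledger_id, entry_id = value.split(":")
    return int(ledger_id), int(entry_id)


def watermark_keys(replicator_producer: str, position: Tuple[int, int]) -> Dict[str, int]:
    """Build the two watermark entries for one replicator producer.

    The key layout '<replicator-producer>_LID' / '<replicator-producer>_EID'
    follows the issue text; the function itself is a hypothetical helper.
    """
    ledger_id, entry_id = position
    return {
        f"{replicator_producer}_LID": ledger_id,
        f"{replicator_producer}_EID": entry_id,
    }
```

Because one source position is spread over two keys, neither key is meaningful without the other.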
   
   There are three related issues:
   
   1. The geo V2 watermark is not rebuilt from target entries when the dedup cursor is replayed.
   2. The geo V2 watermark is stored as two separate snapshot keys, but `_LID` 
and `_EID` must be restored together.
   3. The geo V2 watermark can be removed by normal producer inactivity 
cleanup, even though it is source-position state rather than producer lifecycle 
state.
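
The second issue can be illustrated with a small Python sketch of pairwise restoration. It is a model under stated assumptions, not the broker's implementation: `restore_pairs` and `is_duplicate` are hypothetical names, the snapshot is modeled as a flat key/value map, and dropping a pair whose other half is missing is an assumption of this sketch. The comparison treats a position as an ordered `(ledger_id, entry_id)` pair, which is why the two values cannot be restored independently.

```python
from typing import Dict, Tuple

Position = Tuple[int, int]  # (ledger_id, entry_id)


def restore_pairs(snapshot: Dict[str, int]) -> Dict[str, Position]:
    """Restore watermarks only for producers that have both _LID and _EID."""
    restored: Dict[str, Position] = {}
    for key, ledger_id in snapshot.items():
        if not key.endswith("_LID"):
            continue
        producer = key[: -len("_LID")]
        eid_key = producer + "_EID"
        if eid_key in snapshot:
            # Both halves are present, so the position is complete.
            restored[producer] = (ledger_id, snapshot[eid_key])
        # Assumption: an incomplete pair is ignored rather than guessed.
    return restored


def is_duplicate(watermark: Position, source_position: Position) -> bool:
    """A replayed entry is a duplicate if it is at or before the watermark."""
    return source_position <= watermark
```

With a watermark of `(5, 9)`, an entry at `(4, 100)` is a duplicate even though its entry ID is larger, because the ledger ID is compared first.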
   
   ### Failure window
   
   1. Source replicator sends messages to the target.
   2. Target persists the messages and updates the in-memory geo V2 dedup 
watermark.
   3. The target topic is unloaded before the latest dedup snapshot includes 
the watermark.
   4. The source replicator has not durably advanced its replication cursor yet.
   5. Source reconnects and replays the same source entries.
   6. Target reloads the topic without the geo watermark, so it can accept the replayed entries and write duplicates.
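
The steps above can be reproduced in a toy Python model. This is a sketch only: `TargetTopic`, `unload_and_reload`, and `simulate_failure_window` are hypothetical names, the topic log is a plain list of source positions, and a single watermark stands in for the per-replicator state.

```python
from typing import List, Optional, Tuple

Position = Tuple[int, int]  # (source_ledger_id, source_entry_id)


class TargetTopic:
    """Toy model of a target topic with a snapshot-only dedup watermark."""

    def __init__(self, log: Optional[List[Position]] = None,
                 snapshot: Optional[Position] = None):
        self.log: List[Position] = list(log or [])
        self.snapshot = snapshot   # last durable dedup snapshot (may lag)
        self.watermark = snapshot  # in-memory state, restored from snapshot only

    def publish(self, source_position: Position) -> bool:
        """Persist the entry unless it is at or before the watermark."""
        if self.watermark is not None and source_position <= self.watermark:
            return False
        self.log.append(source_position)
        self.watermark = source_position
        return True

    def unload_and_reload(self) -> "TargetTopic":
        """The log survives, but the watermark is rebuilt from the snapshot alone."""
        return TargetTopic(log=self.log, snapshot=self.snapshot)


def simulate_failure_window() -> List[Position]:
    entries = [(10, 0), (10, 1)]
    topic = TargetTopic()
    for position in entries:            # steps 1-2: replicate and persist
        topic.publish(position)
    reloaded = topic.unload_and_reload()  # steps 3 and 6: no snapshot was taken
    for position in entries:            # steps 4-5: source replays the same entries
        reloaded.publish(position)
    return reloaded.log
```

In this model the reloaded topic ends up with each source entry persisted twice, because nothing in the reload path reads the already-persisted entries.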
   
   ### Impact
   
   This can produce duplicate messages on the target cluster during normal 
recovery/failover paths. It does not require client misuse or corrupted input. 
Source cursor replay is an expected recovery behavior, so target-side geo dedup 
state must survive topic reload and producer inactivity cleanup.
   
   ### Proposed fix
   
   - Recover geo-replication V2 dedup watermarks while replaying the dedup 
cursor by reading `replicatedFrom` and `__MSG_PROP_REPL_SOURCE_POSITION` from 
persisted replicated messages.
   - Store geo-replication watermark keys as complete `_LID/_EID` pairs in 
dedup snapshots.
   - Keep geo-replication watermark state during producer inactivity cleanup.
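
A rough Python sketch of the first and third bullets follows. It is not the change in the linked pull request. Entries are modeled as plain dicts, the field names `producer_name`, `replicated_from`, and `properties` are simplifications chosen for this example, and identifying geo keys by their `_LID`/`_EID` suffix is an assumption; `recover_watermarks` and `purge_inactive` are hypothetical names.

```python
from typing import Dict, Iterable, Set, Tuple

SOURCE_POSITION_PROP = "__MSG_PROP_REPL_SOURCE_POSITION"
Position = Tuple[int, int]  # (source_ledger_id, source_entry_id)


def recover_watermarks(entries: Iterable[dict]) -> Dict[str, Position]:
    """Rebuild per-replicator watermarks from already-persisted entries."""
    watermarks: Dict[str, Position] = {}
    for entry in entries:
        raw = entry.get("properties", {}).get(SOURCE_POSITION_PROP)
        if not entry.get("replicated_from") or raw is None:
            continue  # not a replicated message, nothing to recover
        ledger_id, entry_id = (int(part) for part in raw.split(":"))
        position = (ledger_id, entry_id)
        producer = entry["producer_name"]
        # Keep the highest source position seen for each replicator producer.
        if producer not in watermarks or position > watermarks[producer]:
            watermarks[producer] = position
    return watermarks


def purge_inactive(state: Dict[str, int], inactive_producers: Set[str]) -> Dict[str, int]:
    """Drop sequence state of inactive producers but keep geo watermark keys."""
    kept: Dict[str, int] = {}
    for key, value in state.items():
        is_geo_key = key.endswith(("_LID", "_EID"))  # assumption: suffix match
        if not is_geo_key and key in inactive_producers:
            continue
        kept[key] = value
    return kept
```

Replaying the persisted entries makes recovery independent of when the last snapshot was taken, which is the property the failure window above depends on.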
   
   A proposed fix is available in #25860.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
