Ksnz commented on PR #24300: URL: https://github.com/apache/pulsar/pull/24300#issuecomment-3615698028
Okay, let’s go back to square one. We have this description:

> When you enable replicated subscriptions, you're creating a consistent distributed snapshot to establish an association between message ids from different clusters. The snapshots are taken periodically. The default value is `1 second`. It means that a consumer failing over to a different cluster can potentially receive 1 second of duplicates. You can also configure the frequency of the snapshot in the `broker.conf` file.

In my opinion, a one-second window is an acceptable limitation: even if a burst of messages occurs during that second, it is still **one second**.

The problem is that we evict old snapshots from the head of the queue (the oldest positions), so **slow consumers** can no longer move their `markDelete` position because the required snapshot is already gone.

**_Perhaps_** if we added **an option** to make the cache act as a buffer (skipping all new entries once it’s full), it would help slow subscriptions stay in sync. Moreover, it would be a **_generic approach_**. After the `markDelete` position advances successfully, the buffer would be cleared up to some point and refilled with new `ReplicatedSubscriptionsSnapshots`. A queue that is aligned to the `markDelete` position is more useful for synchronization than a queue that follows the tip of the topic.

A remaining issue is that a slow consumer can suddenly advance its `markDelete` position by a large amount, wiping the entire buffer. The next snapshots will then be captured far ahead of the consumer’s new `markDelete`, leaving a gap.

But we can combine the best of both worlds by adding `private final NavigableMap<Position, ReplicatedSubscriptionsSnapshot> tailSnapshots;` and `private final NavigableMap<Position, ReplicatedSubscriptionsSnapshot> headSnapshots;` to `ReplicatedSubscriptionSnapshotCache`.

It's not a perfect solution, but it's a lesser evil than rewriting everything, and it's still better than the current one.
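To make the two-map idea concrete, here is a minimal sketch. It is an illustration under assumptions, not the actual Pulsar code: positions are plain `long` values and snapshots are a generic type `S`, standing in for `Position` and `ReplicatedSubscriptionsSnapshot`. The class name `HeadTailSnapshotCache`, the two size limits, and the refill rule are hypothetical. "Head" is read here as the oldest entries (the ones nearest `markDelete`) and "tail" as the entries following the tip of the topic.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch of a snapshot cache split into a head buffer and a tail window. */
final class HeadTailSnapshotCache<S> {
    // Oldest snapshots, nearest to markDelete. New arrivals never evict them.
    private final NavigableMap<Long, S> headSnapshots = new TreeMap<>();
    // Newest snapshots, following the tip of the topic. Evicts its oldest entry when full.
    private final NavigableMap<Long, S> tailSnapshots = new TreeMap<>();
    private final int maxHead;
    private final int maxTail;

    HeadTailSnapshotCache(int maxHead, int maxTail) {
        this.maxHead = maxHead;
        this.maxTail = maxTail;
    }

    /** Adds a snapshot. Assumes positions arrive in strictly increasing order. */
    void addSnapshot(long position, S snapshot) {
        // Fill the head first, but only while the tail is empty, so that every
        // head position stays below every tail position.
        if (tailSnapshots.isEmpty() && headSnapshots.size() < maxHead) {
            headSnapshots.put(position, snapshot);
            return;
        }
        tailSnapshots.put(position, snapshot);
        if (tailSnapshots.size() > maxTail) {
            tailSnapshots.pollFirstEntry(); // drop the oldest tail entry only
        }
    }

    /**
     * Returns the newest cached snapshot at or before markDelete, or null if
     * there is none, and drops every snapshot at or before markDelete.
     */
    S advancedMarkDeletePosition(long markDelete) {
        // Tail positions are newer than head positions, so check the tail first.
        Map.Entry<Long, S> match = tailSnapshots.floorEntry(markDelete);
        if (match == null) {
            match = headSnapshots.floorEntry(markDelete);
        }
        headSnapshots.headMap(markDelete, true).clear();
        tailSnapshots.headMap(markDelete, true).clear();
        // Refill the freed head slots with the oldest remaining tail entries.
        while (headSnapshots.size() < maxHead && !tailSnapshots.isEmpty()) {
            Map.Entry<Long, S> oldest = tailSnapshots.pollFirstEntry();
            headSnapshots.put(oldest.getKey(), oldest.getValue());
        }
        return match == null ? null : match.getValue();
    }

    int headSize() {
        return headSnapshots.size();
    }

    int tailSize() {
        return tailSnapshots.size();
    }
}
```

With both limits set to 2 and snapshots added at positions 10 through 60, the head keeps 10 and 20 while the tail ends up holding 50 and 60. A slow consumer whose `markDelete` reaches 15 still finds the snapshot at 10, whereas a single four-entry queue that evicts its oldest entries would hold only 30 through 60 by then. A consumer that jumps far ahead is served from the tail, or from the last head entry if it lands in the gap between the two maps.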
