[ 
https://issues.apache.org/jira/browse/KAFKA-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated KAFKA-15202:
--------------------------------
    Description: 
The spacing between OffsetSyncs can vary significantly, due to conditions in 
the upstream topic and in the replication rate of the MirrorSourceTask.

The OffsetSyncStore attempts to keep a maximal number of distinct syncs 
present, and for regularly spaced syncs it does not allow an incoming sync to 
expire more than one other unique sync. There are tests to enforce this 
property.

For variably spaced syncs, there is no such guarantee, because multiple 
fine-grained syncs may need to be expired at the same time. However, instead of 
expiring only those fine-grained syncs, the store may also expire 
coarser-grained syncs, which sharply reduces the number of unique syncs it 
retains.

An extremely simple example: syncs arrive at offsets 0 (start), 1, 2, and 4.

The result:
{noformat}
TRACE New sync OffsetSync{topicPartition=topic1-2, upstreamOffset=1, 
downstreamOffset=1} applied, new state is [1:1,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194)
TRACE New sync OffsetSync{topicPartition=topic1-2, upstreamOffset=2, 
downstreamOffset=2} applied, new state is [2:2,1:1,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194)
TRACE New sync OffsetSync{topicPartition=topic1-2, upstreamOffset=4, 
downstreamOffset=4} applied, new state is [4:4,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194){noformat}
The 2:2 sync should not have been expired. Had it been kept, the final state 
would be [4:4,2:2,0:0], and the store would maintain 3 unique syncs instead 
of 2.
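
As a rough illustration of the property described above, the sketch below 
counts how many unique syncs a single update expired, using the states from 
the trace. The class and method names (SyncExpiryCheck, expiredCount) are 
hypothetical and are not part of OffsetSyncStore. Only upstream offsets are 
compared, on the assumption that they identify a sync uniquely in this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Kafka.
public class SyncExpiryCheck {

    // Counts the unique upstream offsets that were present before a new sync
    // was applied but are absent afterwards, i.e. the syncs that were expired.
    static int expiredCount(List<Long> before, List<Long> after) {
        Set<Long> expired = new HashSet<>(before);
        expired.removeAll(new HashSet<>(after));
        return expired.size();
    }

    public static void main(String[] args) {
        // State before the sync at offset 4 arrives, taken from the trace.
        List<Long> before = Arrays.asList(2L, 1L, 0L);
        // State reported in the trace after applying the sync at offset 4.
        List<Long> observed = Arrays.asList(4L, 0L);
        // State the report argues for, with the 2:2 sync retained.
        List<Long> desired = Arrays.asList(4L, 2L, 0L);

        System.out.println("observed expired: " + expiredCount(before, observed));
        System.out.println("desired expired: " + expiredCount(before, desired));
    }
}
```

With the states from the trace, the observed update expires two syncs (1 and 
2), while the desired state expires only one (1).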

 

  was:
The spacing between OffsetSyncs can vary significantly, due to conditions in 
the upstream topic and in the replication rate of the MirrorSourceTask.

The OffsetSyncStore attempts to keep a maximal number of distinct syncs 
present, and for regularly spaced syncs it does not allow an incoming sync to 
expire more than one other unique sync. There are tests to enforce this 
property.

For variable spaced syncs, there is no such guarantee, because multiple 
fine-grained syncs may need to be expired at the same time. However, instead of 
only those fine-grained syncs being expired, the store may also expire 
coarser-grained syncs. This causes a large decrease in the number of unique 
syncs.

This is an extremely simple example:

* Syncs: 0 (start), 1, 2, 4.
The result:
```
TRACE New sync OffsetSync\{topicPartition=topic1-2, upstreamOffset=1, 
downstreamOffset=1} applied, new state is [1:1,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194)
TRACE New sync OffsetSync\{topicPartition=topic1-2, upstreamOffset=2, 
downstreamOffset=2} applied, new state is [2:2,1:1,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194)
TRACE New sync OffsetSync\{topicPartition=topic1-2, upstreamOffset=4, 
downstreamOffset=4} applied, new state is [4:4,0:0] 
(org.apache.kafka.connect.mirror.OffsetSyncStore:194)
```
Instead of being expired, the `2:2` sync should still be present in the final 
state, allowing the store to maintain 3 unique syncs.


> MM2 OffsetSyncStore clears too many syncs when sync spacing is variable
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-15202
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15202
>             Project: Kafka
>          Issue Type: Bug
>          Components: mirrormaker
>    Affects Versions: 3.5.0, 3.4.1, 3.3.3
>            Reporter: Greg Harris
>            Priority: Major
>
> The spacing between OffsetSyncs can vary significantly, due to conditions in 
> the upstream topic and in the replication rate of the MirrorSourceTask.
> The OffsetSyncStore attempts to keep a maximal number of distinct syncs 
> present, and for regularly spaced syncs it does not allow an incoming sync to 
> expire more than one other unique sync. There are tests to enforce this 
> property.
> For variably spaced syncs, there is no such guarantee, because multiple 
> fine-grained syncs may need to be expired at the same time. However, instead 
> of expiring only those fine-grained syncs, the store may also expire 
> coarser-grained syncs, which sharply reduces the number of unique syncs it 
> retains.
> An extremely simple example: syncs arrive at offsets 0 (start), 1, 2, and 4.
> The result:
> {noformat}
> TRACE New sync OffsetSync{topicPartition=topic1-2, upstreamOffset=1, 
> downstreamOffset=1} applied, new state is [1:1,0:0] 
> (org.apache.kafka.connect.mirror.OffsetSyncStore:194)
> TRACE New sync OffsetSync{topicPartition=topic1-2, upstreamOffset=2, 
> downstreamOffset=2} applied, new state is [2:2,1:1,0:0] 
> (org.apache.kafka.connect.mirror.OffsetSyncStore:194)
> TRACE New sync OffsetSync{topicPartition=topic1-2, upstreamOffset=4, 
> downstreamOffset=4} applied, new state is [4:4,0:0] 
> (org.apache.kafka.connect.mirror.OffsetSyncStore:194){noformat}
> The 2:2 sync should not have been expired. Had it been kept, the final state 
> would be [4:4,2:2,0:0], and the store would maintain 3 unique syncs instead 
> of 2.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
