[ 
https://issues.apache.org/jira/browse/KAFKA-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957169#comment-16957169
 ] 

Ning Zhang commented on KAFKA-9076:
-----------------------------------

[~ryannedolan] Thanks for your comments. Here is my response:

> in order to write offsets for a consumer group, we need to know that the 
> group is not already running on the target cluster. Otherwise we'd be 
> stepping on that group's current offsets. The group coordinator won't allow 
> this afaik.

My PR can check whether the group is active in the target cluster. If it is 
active, the offsets are not synced.
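
As a rough sketch of that check (not the code in the PR): assume we already 
have the state of each group on the target cluster, and only sync groups that 
have no members there. `GroupState` and `ActiveGroupFilter` are hypothetical 
names for illustration. The enum is a local stand-in whose values mirror the 
group states a broker reports. A group that does not exist on the target is 
modelled as `DEAD` here, which is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for broker-reported group states; not the Kafka client enum.
enum GroupState { EMPTY, DEAD, STABLE, PREPARING_REBALANCE, COMPLETING_REBALANCE, UNKNOWN }

final class ActiveGroupFilter {
    // A group is considered idle only when it has no members on the target cluster.
    // UNKNOWN is treated as active so that we never overwrite offsets when in doubt.
    static boolean isIdle(GroupState state) {
        return state == GroupState.EMPTY || state == GroupState.DEAD;
    }

    // Returns the group IDs whose offsets may be synced, in the map's iteration order.
    static List<String> groupsToSync(Map<String, GroupState> targetGroupStates) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, GroupState> entry : targetGroupStates.entrySet()) {
            if (isIdle(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With this filter, a group that is running or rebalancing on the target is 
skipped, so the sync never steps on offsets that live consumers are committing.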

> we could kick out a target group and force it to seek to the new offsets by 
> revoking group membership and forcing a rebalance etc. But we wouldn't want 
> to do this periodically.

For Kafka Streams, I am not sure whether there is a way to force the streams 
application to seek to a particular offset, as `consumer.seek()` does for a 
plain consumer.

> we could write offsets to a new group ID, eg. us-west group1, just like we do 
> with topics, s.t. we avoid the above issues. Then migrating groups would 
> involve changing the group ID. That works fine, but consumers would need a 
> way to determine which group ID to use. Translating group ID like that is 
> more cumbersome than translating offsets, since offsets can be altered 
> using existing tools, but there is no way to tell a consumer to change its 
> group ID.

When failing over or migrating from one cluster to another, especially when 
doing it manually, if this already requires changing the broker URL of the 
consumer / streams application to point to the backup cluster, it may not be 
cumbersome to change the consumer group ID as well.

> I think there are scenarios where automatically writing offsets as you propose 
> might make sense, e.g. in an active/standby scenario where consumers only 
> connect to one cluster at a time. But if you are automating that behavior, you 
> might as well automate the offset translation via RemoteClusterUtils, IMO.

Offset translation is the first step and can certainly be done by 
RemoteClusterUtils. The second step is to write the translated offsets to the 
target cluster. IMO, RemoteClusterUtils may not cover the second step.
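
To make the two steps concrete, here is a minimal sketch, not the PR itself. 
The translate and write steps are passed in as plain functions so the example 
stays self-contained. `TwoStepOffsetSync` and `syncGroup` are hypothetical 
names, and offsets are keyed by a simple "topic-partition" string instead of 
the Kafka client types. In a real setup I would assume step one is backed by 
RemoteClusterUtils and step two by whatever offset-commit call is available 
against the target cluster.

```java
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

final class TwoStepOffsetSync {
    // Step 1: translate the group's source offsets into target-cluster offsets.
    // Step 2: write the translated offsets to the target cluster.
    // Both steps are injected; their real implementations are assumptions here.
    static Map<String, Long> syncGroup(
            String groupId,
            Function<String, Map<String, Long>> translate,
            BiConsumer<String, Map<String, Long>> write) {
        Map<String, Long> translated = translate.apply(groupId);
        // Nothing to write if no offsets could be translated for this group.
        if (!translated.isEmpty()) {
            write.accept(groupId, translated);
        }
        return translated;
    }
}
```

The point of the split is that translation alone leaves the target cluster 
unchanged; only the write step lets a consumer resume there without extra 
tooling.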

> My team has built external tooling using RemoteClusterUtils that works with 
> existing consumers. It's possible to fully automate failover and failback this 
> way. I'm skeptical that automatically writing offsets as you propose would 
> make this process simpler.

I have a PR ready for review. We use it internally in our system, where it 
works well for active/standby setups and for transparently migrating streams 
applications from one cluster to another. This approach will certainly need 
more consideration in the future.

The PR is small, and I propose to keep this auto sync feature disabled by 
default for now. Would you like to look at the PR and exchange more thoughts?

Thanks

> MirrorMaker 2.0 automated consumer offset sync
> ----------------------------------------------
>
>                 Key: KAFKA-9076
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9076
>             Project: Kafka
>          Issue Type: Improvement
>          Components: mirrormaker
>    Affects Versions: 2.4.0
>            Reporter: Ning Zhang
>            Priority: Major
>              Labels: mirrormaker, pull-request-available
>             Fix For: 2.5.0
>
>
> To calculate the translated consumer offset in the target cluster, 
> `Mirror-client` currently provides a function called "remoteConsumerOffsets()" 
> that is used by "RemoteClusterUtils" on a one-off basis.
> In order to make consumer migration from the source to the target cluster 
> transparent and convenient, e.g. in the event of a source cluster failure, it 
> is better to have a background job that periodically syncs the consumer 
> offsets from the source to the target cluster, so that when the consumer 
> switches to the target cluster, it resumes consuming from where it left off 
> at the source cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
