[ 
https://issues.apache.org/jira/browse/KAFKA-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955679#comment-14955679
 ] 

James Cheng commented on KAFKA-658:
-----------------------------------

I would also like this, so that consumers can transition from one cluster to 
another and be able to resume without missing or duplicating any records.


> Implement "Exact Mirroring" functionality in mirror maker
> ---------------------------------------------------------
>
>                 Key: KAFKA-658
>                 URL: https://issues.apache.org/jira/browse/KAFKA-658
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Jay Kreps
>              Labels: project
>
> There are two ways to implement "mirroring" (i.e. replicating a topic from 
> one cluster to another):
> 1. Do a simple read from the source and write to the destination with no 
> attempt to maintain the same partitioning or offsets in the destination 
> cluster. In this case the destination cluster may have a different number of 
> partitions, and you can even read from many clusters to create a merged 
> cluster. This flexibility is nice. The downside is that since the 
> partitioning and offsets are not the same a consumer of the source cluster 
> has no equivalent position in the destination cluster. This is the style of 
> mirroring we have implemented in the mirror-maker tool and use for datacenter 
> replication today.
> 2. The second style of replication only would allow creating an exact replica 
> of a source cluster (i.e. all partitions and offsets exactly the same). The 
> nice thing about this is that the offsets and partitions would match exactly. 
> The downside is that it is not possible to merge multiple source clusters 
> this way or have different partitioning. We do not currently support this in 
> mirror maker.
> It would be nice to implement the second style as an option in mirror maker 
> as having an exact replica would be a nice option to have in the case where 
> you are replicating a single cluster only.
> There are some nuances: In order to maintain the exact offsets it is 
> important to guarantee that the producer never resends a message or loses a 
> message. As a result it would be important to have only a single producer for 
> each destination partition, and check the last produced message on startup 
> (using the getOffsets api) so that in the case of a hard crash messages that 
> are re-consumed are not re-emitted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to