[ https://issues.apache.org/jira/browse/KAFKA-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955679#comment-14955679 ]
James Cheng commented on KAFKA-658:
-----------------------------------

I would also like this, so that consumers can transition from one cluster to another and be able to resume without missing or duplicating any records.

> Implement "Exact Mirroring" functionality in mirror maker
> ---------------------------------------------------------
>
>                 Key: KAFKA-658
>                 URL: https://issues.apache.org/jira/browse/KAFKA-658
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Jay Kreps
>              Labels: project
>
> There are two ways to implement "mirroring" (i.e. replicating a topic from
> one cluster to another):
> 1. Do a simple read from the source and write to the destination with no
> attempt to maintain the same partitioning or offsets in the destination
> cluster. In this case the destination cluster may have a different number of
> partitions, and you can even read from many clusters to create a merged
> cluster. This flexibility is nice. The downside is that since the
> partitioning and offsets are not the same, a consumer of the source cluster
> has no equivalent position in the destination cluster. This is the style of
> mirroring we have implemented in the mirror-maker tool and use for datacenter
> replication today.
> 2. The second style of replication would only allow creating an exact replica
> of a source cluster (i.e. all partitions and offsets exactly the same). The
> nice thing about this is that the offsets and partitions would match exactly.
> The downside is that it is not possible to merge multiple source clusters
> this way or have different partitioning. We do not currently support this in
> mirror maker.
> It would be nice to implement the second style as an option in mirror maker,
> as having an exact replica would be a nice option to have in the case where
> you are replicating a single cluster only.
> There are some nuances: In order to maintain the exact offsets it is
> important to guarantee that the producer never resends a message or loses a
> message. As a result it would be important to have only a single producer for
> each destination partition, and check the last produced message on startup
> (using the getOffsets api) so that in the case of a hard crash messages that
> are re-consumed are not re-emitted.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
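
For illustration, the startup check described under "nuances" in the quoted issue could look roughly like the sketch below. This is a simulation only, not mirror maker code: `DestinationPartition` and `mirror_partition` are hypothetical names, the in-memory list stands in for a destination partition log, and `log_end_offset()` plays the role of the getOffsets lookup. It also assumes source offsets are dense and start at 0, which may not hold for real topics (retention or compaction can leave gaps).

```python
from dataclasses import dataclass, field


@dataclass
class DestinationPartition:
    """In-memory stand-in for one destination partition (hypothetical)."""
    records: list = field(default_factory=list)

    def log_end_offset(self) -> int:
        # Offset that the next appended record would receive.
        return len(self.records)

    def append(self, value) -> int:
        # Append a record and return the offset it was assigned.
        self.records.append(value)
        return len(self.records) - 1


def mirror_partition(source_records, dest: DestinationPartition) -> int:
    """Copy (offset, value) pairs into dest so that offsets stay identical.

    Intended to be called by the single producer for this partition.
    Returns the number of records actually written.
    """
    written = 0
    for offset, value in source_records:
        next_offset = dest.log_end_offset()
        if offset < next_offset:
            # Already mirrored before a crash: do not re-emit.
            continue
        if offset > next_offset:
            # A record was lost; exact mirroring can no longer hold.
            raise RuntimeError(
                f"gap: source offset {offset}, destination expects {next_offset}"
            )
        assigned = dest.append(value)
        assert assigned == offset
        written += 1
    return written


# Simulated hard crash: the first run mirrors two records, then the
# restarted mirror re-consumes the source from the beginning.
source = list(enumerate(["a", "b", "c", "d"]))
dest = DestinationPartition()
first_run = mirror_partition(source[:2], dest)
second_run = mirror_partition(source, dest)
```

After the second run the destination holds each record exactly once, at the same offset as in the source; the two re-consumed records are skipped rather than re-emitted.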