Setting auto.offset.reset to smallest does not mean the consumer will
always start from the smallest offset. It means that if no previously
committed offset is found for this consumer group, the consumer starts
from the smallest offset. So for mirror maker, you probably want to always
use the same consumer group id. You can set it in the consumer config file
you pass to mirror maker.
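
To make the rule above concrete, here is a minimal Python sketch of how the starting offset is chosen. It is only a model of the behavior described here, not Kafka's actual code; the function name and its parameters are hypothetical.

```python
def starting_offset(committed, earliest, latest, auto_offset_reset="smallest"):
    """Model of where a consumer group starts reading one partition.

    committed: last committed offset for this group id, or None if the
               group has never committed (e.g. a brand-new group id).
    earliest, latest: the smallest and largest offsets currently available.
    All names here are illustrative, not part of any Kafka API.
    """
    # A usable committed offset always wins; the reset policy is ignored.
    if committed is not None and earliest <= committed <= latest:
        return committed
    # No usable committed offset: fall back to the reset policy.
    if auto_offset_reset == "smallest":
        return earliest
    if auto_offset_reset == "largest":
        return latest
    raise ValueError("unknown auto.offset.reset value: %r" % auto_offset_reset)
```

With a stable group id, a restart resumes at the committed offset. A group id that changes on every run never finds a committed offset, so each run falls back to the smallest offset and copies everything again.
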
As for duplicate messages: if mirror maker is shut down cleanly, then the
next time you start it with the same consumer group id there should be no
duplicates. But if mirror maker shuts down uncleanly (e.g. by a kill -9),
then the next time it starts up you might still see duplicates of the
messages that were copied after the last committed offset.
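
The difference between a clean and an unclean shutdown can be sketched with a toy model in Python. It assumes offsets are committed periodically (every commit_every messages here) and on clean shutdown; the function and its parameters are hypothetical and do not correspond to mirror maker's real implementation.

```python
def mirror(source, destination, committed, commit_every, crash_after=None):
    """Toy model: copy source[committed:] into destination.

    Commits the consumer position every `commit_every` copied messages.
    If `crash_after` is set, the run dies after copying that many messages
    and any progress made since the last commit is lost.
    Returns the committed offset that the next run will start from.
    """
    position = committed
    copied = 0
    while position < len(source):
        if crash_after is not None and copied == crash_after:
            return committed  # unclean shutdown: last commit is all we have
        destination.append(source[position])
        position += 1
        copied += 1
        if copied % commit_every == 0:
            committed = position  # periodic offset commit
    return position  # clean shutdown: final position is committed


if __name__ == "__main__":
    source = list(range(10))
    destination = []
    # First run is killed after copying 6 messages; only 4 were committed.
    committed = mirror(source, destination, 0, commit_every=4, crash_after=6)
    # Second run resumes from the committed offset and finishes cleanly.
    committed = mirror(source, destination, committed, commit_every=4)
    print(destination)
```

In this run, messages 4 and 5 are copied twice, because they were mirrored after the last commit but before the crash. A clean shutdown followed by a restart copies nothing twice.
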

Jiangjie (Becket) Qin

On 3/7/15, 11:45 PM, "sunil kalva" <sambarc...@gmail.com> wrote:

>Qin
>The partition problem is solved by passing the "--new.producer true" option
>on the command line, but after adding the auto.offset.reset=smallest config,
>every time I restart the mirror tool it copies from the beginning, so I end
>up with a lot of duplicate messages in the destination cluster.
>Could you please tell me how to configure it so that the destination
>cluster is always in sync with the source cluster?
>
>SunilKalva
>
>On Sun, Mar 8, 2015 at 12:54 AM, Jiangjie Qin <j...@linkedin.com.invalid>
>wrote:
>
>> For data not showing up, you need to make sure the mirror maker consumer's
>> auto.offset.reset is set to smallest; otherwise, when you run mirror maker
>> for the first time, the pre-existing messages won't be consumed.
>> For partition sticking, can you verify whether your messages are keyed
>> or not? If they are not keyed, can you check whether you are using the old
>> producer or the new producer? For the old producer, the default behavior is
>> to stick to one partition for 10 min and then move to the next partition.
>> So if you wait for more than 10 min, you should see messages in two
>> different partitions.
>>
>> Jiangjie (Becket) Qin
>>
>> On 3/7/15, 8:28 AM, "sunil kalva" <sambarc...@gmail.com> wrote:
>>
>> >I also observed that all the data is moving to one partition in the
>> >destination cluster, though I have multiple partitions for that topic in
>> >both the source and destination clusters.
>> >
>> >SunilKalva
>> >
>> >On Sat, Mar 7, 2015 at 9:54 PM, sunil kalva <sambarc...@gmail.com> wrote:
>> >
>> >> I ran the kafka mirroring tool after producing data in the source
>> >> cluster, and that data was not copied to the destination cluster. If I
>> >> produce data after starting the tool, that data is copied to the
>> >> destination cluster. Am I missing something?
>> >>
>> >> --
>> >> SunilKalva
>> >>
>> >
>> >
>> >
>> >--
>> >SunilKalva
>>
>>
>
>
>-- 
>SunilKalva
