RE: Kafka consumer getting duplicate message

2016-08-10 Thread Ghosh, Achintya (Contractor)
Can anyone please check this one?

Thanks
Achintya

-Original Message-
From: Ghosh, Achintya (Contractor) 
Sent: Monday, August 08, 2016 9:44 AM
To: us...@kafka.apache.org
Cc: dev@kafka.apache.org
Subject: RE: Kafka consumer getting duplicate message

Thank you, Ewen, for your response.
We are actually using the Spring Kafka 1.0.0.M2 release, which uses the Kafka 0.9 release.
Yes, we see a lot of duplicates, and our producer and consumer settings in the
application are below. We don't see any duplication at the producer end: if we
send 1000 messages to a particular topic, it receives exactly 1000 messages
(sometimes fewer).

But when we consume the messages at the consumer level, we see a lot of messages
with the same offset value and the same partition, so please let us know what
tweaking is needed to avoid the duplication.

We have three types of topics, and each topic has a replication factor of 3 and
10 partitions.

Producer Configuration:

bootstrap.producer.servers=provisioningservices-aq-dev.g.comcast.net:80
acks=1
retries=3
batch.size=16384
linger.ms=5
buffer.memory=33554432
request.timeout.ms=6
timeout.ms=6
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=com.comcast.provisioning.provisioning_.kafka.CustomMessageSer
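
For reference, a minimal sketch of how settings like these map onto a plain
Kafka 0.9 producer; the topic name and payload are placeholders, and the
standard client property is bootstrap.servers, so bootstrap.producer.servers
above looks like an application-level alias. (With retries enabled and no
idempotent producer in 0.9, a timed-out send that is retried can also produce
broker-side duplicates, though that does not appear to be happening here.)

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "provisioningservices-aq-dev.g.comcast.net:80");
props.put("acks", "1");
props.put("retries", "3");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// Stand-in for the custom value serializer named above:
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<>("provisioning-topic", "key", "value")); // topic hypothetical
producer.close();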

Consumer Configuration:

bootstrap.consumer.servers=provisioningservices-aqr-dev.g.comcast.net:80
group.id=ps-consumer-group
enable.auto.commit=false
auto.commit.interval.ms=100
session.timeout.ms=15000
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=com.comcast.provisioning.provisioning_.kafka.CustomMessageDeSer

factory.getContainerProperties().setSyncCommits(true);
factory.setConcurrency(5);
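
For context, a sketch of how such a factory is typically assembled in Spring
Kafka; consumerProps() is a hypothetical helper returning the property map
listed above.

import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

ConcurrentKafkaListenerContainerFactory<String, Object> factory =
        new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(consumerProps()));
// Commit offsets synchronously from the listener thread...
factory.getContainerProperties().setSyncCommits(true);
// ...and run five listener threads across the 10 partitions.
factory.setConcurrency(5);

Note that with enable.auto.commit=false, any records processed after the most
recent commit are re-delivered to the partition's next owner whenever a
rebalance occurs; that is one common source of duplicates like those described
here.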

Thanks
Achintya


-Original Message-
From: Ewen Cheslack-Postava [mailto:e...@confluent.io]
Sent: Saturday, August 06, 2016 1:45 AM
To: us...@kafka.apache.org
Cc: dev@kafka.apache.org
Subject: Re: Kafka consumer getting duplicate message

Achintya,

1.0.0.M2 is not an official release, so this version number is not particularly 
meaningful to people on this list. What platform/distribution are you using and 
how does this map to actual Apache Kafka releases?

In general, it is not possible for any system to guarantee exactly once 
semantics because those semantics rely on the source and destination systems 
coordinating -- the source provides some sort of retry semantics, and the 
destination system needs to do some sort of deduplication or similar to only 
"deliver" the data one time.

That said, duplicates should usually only be generated in the face of failures. 
If you're seeing a lot of duplicates, that probably means shutdown/failover is 
not being handled correctly. If you can provide more info about your setup, we 
might be able to suggest tweaks that will avoid these situations.
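
As one illustration of handling shutdown and failover cleanly (a sketch, not
the poster's code): committing synchronously when partitions are revoked keeps
a rebalance from re-delivering records the old owner already processed.

import java.util.Collection;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

KafkaConsumer<String, String> consumer =
        new KafkaConsumer<>(consumerProps()); // consumerProps(): hypothetical helper
consumer.subscribe(Collections.singletonList("provisioning-topic"), // topic hypothetical
        new ConsumerRebalanceListener() {
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                consumer.commitSync(); // flush processed offsets before losing ownership
            }
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // nothing to do: resume from the committed offsets
            }
        });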

-Ewen

On Fri, Aug 5, 2016 at 8:15 AM, Ghosh, Achintya (Contractor) < 
achintya_gh...@comcast.com> wrote:

> Hi there,
>
> We are using Kafka 1.0.0.M2 with Spring, and we see a lot of duplicate
> messages being received by the listener's onMessage() method.
> We configured:
>
> enable.auto.commit=false
> session.timeout.ms=15000
> factory.getContainerProperties().setSyncCommits(true);
> factory.setConcurrency(5);
>
> So what could be the reason for the duplicate messages?
>
> Thanks
> Achintya
>



--
Thanks,
Ewen

