Re: Customers are getting same emails for roughly 30-40 times

2019-05-24 Thread SenthilKumar K
Hi, you can check the KafkaConsumer API javadoc:
https://kafka.apache.org/10/javadoc/?org/apache/kafka/clients/consumer/KafkaConsumer.html
Refer to the "Manual Offset Control" section.
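As a rough illustration of the per-partition commit pattern that section
describes (the topic name, group id, and sendEmail() helper below are
placeholders I made up for this sketch, not something from this thread):

import java.util.Collections;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "email-sender");                  // placeholder group id
props.put("enable.auto.commit", "false");               // turn auto commit off
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("email-events"));   // placeholder topic
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (TopicPartition partition : records.partitions()) {
        List<ConsumerRecord<String, String>> partRecords = records.records(partition);
        for (ConsumerRecord<String, String> record : partRecords) {
            sendEmail(record.value());                  // placeholder for your email send
        }
        long lastOffset = partRecords.get(partRecords.size() - 1).offset();
        // commit the next offset to read for this partition only
        consumer.commitSync(Collections.singletonMap(
                partition, new OffsetAndMetadata(lastOffset + 1)));
    }
}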

--Senthil




Re: Customers are getting same emails for roughly 30-40 times

2019-05-24 Thread ASHOK MACHERLA
Dear Hans

Thanks for your reply.

As you said, we are hitting the same issue: our consumers sometimes go into
rebalance mode, and during this time customers get duplicate emails.

So, how do we set manual offset commits?

Are there any parameters to add for that?
Please reply to this email.

Sent from Outlook


Re: Customers are getting same emails for roughly 30-40 times

2019-05-24 Thread Hans Jespersen
It's not just the config; you need to change your code.

kafka.auto.commit.interval.ms=3000 means that consumers only commit offsets
every 3 seconds, so if there is any failure or rebalance they will reconsume
up to 3 seconds of data per partition. That could be many hundreds or
thousands of messages.

I would recommend you not use auto commit at all and instead manually commit 
offsets immediately after sending each email or batch of emails.
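A rough sketch of committing after each email instead of auto-committing
(this assumes a consumer created with enable.auto.commit=false and already
subscribed, as in the earlier sketch; sendEmail() is a made-up placeholder
for whatever actually sends the mail):

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        sendEmail(record.value());            // placeholder for the real send
        // commit this record's offset right away, so a crash or rebalance
        // can replay at most the one email that was in flight
        consumer.commitSync(Collections.singletonMap(
                new TopicPartition(record.topic(), record.partition()),
                new OffsetAndMetadata(record.offset() + 1)));
    }
}

commitSync() per message costs a round trip per email; committing once per
small batch (after the for loop) is the usual middle ground.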

-hans



RE: Customers are getting same emails for roughly 30-40 times

2019-05-24 Thread ASHOK MACHERLA
Dear Team

First of all, thanks for the reply on this issue.

Right now we are using these configurations on the consumer side:

kafka.max.poll.records=20
max.push.batch.size=100
enable.auto.commit=true
auto.offset.reset=latest
kafka.auto.commit.interval.ms=3000
kafka.session.timeout.ms=1
kafka.request.timeout.ms=3000
kafka.heartbeat.interval.ms=3000
kafka.max.poll.interval.ms=30

Can you please suggest changes to the above config parameters?

We are using one Kafka topic with 10 partitions and 10 consumers, and we are
sending lakhs of emails to the customers.

Are that many partitions and consumers enough, or do I have to increase the
number of partitions and consumers?

Please suggest.

In the consumer logs, it is showing:

consumer group is rebalancing before committed because already group is
rebalancing

Sent from Outlook.






Re: Customers are getting same emails for roughly 30-40 times

2019-05-24 Thread Vincent Maurin
It also seems you are using an "at least once" strategy (maybe with
auto-commit, or committing after sending the email).
Maybe an "at most once" strategy could be valid for the business here?

- at least once (you will deliver all the emails, but you could deliver
duplicates)
consumeMessages
sendEmails
commitOffsets

- at most once (you will never deliver duplicates, but you might never
deliver a given email)
consumeMessages
commitOffsets
sendEmails

Ideally, you could do "exactly once", but it is hard to achieve in the
Kafka -> external system scenario. The usual strategy here is to have an
idempotent operation in combination with an "at least once" strategy.
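A rough sketch of that idempotent + at-least-once combination, as the body of
the poll loop (alreadySent() and markSent() are made-up stand-ins for some
external dedup store, e.g. a database table keyed by an email id, and the
id-as-record-key assumption is mine, not from this thread):

ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
    String emailId = record.key();      // assumes the producer keys each record with a unique id
    if (!alreadySent(emailId)) {        // hypothetical lookup in the dedup store
        sendEmail(record.value());      // placeholder for the real send
        markSent(emailId);              // remember it so a replay becomes a no-op
    }
}
consumer.commitSync();                  // at least once: commit only after processing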

Best,
Vincent



Re: Customers are getting same emails for roughly 30-40 times

2019-05-24 Thread Liam Clarke
Consumers will rebalance if you add partitions, add consumers to the group
or if a consumer leaves the group.

Consumers will leave the group after not communicating with the server for
a period set by session.timeout.ms. This is usually due to an exception in
the code polling with the consumer, or message processing code taking too
long.

If your consumers are reprocessing messages, and thus re-sending emails, it
implies that they weren't able to commit their offsets before
failing/timing out.

We had a similar issue in a database sink that consumed from Kafka and
duplicated data because it took too long, hit the session timeout, and
then wasn't able to commit its offsets.

So I'd look closely at your consuming code and log every possible source of
exceptions.
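A rough sketch of that kind of defensive logging around the poll loop (the
logger, sendEmail(), and the consumer setup are placeholders, not something
from this thread):

try {
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            try {
                sendEmail(record.value());      // placeholder for the real send
            } catch (Exception e) {
                // log it and decide: skip, retry, or park the record somewhere,
                // but don't let one bad record stall the whole loop
                log.error("Failed at {}-{} offset {}",
                        record.topic(), record.partition(), record.offset(), e);
            }
        }
        consumer.commitSync();
    }
} catch (org.apache.kafka.common.errors.WakeupException e) {
    // expected on shutdown via consumer.wakeup()
} finally {
    consumer.close();
}

Also worth checking: if sending a batch of emails regularly takes longer than
max.poll.interval.ms, the consumer is considered dead and the group
rebalances, which matches the symptom you're describing.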

Kind regards,

Liam Clarke



Customers are getting same emails for roughly 30-40 times

2019-05-24 Thread ASHOK MACHERLA
Dear Team Member

Currently we are using Kafka 0.10.1 and ZooKeeper 3.4.6. In our project we
have to send bulk emails to customers, and for this purpose we are using a
Kafka cluster setup.

But customers are getting the same emails roughly 30-40 times, which is very
bad. While this happens our consumer group shows rebalancing. Could that be
the reason?
Currently we are using one topic for this, with 10 partitions and 10
consumers.
I hope that is enough partitions and consumers, but I don't know exactly how
many partitions and consumers are required to overcome this issue.

Can you please suggest how to fix this issue?

Are any changes required on the Kafka side as well as the consumer side?
How do we stop the rebalancing issue?
Please advise. Thanks



Sent from Outlook.