Re: Customers are getting same emails for roughly 30-40 times
Hi,

You can check the Consumer API javadoc: https://kafka.apache.org/10/javadoc/?org/apache/kafka/clients/consumer/KafkaConsumer.html

Refer to the section "Manual Offset Control".

--Senthil
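The "Manual Offset Control" section of that javadoc boils down to two things: disable auto commit, and call commitSync() yourself only after a batch has actually been processed. A minimal sketch (bootstrap server, topic, group id, and sendEmail are all placeholders, not anything from this thread):

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");      // placeholder
props.put("group.id", "email-senders");                // placeholder
props.put("enable.auto.commit", "false");              // take over offset management
props.put("key.deserializer",
        "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer",
        "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("emails"));           // placeholder topic
while (true) {
    // poll(long) is the signature in the 0.10.x/1.0 clients used in this thread
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        sendEmail(record.value());                     // placeholder for your sender
    }
    // Commit only after the whole batch has been sent; on a crash or rebalance
    // the group replays at most this one batch, not auto.commit.interval.ms
    // worth of data per partition.
    consumer.commitSync();
}
```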
Re: Customers are getting same emails for roughly 30-40 times
Dear Hans,

Thanks for your reply.

As you said, we are getting the same issue: our consumers sometimes go into rebalance mode, and during this time customers get duplicate emails.

So, how do we set manual commit offsets? Are there any parameters to add for that?

Please reply to this email.

Sent from Outlook
Re: Customers are getting same emails for roughly 30-40 times
It's not just the config; you need to change your code. kafka.auto.commit.interval.ms=3000 means that consumers only commit offsets every 3 seconds, so if there is any failure or rebalance they will re-consume up to 3 seconds of data per partition. That could be many hundreds or thousands of messages. I would recommend you not use auto commit at all, and instead manually commit offsets immediately after sending each email or batch of emails.

-hans
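What Hans suggests, commit right after each send, can be sketched with the commitSync(Map) overload, which commits an explicit offset. This assumes a consumer set up as usual with enable.auto.commit=false; sendEmail is a placeholder. Note the + 1: the committed offset is the position of the *next* record to read, not the last one processed.

```java
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        sendEmail(record.value());             // placeholder for your sender
        // Commit this record's offset before touching the next one, so a
        // rebalance can duplicate at most the single in-flight email.
        consumer.commitSync(Collections.singletonMap(
                new TopicPartition(record.topic(), record.partition()),
                new OffsetAndMetadata(record.offset() + 1)));
    }
}
```

A synchronous commit per record costs one broker round trip per email; committing once per poll() batch (plain commitSync()) is the usual middle ground. Making the send idempotent (e.g. a per-email dedup key) still matters, because a rebalance can strike between the send and the commit.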
RE: Customers are getting same emails for roughly 30-40 times
Dear Team,

First of all, thanks for the reply on this issue.

Right now we are using these configurations on the consumer side:

kafka.max.poll.records=20
max.push.batch.size=100
enable.auto.commit=true
auto.offset.reset=latest
kafka.auto.commit.interval.ms=3000
kafka.session.timeout.ms=1
kafka.request.timeout.ms=3000
kafka.heartbeat.interval.ms=3000
kafka.max.poll.interval.ms=30

Can you please suggest changes to the above config parameters?

We are using one Kafka topic with 10 partitions and 10 consumers, and we are sending lakhs of emails to the customers. Are that many partitions and consumers enough? Otherwise, do I have to increase the partitions and consumers?

Please suggest.

In the consumer logs, it is showing: consumer group is rebalancing before committed because already group is rebalancing

Sent from Outlook.
Re: Customers are getting same emails for roughly 30-40 times
It also seems you are using an "at least once" strategy (maybe with auto-commit, or committing after sending the email). Maybe "at most once" could be a valid business strategy here?

- at least once (you will deliver all the emails, but you could deliver duplicates):
  consumeMessages
  sendEmails
  commitOffsets

- at most once (you will never deliver duplicates, but you might never deliver a given email):
  consumeMessages
  commitOffsets
  sendEmails

Ideally, you could do "exactly once", but it is hard to achieve in the Kafka -> external system scenario. The usual strategy here is to have an idempotent operation in combination with an "at least once" strategy.

Best,
Vincent
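Vincent's two orderings are easiest to see in a toy simulation (no Kafka involved; every name here is made up). A single crash is injected between the two steps of one record, after which the "consumer" restarts from the last committed offset:

```java
import java.util.*;

public class DeliverySemantics {
    /**
     * Processes messages one at a time, crashing once between the two steps
     * of record number crashAt (1-based), then restarting from the committed
     * offset. Returns the emails actually sent.
     */
    static List<String> run(List<String> messages, boolean atMostOnce, int crashAt) {
        List<String> sent = new ArrayList<>();
        int committed = 0;          // offset the consumer resumes from after a restart
        int handled = 0;            // how many records we have started handling
        boolean crashed = false;
        while (committed < messages.size()) {
            String msg = messages.get(committed);
            handled++;
            boolean crashNow = !crashed && handled == crashAt;
            if (atMostOnce) {                 // commitOffsets, then sendEmails
                committed++;
                if (crashNow) { crashed = true; continue; }  // crash: email lost
                sent.add(msg);
            } else {                          // sendEmails, then commitOffsets
                sent.add(msg);
                if (crashNow) { crashed = true; continue; }  // crash: duplicate on restart
                committed++;
            }
        }
        return sent;
    }

    public static void main(String[] args) {
        List<String> msgs = List.of("a", "b", "c");
        System.out.println(run(msgs, false, 2)); // [a, b, b, c] -> "b" sent twice
        System.out.println(run(msgs, true, 2));  // [a, c]       -> "b" never sent
    }
}
```

Same crash, two different failure modes: at-least-once duplicates the in-flight email (this thread's symptom, 30-40 copies being many rebalances in a row), at-most-once silently drops it.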
Re: Customers are getting same emails for roughly 30-40 times
Consumers will rebalance if you add partitions, add consumers to the group, or if a consumer leaves the group.

Consumers will leave the group after not communicating with the server for a period set by session.timeout.ms. This is usually due to an exception in the code polling with the consumer, or message-processing code taking too long.

If your consumers are reprocessing messages and thus causing emails to send, it implies that they weren't able to commit their offsets before failing/timing out.

We had a similar issue in a database sink that consumed from Kafka and duplicated data because it took too long, hit the session timeout, and then wasn't able to commit its offsets.

So I'd look closely at your consuming code and log every possible source of exceptions.

Kind regards,

Liam Clarke
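The timeouts Liam describes map onto a handful of consumer settings. As an illustrative reference point (not a tuned recommendation for this workload), with the constraints that heartbeat.interval.ms must be well below session.timeout.ms and max.poll.interval.ms must comfortably exceed the worst-case time to process one poll's worth of emails:

```properties
# How long the broker waits without heartbeats before evicting the consumer
session.timeout.ms=10000
# Must be well below session.timeout.ms; roughly 1/3 is the usual rule of thumb
heartbeat.interval.ms=3000
# Maximum allowed gap between poll() calls before a rebalance is triggered
max.poll.interval.ms=300000
# Smaller batches mean each poll()'s processing finishes sooner
max.poll.records=20
```

Several values quoted earlier in this thread (e.g. kafka.session.timeout.ms=1, kafka.max.poll.interval.ms=30) do not satisfy these relationships and are worth re-checking.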
Customers are getting same emails for roughly 30-40 times
Dear Team Member,

Currently we are using Kafka 0.10.1 and ZooKeeper 3.4.6. In our project we have to send bulk emails to customers, and for this purpose we are using a Kafka cluster setup.

But customers are getting the same emails roughly 30-40 times, which is very bad. In this situation our consumer group is showing rebalancing. Might that be the reason?

Currently we are using one topic for this, with 10 partitions and 10 consumers. I hope we have enough partitions and consumers, but I don't know exactly how many partitions and consumers are required to overcome this issue.

Can you please suggest how to fix this issue? Are any changes required on the Kafka side as well as the consumer side? How do we stop the rebalancing issue?

Please advise. Thanks.

Sent from Outlook.