Hey Vincent. That's exactly how my code is. I am doing processing within that for loop.
In KIP-62 I read that heartbeats are sent from a separate thread (see https://github.com/dpkp/kafka-python/issues/948). But you are saying they are sent through polling. Which is true? I have set session.timeout.ms to 5 minutes and max.poll.records to 5, so even if each message takes 30 seconds to process, a batch still shouldn't cross this threshold. Yet I see frequent rebalances. Then there is max.poll.interval.ms too; I don't know exactly how it affects this. Overall I am finding it very difficult to understand this myriad of settings, and the documentation is not very clear.

On Thu, May 24, 2018 at 8:09 PM Vincent Maurin <vincent.mau...@glispa.com> wrote:

> Shantanu, I was referring more to your application code.
> You should have something similar to:
>
> while (true) {
>     ConsumerRecords<String, String> records = consumer.poll(100);
>     for (ConsumerRecord<String, String> record : records) {
>         // Your logic
>     }
> }
>
> You should make sure that the code within the loop doesn't take too much
> time (more than session.timeout.ms).
> From the consumer javadoc:
> "The consumer will automatically ping the cluster periodically, which lets
> the cluster know that it is alive. Note that the consumer is
> single-threaded, so periodic heartbeats can only be sent when poll(long)
> <https://kafka.apache.org/0100/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#poll(long)>
> is called. As long as the consumer is able to do this it is considered
> alive and retains the right to consume from the partitions assigned to it.
> If it stops heartbeating by failing to call poll(long)
> <https://kafka.apache.org/0100/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#poll(long)>
> for a period of time longer than session.timeout.ms then it will be
> considered dead and its partitions will be assigned to another process."
>
> Best
>
> On Thu, May 24, 2018 at 4:07 PM Shantanu Deshmukh <shantanu...@gmail.com>
> wrote:
>
> > Another observation is that when I restart my application, consumption
> > doesn't start for 5-6 minutes. In the Kafka consumer logs I see
> >
> > ConsumerCoordinator.333 - Revoking previously assigned partitions [] for group notifications-consumer
> > AbstractCoordinator:381 - (Re-)joining group notifications-consumer
> >
> > Then nothing. After 5-6 minutes activity starts.
> >
> > On Thu, May 24, 2018 at 6:49 PM Shantanu Deshmukh <shantanu...@gmail.com>
> > wrote:
> >
> > > Hi Vincent,
> > >
> > > Yes, I reduced max.poll.records to get that same effect. I reduced it all
> > > the way down to 5 records and I am still seeing the same error. What else
> > > can be done? For one topic I can see that processing a single message
> > > takes about 20 seconds, so 5 of them will take about 100 seconds. So I
> > > set session.timeout.ms to 5 minutes and max.poll.interval.ms to 10
> > > minutes, but it is still not helping.
> > >
> > > On Thu, May 24, 2018 at 6:15 PM Vincent Maurin <vincent.mau...@glispa.com>
> > > wrote:
> > >
> > >> Hello Shantanu,
> > >>
> > >> It is also important to consider your consumer code. You should not
> > >> spend too much time between two calls to the "poll" method. Otherwise,
> > >> a consumer that is not calling poll will be considered dead by the
> > >> group, triggering a rebalance.
> > >>
> > >> Best
> > >>
> > >> On Thu, May 24, 2018 at 1:45 PM M. Manna <manme...@gmail.com> wrote:
> > >>
> > >> > Set your rebalance.backoff.ms=10000 and
> > >> > zookeeper.session.timeout.ms=30000 in addition to what Manikumar said.
> > >> >
> > >> > On 24 May 2018 at 12:41, Shantanu Deshmukh <shantanu...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hello,
> > >> > >
> > >> > > There was a typo in my first mail. session.timeout.ms is actually
> > >> > > 60000, not 6000.
> > >> > > So heartbeat.interval.ms is less than session.timeout.ms.
> > >> > >
> > >> > > On Thu, May 24, 2018 at 2:46 PM Manikumar <manikumar.re...@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > heartbeat.interval.ms should be lower than session.timeout.ms.
> > >> > > >
> > >> > > > Check here:
> > >> > > > http://kafka.apache.org/0101/documentation.html#newconsumerconfigs
> > >> > > >
> > >> > > > On Thu, May 24, 2018 at 2:39 PM, Shantanu Deshmukh
> > >> > > > <shantanu...@gmail.com> wrote:
> > >> > > >
> > >> > > > > Someone please help me. I have been struggling with this issue
> > >> > > > > for a long time and have not found any solution.
> > >> > > > >
> > >> > > > > On Wed, May 23, 2018 at 3:48 PM Shantanu Deshmukh
> > >> > > > > <shantanu...@gmail.com> wrote:
> > >> > > > >
> > >> > > > > > We have a 3-broker Kafka 0.10.0.1 cluster with 3 topics of 10
> > >> > > > > > partitions each. We have an application which spawns threads
> > >> > > > > > as consumers; we spawn 5 consumers for each topic. I am
> > >> > > > > > observing that the consumer group randomly keeps rebalancing,
> > >> > > > > > and many times we see logs saying "Revoking partitions for".
> > >> > > > > > This happens almost every 10 minutes. Consumption completely
> > >> > > > > > stops during this time.
> > >> > > > > >
> > >> > > > > > I have applied this configuration:
> > >> > > > > > max.poll.records 20
> > >> > > > > > heartbeat.interval.ms 10000
> > >> > > > > > session.timeout.ms 6000
> > >> > > > > >
> > >> > > > > > Still this did not help. The strange thing is that I observed
> > >> > > > > > the consumer writing logs saying "auto commit failed because
> > >> > > > > > poll() loop spent too much time processing records" even when
> > >> > > > > > there was no data in the partition to process.
> > >> > > > > > We have a polling interval of 500 ms, specified as the
> > >> > > > > > argument to poll(). Initially I had set the same consumer
> > >> > > > > > group for all three topics' consumers. Then I specified
> > >> > > > > > different consumer groups for different topics' consumers.
> > >> > > > > > Even this is not helping.
> > >> > > > > >
> > >> > > > > > I have searched the web, checked my code, and tried many
> > >> > > > > > combinations of configuration, but still no luck. Please help
> > >> > > > > > me.
> > >> > > > > >
> > >> > > > > > *Thanks & Regards,*
> > >> > > > > >
> > >> > > > > > *Shantanu Deshmukh*
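
As a rough check of the numbers discussed in this thread, the worst-case time between two poll() calls can be estimated as max.poll.records times the per-record processing time, then compared against the configured limits. The sketch below is plain arithmetic and makes no Kafka client calls. The class and method names (PollBudget, worstCaseBatchMillis, fitsWithin) are made up for illustration, and the 20 to 30 second per-record figures are the estimates quoted in the thread, not measurements.

```java
// Sketch only: plain arithmetic, no Kafka client calls.
// PollBudget and its methods are hypothetical names, not part of any Kafka API.
public class PollBudget {

    /** Worst-case time spent processing the records returned by one poll(). */
    public static long worstCaseBatchMillis(int maxPollRecords, long perRecordMillis) {
        return (long) maxPollRecords * perRecordMillis;
    }

    /** True if the batch finishes strictly before the given limit expires. */
    public static boolean fitsWithin(long batchMillis, long limitMillis) {
        return batchMillis < limitMillis;
    }

    public static void main(String[] args) {
        // Later settings from the thread: 5 records per poll, up to 30 s each,
        // session.timeout.ms = 5 minutes, max.poll.interval.ms = 10 minutes.
        long tunedBatch = worstCaseBatchMillis(5, 30_000L);
        System.out.println("tuned batch ms: " + tunedBatch);
        System.out.println("under session.timeout.ms: " + fitsWithin(tunedBatch, 5 * 60_000L));
        System.out.println("under max.poll.interval.ms: " + fitsWithin(tunedBatch, 10 * 60_000L));

        // Assumed combination for comparison: the originally posted settings
        // (20 records per poll, session.timeout.ms = 60000) with the 20 s estimate.
        long originalBatch = worstCaseBatchMillis(20, 20_000L);
        System.out.println("original batch ms: " + originalBatch);
        System.out.println("under session.timeout.ms: " + fitsWithin(originalBatch, 60_000L));
    }
}
```

Which limit applies depends on the client library version, which the thread does not state. In clients that predate KIP-62 (0.10.0.x, matching the javadoc quoted above), heartbeats are sent only from poll(), so the gap between polls has to stay under session.timeout.ms. From 0.10.1.0 on, a background thread sends heartbeats and max.poll.interval.ms bounds the gap between polls instead.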