Ramkumar created KAFKA-6745:
-------------------------------

             Summary: kafka consumer rebalancing takes long time (from 3 secs to 5 minutes)
                 Key: KAFKA-6745
                 URL: https://issues.apache.org/jira/browse/KAFKA-6745
             Project: Kafka
          Issue Type: Improvement
          Components: clients, core
    Affects Versions: 0.11.0.0
            Reporter: Ramkumar

Hi,

We had an HTTP service of 3 nodes around Kafka 0.8. This HTTP service acts as a REST API so that publishers and consumers can use the middleware instead of the Kafka client API. With that setup, consumer rebalancing was not a major issue.

We wanted to upgrade to Kafka 0.11, so we updated our HTTP service (3-node cluster) to use the new Kafka consumer API. Now rebalancing of consumers (multiple consumers under the same group) takes anywhere between 3 seconds and 5 minutes (max.poll.interval.ms). Because of this delay, our HTTP clients time out and fail over. This rebalancing time is a major issue.

It is not clear from the documentation whether the rebalance for the group takes place only after max.poll.interval.ms, or whether it starts after 3 seconds and can complete at any time within 5 minutes. We tried reducing max.poll.interval.ms to 15 seconds, but this also triggers rebalances internally.

Below are the other parameters we have set in our service:

max.poll.interval.ms = 30 seconds
heartbeat.interval.ms = 1 minute
session.timeout.ms = 4 minutes
consumer.cache.timeout = 2 minutes

Below is the log:

2018-03-26 12:53:23,009 [qtp1404928347-11556] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - (Re-)joining group firstnetportal_001
2018-03-26 12:57:52,793 [qtp1404928347-11556] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Successfully joined group firstnetportal_001 with generation 7475

Please let me know if any other application or client uses an HTTP interface across 3 nodes without having this issue.
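
For reference, the timing settings listed above can be written out in milliseconds and sanity-checked against each other. The following is only a minimal sketch, not the service's actual code: it uses plain `java.util.Properties` instead of the Kafka client library, and the class and method names (`RebalanceTimingCheck`, `reportedSettings`, `check`) are hypothetical. The first two checks follow the consumer configuration documentation (heartbeat.interval.ms must be lower than session.timeout.ms, and is typically no higher than one third of it). The third check is an assumption of this sketch rather than a documented rule: group members learn about a rebalance through heartbeat responses, so a heartbeat interval longer than max.poll.interval.ms may leave a member unaware of the rebalance for the whole poll interval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class RebalanceTimingCheck {

    // Timing settings from the report, converted to milliseconds.
    // consumer.cache.timeout is left out because it is not a Kafka consumer setting.
    static Properties reportedSettings() {
        Properties props = new Properties();
        props.setProperty("max.poll.interval.ms", "30000");   // 30 seconds
        props.setProperty("heartbeat.interval.ms", "60000");  // 1 minute
        props.setProperty("session.timeout.ms", "240000");    // 4 minutes
        return props;
    }

    // Returns one warning per relationship between the settings that looks suspicious.
    static List<String> check(Properties props) {
        long maxPoll = Long.parseLong(props.getProperty("max.poll.interval.ms"));
        long heartbeat = Long.parseLong(props.getProperty("heartbeat.interval.ms"));
        long session = Long.parseLong(props.getProperty("session.timeout.ms"));

        List<String> warnings = new ArrayList<>();
        if (heartbeat >= session) {
            // Documented constraint.
            warnings.add("heartbeat.interval.ms must be lower than session.timeout.ms");
        } else if (heartbeat * 3 > session) {
            // Documented recommendation.
            warnings.add("heartbeat.interval.ms is usually at most 1/3 of session.timeout.ms");
        }
        if (heartbeat > maxPoll) {
            // Heuristic assumed by this sketch, not a documented rule.
            warnings.add("heartbeat.interval.ms is larger than max.poll.interval.ms");
        }
        return warnings;
    }

    public static void main(String[] args) {
        for (String warning : check(reportedSettings())) {
            System.out.println("WARN: " + warning);
        }
    }
}
```

With the reported values, the two documented checks pass and only the heuristic check flags the 1-minute heartbeat interval against the 30-second max.poll.interval.ms.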