Hi
I have a large, stateful Kafka Streams application that is stuck in a
never-ending rebalance loop.
I can see that task restorations take a long time (roughly 30-45 minutes), and
after that I see the error below.
It is followed by tasks being suspended, the instance rejoining the
group, and a new rebalance being triggered.
Any ideas on how to fix this?

WARN org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [inventory-streams-green-0-StreamThread-1] Detected that the thread is being fenced. This implies that this thread missed a rebalance and dropped out of the consumer group. Will close out all assigned tasks and rejoin the consumer group.
org.apache.kafka.streams.errors.TaskMigratedException: Consumer committing offsets failed, indicating the corresponding thread is no longer part of the group; it means all tasks belonging to this thread should be migrated.
    at org.apache.kafka.streams.processor.internals.TaskManager.commitOffsetsOrTransaction(TaskManager.java:1141) ~[app.jar:?]
    at org.apache.kafka.streams.processor.internals.TaskManager.handleRevocation(TaskManager.java:541) ~[app.jar:?]
    at org.apache.kafka.streams.processor.internals.StreamsRebalanceListener.onPartitionsRevoked(StreamsRebalanceListener.java:95) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsRevoked(ConsumerCoordinator.java:312) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:408) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:449) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:365) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:508) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1261) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1230) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1210) ~[app.jar:?]
    at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:925) ~[app.jar:?]
    at org.apache.kafka.streams.processor.internals.StreamThread.pollPhase(StreamThread.java:885) ~[app.jar:?]
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:720) [app.jar:?]
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:583) [app.jar:?]
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:556) [app.jar:?]
Caused by: org.apache.kafka.clients.consumer.CommitFailedException: Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:1139) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:1004) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1490) ~[app.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1438) ~[app.jar:?]
    at org.apache.kafka.streams.processor.internals.TaskManager.commitOffsetsOrTransaction(TaskManager.java:1139) ~[app.jar:?]
    ... 15 more
