[ https://issues.apache.org/jira/browse/KAFKA-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guozhang Wang resolved KAFKA-6631.
----------------------------------
    Resolution: Fixed

Just a side note that we are working on KAFKA-7149 to reduce the assignment metadata size when the assignment contains many topic partitions.

> Kafka Streams - Rebalancing exception in Kafka 1.0.0
> ----------------------------------------------------
>
>                 Key: KAFKA-6631
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6631
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.0.0
>         Environment: Container Linux by CoreOS 1576.5.0
>            Reporter: Alexander Ivanichev
>            Priority: Critical
>
>
> In Kafka Streams 1.0.0 we saw a strange rebalance error. Our streams app
> performs window-based aggregations. Sometimes on start, when all stream
> workers join, the app just crashes. If we enable only one worker, it works
> fine; sometimes two workers also work fine, but when a third joins, the app
> crashes again. This looks like a critical issue with rebalancing.
> {code:java}
> 018-03-08T18:51:01.226243000Z org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The server experienced an unexpected error when processing the request
> 2018-03-08T18:51:01.226557000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:566)
> 2018-03-08T18:51:01.226860000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:539)
> 2018-03-08T18:51:01.227328000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:808)
> 2018-03-08T18:51:01.227630000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:788)
> 2018-03-08T18:51:01.228152000Z at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
> 2018-03-08T18:51:01.228449000Z at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
> 2018-03-08T18:51:01.228897000Z at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
> 2018-03-08T18:51:01.229196000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:506)
> 2018-03-08T18:51:01.229673000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
> 2018-03-08T18:51:01.229971000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:268)
> 2018-03-08T18:51:01.230436000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:214)
> 2018-03-08T18:51:01.230749000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:174)
> 2018-03-08T18:51:01.231065000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:364)
> 2018-03-08T18:51:01.231584000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
> 2018-03-08T18:51:01.231911000Z at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
> 2018-03-08T18:51:01.232190000Z at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1138)
> 2018-03-08T18:51:01.232643000Z at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1103)
> 2018-03-08T18:51:01.233121000Z at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
> 2018-03-08T18:51:01.233409000Z at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
> 2018-03-08T18:51:01.233720000Z at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
> 2018-03-08T18:51:01.234196000Z at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
> 2018-03-08T18:51:01.234655000Z org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The server experienced an unexpected error when processing the request
> 2018-03-08T18:51:01.234972000Z exception in thread, closing process
> 2018-03-08T18:51:01.235500000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:566)
> 2018-03-08T18:51:01.235839000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:539)
> 2018-03-08T18:51:01.236336000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:808)
> 2018-03-08T18:51:01.236603000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:788)
> 2018-03-08T18:51:01.236889000Z at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
> 2018-03-08T18:51:01.237092000Z at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
> 2018-03-08T18:51:01.237531000Z at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
> 2018-03-08T18:51:01.237816000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:506)
> 2018-03-08T18:51:01.238097000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
> 2018-03-08T18:51:01.238395000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:268)
> 2018-03-08T18:51:01.238698000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:214)
> 2018-03-08T18:51:01.239511000Z exception in thread, closing process
> 2018-03-08T18:51:01.239880000Z exception in thread, closing process
> 2018-03-08T18:51:01.240175000Z at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:174)
> 2018-03-08T18:51:01.240443000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:364)
> 2018-03-08T18:51:01.240764000Z at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
> 2018-03-08T18:51:01.241083000Z at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
> 2018-03-08T18:51:01.241367000Z at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1138)
> 2018-03-08T18:51:01.241789000Z at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1103)
> 2018-03-08T18:51:01.242075000Z at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
> 2018-03-08T18:51:01.242351000Z at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
> 2018-03-08T18:51:01.242641000Z at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
> 2018-03-08T18:51:01.243051000Z at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
> {code}
> Looking further at the brokers, I saw another exception:
> {code:java}
> Appending metadata message for group AnomalyKafkaStreams generation 12 failed due to org.apache.kafka.common.errors.RecordTooLargeException, returning UNKNOWN error code to the client (kafka.coordinator.group.GroupMetadataManager)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)