Andrew Olson created KAFKA-4599:
-----------------------------------

             Summary: KafkaConsumer encounters SchemaException when Kafka broker stopped
                 Key: KAFKA-4599
                 URL: https://issues.apache.org/jira/browse/KAFKA-4599
             Project: Kafka
          Issue Type: Bug
          Components: consumer
            Reporter: Andrew Olson

We recently observed an issue in production that apparently occurs a small percentage of the time when a Kafka broker is stopped. We're using version 0.9.0.1 for all brokers and clients.

During a recent episode, 3 KafkaConsumer instances (out of approximately 100) ran into the following SchemaException within a few seconds of instructing the broker to shut down.

{noformat}
2017-01-04 14:46:19 org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'responses': Error reading array of size 2774863, only 62 bytes available
	at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:71)
	at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:439)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:320)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:213)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:193)
	at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:908)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:853)
{noformat}

The exception message was slightly different for one consumer:
{{Error reading field 'responses': Error reading array of size 2774863, only 260 bytes available}}

The exception was not caught and caused the Storm Executor thread to restart, so it's not clear whether it would have been transient or fatal for the KafkaConsumer.

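For readers unfamiliar with the message format, the wording of the exception suggests a sanity check on a length-prefixed array: the parser reads a 4-byte count and refuses to continue if that count is larger than the number of bytes left in the response buffer. The following is only a minimal sketch of that kind of check, written on the assumption that the array is prefixed by a big-endian 32-bit count; it is not the Kafka client's code, and {{ArraySizeCheck}}, {{ParseException}} and {{readArraySize}} are hypothetical names used for illustration.

```java
import java.nio.ByteBuffer;

// Illustration only: hypothetical names, not part of the Kafka client.
class ArraySizeCheck {

    // Hypothetical stand-in for the client's SchemaException.
    static class ParseException extends RuntimeException {
        ParseException(String message) {
            super(message);
        }
    }

    // Reads the 4-byte big-endian count that is assumed to prefix an array
    // and rejects it when it exceeds the bytes remaining in the buffer.
    static int readArraySize(ByteBuffer buffer) {
        int size = buffer.getInt();
        if (size > buffer.remaining()) {
            throw new ParseException("Error reading array of size " + size
                    + ", only " + buffer.remaining() + " bytes available");
        }
        return size;
    }

    public static void main(String[] args) {
        // A count of 2774863 followed by only 62 bytes of payload,
        // mirroring the numbers in the reported message.
        ByteBuffer buffer = ByteBuffer.allocate(4 + 62);
        buffer.putInt(2774863);
        buffer.position(0); // rewind; the limit stays at the capacity (66)
        try {
            readArraySize(buffer);
        } catch (ParseException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, a count of 2774863 against 62 or 260 available bytes would mean the bytes at that position were not a valid array count for the schema being applied. The sketch does not show why that happened.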
Here are the initial broker shutdown logs:

{noformat}
2017-01-04 14:46:15,869 INFO kafka.server.KafkaServer: [Kafka Server 4], shutting down
2017-01-04 14:46:16,298 INFO kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-1-40], Shutting down
2017-01-04 14:46:18,364 INFO kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-1-40], Stopped
2017-01-04 14:46:18,364 INFO kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-1-40], Shutdown completed
2017-01-04 14:46:18,612 INFO kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-3-30], Shutting down
2017-01-04 14:46:19,547 INFO kafka.server.KafkaServer: [Kafka Server 4], Controlled shutdown succeeded
2017-01-04 14:46:19,554 INFO kafka.network.SocketServer: [Socket Server on Broker 4], Shutting down
2017-01-04 14:46:19,593 INFO kafka.network.SocketServer: [Socket Server on Broker 4], Shutdown completed
{noformat}

We've found one very similar reported occurrence:
http://mail-archives.apache.org/mod_mbox/kafka-users/201605.mbox/%3CCAGnq0kFPm%2Bd0Xdm4tY_O7MnV3_LqLU10uDhPwxzv-T7UnHy08g%40mail.gmail.com%3E

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)