[
https://issues.apache.org/jira/browse/KAFKA-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Gustafson resolved KAFKA-9839.
------------------------------------
Fix Version/s: 2.5.1
Resolution: Fixed
> IllegalStateException on metadata update when broker learns about its new
> epoch after the controller
> ----------------------------------------------------------------------------------------------------
>
> Key: KAFKA-9839
> URL: https://issues.apache.org/jira/browse/KAFKA-9839
> Project: Kafka
> Issue Type: Bug
> Components: controller, core
> Affects Versions: 2.3.1
> Reporter: Anna Povzner
> Assignee: Anna Povzner
> Priority: Critical
> Fix For: 2.5.1
>
>
> Broker throws "java.lang.IllegalStateException: Epoch XXX larger than current
> broker epoch YYY" on UPDATE_METADATA when the controller learns about the
> broker epoch and sends UPDATE_METADATA before KafkaZkClient.registerBroker
> completes (that is, before the broker itself learns about its new epoch).
> Here is the scenario we observed in more detail:
> 1. ZK session expires on broker 1
> 2. Broker 1 establishes a new session to ZK and creates its znode
> 3. Controller learns about broker 1 and assigns epoch
> 4. Broker 1 receives UPDATE_METADATA from controller, but it does not know
> about its new epoch yet, so we get an exception:
> ERROR [KafkaApi-3] Error when handling request: clientId=1, correlationId=0,
> api=UPDATE_METADATA, body={
> .........
> java.lang.IllegalStateException: Epoch XXX larger than current broker epoch YYY
>     at kafka.server.KafkaApis.isBrokerEpochStale(KafkaApis.scala:2725)
>     at kafka.server.KafkaApis.handleUpdateMetadataRequest(KafkaApis.scala:320)
>     at kafka.server.KafkaApis.handle(KafkaApis.scala:139)
>     at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
>     at java.lang.Thread.run(Thread.java:748)
> 5. KafkaZkClient.registerBroker completes on broker 1: "INFO Stat of the
> created znode at /brokers/ids/1"
> As a result, the broker has stale metadata for some time.
> Possible solutions:
> 1. Broker returns a more specific error and the controller retries UPDATE_METADATA
> 2. Broker accepts UPDATE_METADATA with larger broker epoch.
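
For illustration only, the difference between the failing check and possible
solution 2 can be sketched as follows. This is a minimal Java sketch, not the
code in KafkaApis.scala: the class BrokerEpochCheck and the methods
isStaleStrict and isStaleLenient are hypothetical names, and the strict
variant is only modeled on the exception message quoted in the report.

```java
// Hypothetical sketch; names are illustrative and not taken from Kafka.
final class BrokerEpochCheck {

    // Strict variant, modeled on the reported behavior: a request epoch
    // newer than the locally known broker epoch is treated as illegal state.
    static boolean isStaleStrict(long requestEpoch, long currentEpoch) {
        if (requestEpoch < currentEpoch) {
            return true;   // request was sent for an older broker registration
        }
        if (requestEpoch == currentEpoch) {
            return false;  // request matches the current registration
        }
        throw new IllegalStateException("Epoch " + requestEpoch
                + " larger than current broker epoch " + currentEpoch);
    }

    // Lenient variant (possible solution 2): only an older epoch is stale.
    // A larger epoch is accepted, on the assumption that the controller
    // learned about the new registration before the broker did.
    static boolean isStaleLenient(long requestEpoch, long currentEpoch) {
        return requestEpoch < currentEpoch;
    }
}
```

With a locally known epoch of 5 and a request carrying epoch 7, isStaleStrict
throws IllegalStateException, while isStaleLenient returns false and the
request would be processed.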