[ https://issues.apache.org/jira/browse/KAFKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074744#comment-14074744 ]
Oleg Lvovitch commented on KAFKA-1557:
--------------------------------------

Thanks [~jjkoshy], yes, this does seem to be the same issue.
A quick follow up before we can close this as a dup:
- Is there any impact on the cluster health? After all, internal broker metadata is not correct
- Is there an ETA on getting this fixed?

> ISR reported by TopicMetadataResponse most of the time doesn't match the Zookeeper information (and the truth)
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1557
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1557
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller, core, replication
>    Affects Versions: 0.8.0, 0.8.1
>         Environment: OSX 10.9.3, Linux Scientific 6.5
> It actually doesn't seem to matter and appears to be OS-agnostic
>            Reporter: Oleg Lvovitch
>            Assignee: Neha Narkhede
>              Labels: newbie++
>             Fix For: 0.8.1.1, 0.8.2
>
>         Attachments: BrokenKafkaLink.scala, server1.properties, server2.properties
>
>
> TL;DR - after a topic is created, and at least one broker in the ISR is restarted, the ISR reported by the TopicMetadataResponse is incorrect.
> Specific steps to repro:
> - Download 0.8.1 Kafka
> - Copy server.properties twice into server1.properties and server2.properties (attached) - basically just ports and log paths changed to allow brokers to co-exist
> - Start zookeeper using "sh bin/zookeeper-server-start.sh config/zookeeper.properties"
> - Start broker1: "sh bin/kafka-server-start.sh config/server1.properties"
> - Start broker2: "sh bin/kafka-server-start.sh config/server2.properties"
> - Create a new topic: "sh bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic test --replication-factor 2 --partitions 3"
> - Examine topic state: "sh bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic test" - note that all ISRs are of length 2
> - Run the attached Scala code that uses TopicMetadataRequest to examine topic state. Observe that all ISRs are of length 2 and match the information output by the script
> - Shut down broker2 (simply hit Ctrl-C in the terminal), wait 5-10 seconds
> - Restart broker2 using the original command
> - Check the status of the topic again. Observe that the leader for all partitions is 0 (as expected), and all ISRs contain both brokers (as expected)
> - Run the attached Scala snippet again.
> EXPECTED:
> - All ISRs are of length 2
> ACTUAL:
> - ALL ISRs contain just broker 0
> NOTE: depending on how long broker2 was down, sometimes some ISRs will contain the full list, but shutting it down for 15+ secs seems to always yield a consistent repro
> Basically it appears that brokers have incorrect ISR information in the metadata cache.
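
Editorial aside, not part of the original report: the "all ISRs are of length 2" check in the steps above can be automated with a small filter over the describe output. This is only a sketch. The function name `short_isrs` is hypothetical, and it assumes each partition line carries whitespace-separated `Partition: <id>` and `Isr: <comma-separated broker ids>` labels; adjust the field names if your Kafka version prints something different.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Kafka): reads the output of
# "kafka-topics.sh --describe" on stdin and prints every partition whose
# ISR has fewer members than the expected size given as $1.
# Assumes partition lines contain "Partition: <id>" and "Isr: <id,id,...>".
short_isrs() {
  awk -v want="$1" '
    /Partition:/ && /Isr:/ {
      part = ""; isr = ""
      # Walk the fields and pick up the value following each label.
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Isr:")       isr  = $(i + 1)
      }
      # An empty ISR yields n == 0, so it is reported as short too.
      n = split(isr, members, ",")
      if (n < want) printf "partition %s: isr=%s\n", part, isr
    }'
}
```

It could be used as `sh bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic test | short_isrs 2`; no output means every partition reports a full ISR. Note that this only inspects the ZooKeeper-backed view printed by the script, so it complements rather than replaces the TopicMetadataRequest check.
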
> Our production servers exhibit the same problem - after a topic gets created everything looks fine, but as brokers get restarted, ISR reported by the brokers is wrong, whereas the one in ZK appears to report the truth (it shrinks as brokers get shut down and grows back up after they get restarted)
> I'm not sure if this has wider impact on the functioning of the cluster - bad metadata information is bad - but so far there has been no evidence of that



--
This message was sent by Atlassian JIRA
(v6.2#6252)