We are running an external (as in unsupported) C++ client library against 0.8.2-rc2 and see differences in the Isr vector in the Metadata Response compared to what ./kafka-topics.sh --describe returns.
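A minimal C++ sketch of how the two views could be compared on the client side while ignoring the order in which broker ids are listed. The function names are hypothetical and not part of any Kafka client library; the inputs are assumed to be the Replicas and Isr arrays already decoded from a Metadata Response.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper, not part of any Kafka client API.
// Returns the broker ids that appear in `replicas` but not in `isr`,
// ignoring the order in which either list was reported.
std::vector<int> missing_from_isr(std::vector<int> replicas, std::vector<int> isr) {
    // set_difference requires sorted input, so sort the local copies first.
    std::sort(replicas.begin(), replicas.end());
    std::sort(isr.begin(), isr.end());
    std::vector<int> missing;
    std::set_difference(replicas.begin(), replicas.end(),
                        isr.begin(), isr.end(),
                        std::back_inserter(missing));
    return missing;
}

// Hypothetical helper: true when two Isr lists name the same brokers,
// regardless of order (e.g. "2,1,3" and "1,3,2").
bool same_isr(std::vector<int> a, std::vector<int> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}
```

For example, missing_from_isr({1, 2, 3}, {2, 1}) yields {3}, which matches a partition whose replicas are 1, 2, 3 but whose reported Isr is only 2, 1.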
We have a triple-replicated topic that is not updated during the test.

kafka-topics.sh returns:

Topic: saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 2,1,3
Topic: saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3

After some debugging of the received packet, it seems the data is actually missing from the server. After a sequential restart of each broker, everything was back to normal.

Log output (two pairs of loglines every 10s):

initial state:
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,

restart broker 1
handle_connect_retry_timer _connect_async_next z8r102-mc12-4-4.sth-tc2.videoplaza.net:9092
saka.test.int_datastream Partition: 1 Leader: 2 Replicas: 1, 2, 3, Isr: 2, 3,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 2 Replicas: 1, 2, 3, Isr: 2, 3,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
...
saka.test.int_datastream Partition: 1 Leader: 2 Replicas: 1, 2, 3, Isr: 2, 3,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 2 Replicas: 1, 2, 3, Isr: 2, 3,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 3, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 3,

restart broker 3
known brokers changed {.... }
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 2 Replicas: 1, 2, Isr: 2, 1,
known brokers changed { .... }
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 2 Replicas: 3, 1, 2, Isr: 2, 1,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 2 Replicas: 3, 1, 2, Isr: 2, 1,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 2 Replicas: 3, 1, 2, Isr: 2, 1,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,

restart broker 2
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,

All this time kafka-topics.sh returns (except for a very short time during the restart):

Topic: saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 2,1,3
Topic: saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3

This seems reproducible by shutting down all brokers at the same time. Then the Isr vectors will never "heal". Bumping broker by broker heals them again.

/svante