I was going to ask you to do that :) As long as more than one replica is in sync, Kafka handles this nicely and recreates everything on the restarted broker.
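Since you want to keep an eye on it, here is a rough sketch (Python, not something that ships with Kafka) that reads the text printed by `kafka-topics.sh --describe` and lists the partitions whose ISR is smaller than the replica set. The function name `under_replicated` and the regular expression are my own assumptions, based only on the output layout you pasted for 2.3.0; other versions may print the rows differently.

```python
import re

# One partition row of `kafka-topics.sh --describe` output, as pasted in this
# thread (Kafka 2.3.0). The layout is an assumption and may differ elsewhere.
ROW = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def under_replicated(describe_output):
    """Return {partition: [missing broker ids]} for partitions with an incomplete ISR."""
    # Mail clients wrap long rows and add "> " quote markers, so collapse
    # every run of ">" and whitespace into a single space before matching.
    text = re.sub(r"[>\s]+", " ", describe_output)
    result = {}
    for match in ROW.finditer(text):
        partition = int(match.group(1))
        replicas = {int(b) for b in match.group(3).split(",")}
        isr = {int(b) for b in match.group(4).split(",") if b}
        missing = replicas - isr
        if missing:
            result[partition] = sorted(missing)
    return result


sample = """
Topic: my_topic Partition: 11 Leader: 1 Replicas: 1,3,2 Isr: 1
Topic: my_topic Partition: 12 Leader: 1 Replicas: 2,3,1 Isr: 1,2,3
"""
print(under_replicated(sample))  # {11: [2, 3]}
```

You could pipe the describe output into something like this from cron and alert when the result is non-empty. If you only need the raw rows, `kafka-topics.sh --describe` also accepts `--under-replicated-partitions`, which filters the listing on the Kafka side.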
I am curious: do you remember manipulating anything before all this started, e.g. using some topic admin command (or something along those lines)?

On Tue, 1 Oct 2019 at 02:00, Sebastian Schmitz <sebastian.schm...@propellerhead.co.nz> wrote:

> I deleted the topic now and with topic-auto-create enabled it was
> immediately recreated and all is in sync again.
>
> Will keep an eye on this to see if it happens again....
>
>
> On 30-Sep-19 3:12 PM, Sebastian Schmitz wrote:
> > Hello again,
> >
> > after like 15 minutes I have now this result:
> >
> > root@kafka_node_1:/opt/kafka_2.12-2.3.0/bin# ./kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --zookeeper node1:2181 --reassignment-json-file move2.json --verify
> > Status of partition reassignment:
> > Reassignment of partition my_topic-7 completed successfully
> > Reassignment of partition my_topic-14 completed successfully
> > Reassignment of partition my_topic-8 completed successfully
> > Reassignment of partition my_topic-4 completed successfully
> > Reassignment of partition my_topic-3 completed successfully
> > Reassignment of partition my_topic-13 completed successfully
> > Reassignment of partition my_topic-1 completed successfully
> > Reassignment of partition my_topic-15 completed successfully
> > Reassignment of partition my_topic-6 completed successfully
> > Reassignment of partition my_topic-11 completed successfully
> > Reassignment of partition my_topic-0 completed successfully
> > Reassignment of partition my_topic-12 completed successfully
> > Reassignment of partition my_topic-10 completed successfully
> > Reassignment of partition my_topic-2 completed successfully
> > Reassignment of partition my_topic-9 completed successfully
> > Reassignment of partition my_topic-5 completed successfully
> >
> > root@kafka_node_1:/opt/kafka_2.12-2.3.0/bin# ./kafka-topics.sh --bootstrap-server localhost:9092 --topic my_topic --describe
> > Topic:my_topic PartitionCount:16 ReplicationFactor:3 Configs:segment.bytes=1073741824,message.format.version=2.3-IV1,retention.bytes=1073741824
> > Topic: my_topic Partition: 0 Leader: 1 Replicas: 2,3,1 Isr: 1
> > Topic: my_topic Partition: 1 Leader: 1 Replicas: 3,1,2 Isr: 1
> > Topic: my_topic Partition: 2 Leader: 1 Replicas: 1,2,3 Isr: 1
> > Topic: my_topic Partition: 3 Leader: 1 Replicas: 2,1,3 Isr: 1
> > Topic: my_topic Partition: 4 Leader: 1 Replicas: 3,2,1 Isr: 1
> > Topic: my_topic Partition: 5 Leader: 1 Replicas: 1,3,2 Isr: 1
> > Topic: my_topic Partition: 6 Leader: 1 Replicas: 2,3,1 Isr: 1
> > Topic: my_topic Partition: 7 Leader: 1 Replicas: 3,1,2 Isr: 1
> > Topic: my_topic Partition: 8 Leader: 1 Replicas: 1,2,3 Isr: 1
> > Topic: my_topic Partition: 9 Leader: 1 Replicas: 2,1,3 Isr: 1
> > Topic: my_topic Partition: 10 Leader: 1 Replicas: 3,2,1 Isr: 1
> > Topic: my_topic Partition: 11 Leader: 1 Replicas: 1,3,2 Isr: 1
> > Topic: my_topic Partition: 12 Leader: 1 Replicas: 2,3,1 Isr: 1,2,3
> > Topic: my_topic Partition: 13 Leader: 3 Replicas: 3,1,2 Isr: 1,2,3
> > Topic: my_topic Partition: 14 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
> > Topic: my_topic Partition: 15 Leader: 1 Replicas: 2,1,3 Isr: 1,2,3
> >
> > It looks like it didn't help, and the reassignment also caused some
> > disconnects from Zookeeper on all nodes, which triggered some alerts in
> > my monitoring.
> >
> > I also checked the logs and found that those partitions had their last
> > activity some days ago, and the last action for each of them was a
> > roll to a new log segment. That was logged on all three nodes, and
> > then it seems only node1 remained in the ISR... It also didn't happen
> > for all missing partitions at the same time. Partition 5 stopped on
> > the 27th, partition 8 stopped on the 25th... And so far only one topic is
> > affected.
> >
> > Thanks
> >
> > Sebastian
> >
> >
> > On 30-Sep-19 2:56 PM, Sebastian Schmitz wrote:
> >> Hello,
> >>
> >> I just ran kafka-reassign-partitions with --generate to create
> >> the json and then with --execute to run it.
> >> Now when checking with --verify I can see that the 4 partitions (it
> >> has now changed from only one partition not having all replicas in ISR
> >> to 12 not having all in ISR) are successful, but the others are still in
> >> progress.... That status remains:
> >>
> >> root@kafka_node_1:/opt/kafka_2.12-2.3.0/bin# ./kafka-topics.sh --bootstrap-server localhost:9092 --topic my_topic --describe
> >> Topic:my_topic PartitionCount:16 ReplicationFactor:3 Configs:segment.bytes=1073741824,message.format.version=2.3-IV1,retention.bytes=1073741824
> >> Topic: my_topic Partition: 0 Leader: 1 Replicas: 2,3,1 Isr: 1
> >> Topic: my_topic Partition: 1 Leader: 1 Replicas: 3,1,2 Isr: 1
> >> Topic: my_topic Partition: 2 Leader: 1 Replicas: 1,2,3 Isr: 1
> >> Topic: my_topic Partition: 3 Leader: 1 Replicas: 2,1,3 Isr: 1
> >> Topic: my_topic Partition: 4 Leader: 1 Replicas: 3,2,1 Isr: 1
> >> Topic: my_topic Partition: 5 Leader: 1 Replicas: 1,3,2 Isr: 1
> >> Topic: my_topic Partition: 6 Leader: 1 Replicas: 2,3,1 Isr: 1
> >> Topic: my_topic Partition: 7 Leader: 1 Replicas: 3,1,2 Isr: 1
> >> Topic: my_topic Partition: 8 Leader: 1 Replicas: 1,2,3 Isr: 1
> >> Topic: my_topic Partition: 9 Leader: 1 Replicas: 2,1,3 Isr: 1
> >> Topic: my_topic Partition: 10 Leader: 1 Replicas: 3,2,1 Isr: 1
> >> Topic: my_topic Partition: 11 Leader: 1 Replicas: 1,3,2 Isr: 1
> >> Topic: my_topic Partition: 12 Leader: 1 Replicas: 2,3,1 Isr: 1,3,2
> >> Topic: my_topic Partition: 13 Leader: 2 Replicas: 3,1,2 Isr: 1,3,2
> >> Topic: my_topic Partition: 14 Leader: 3 Replicas: 1,2,3 Isr: 1,3,2
> >> Topic: my_topic Partition: 15 Leader: 1 Replicas: 2,1,3 Isr: 1,3,2
> >>
> >> root@kafka_node_1:/opt/kafka_2.12-2.3.0/bin# ./kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --zookeeper atazkafkp01.aucklandtransport.govt.nz:2181 --reassignment-json-file move2.json --execute
> >> Current partition replica assignment
> >>
> >> {"version":1,"partitions":[{"topic":"my_topic","partition":7,"replicas":[2,3,1],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":14,"replicas":[3,2,1],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":8,"replicas":[3,1,2],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":4,"replicas":[2,1,3],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":3,"replicas":[1,3,2],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":13,"replicas":[2,1,3],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":1,"replicas":[2,3,1],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":15,"replicas":[1,2,3],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":6,"replicas":[1,2,3],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":11,"replicas":[3,1,2],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":0,"replicas":[1,2,3],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":12,"replicas":[1,3,2],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":10,"replicas":[2,3,1],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":2,"replicas":[3,1,2],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":9,"replicas":[1,3,2],"log_dirs":["any","any","any"]},{"topic":"my_topic","partition":5,"replicas":[3,2,1],"log_dirs":["any","any","any"]}]}
> >>
> >> Save this to use as the --reassignment-json-file option during rollback
> >> Successfully started reassignment of partitions.
> >>
> >> root@kafka_node_1:/opt/kafka_2.12-2.3.0/bin# ./kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --zookeeper atazkafkp01.aucklandtransport.govt.nz:2181 --reassignment-json-file move2.json --verify
> >> Status of partition reassignment:
> >> Reassignment of partition my_topic-7 is still in progress
> >> Reassignment of partition my_topic-14 completed successfully
> >> Reassignment of partition my_topic-8 is still in progress
> >> Reassignment of partition my_topic-4 is still in progress
> >> Reassignment of partition my_topic-3 is still in progress
> >> Reassignment of partition my_topic-13 completed successfully
> >> Reassignment of partition my_topic-1 is still in progress
> >> Reassignment of partition my_topic-15 completed successfully
> >> Reassignment of partition my_topic-6 is still in progress
> >> Reassignment of partition my_topic-11 is still in progress
> >> Reassignment of partition my_topic-0 is still in progress
> >> Reassignment of partition my_topic-12 completed successfully
> >> Reassignment of partition my_topic-10 is still in progress
> >> Reassignment of partition my_topic-2 is still in progress
> >> Reassignment of partition my_topic-9 is still in progress
> >> Reassignment of partition my_topic-5 is still in progress
> >>
> >> root@kafka_node_1:/opt/kafka_2.12-2.3.0/bin# ./kafka-topics.sh --bootstrap-server localhost:9092 --topic my_topic --describe
> >> Topic:my_topic PartitionCount:16 ReplicationFactor:3 Configs:segment.bytes=1073741824,message.format.version=2.3-IV1,retention.bytes=1073741824
> >> Topic: my_topic Partition: 0 Leader: 1 Replicas: 2,3,1 Isr: 1
> >> Topic: my_topic Partition: 1 Leader: 1 Replicas: 3,1,2 Isr: 1
> >> Topic: my_topic Partition: 2 Leader: 1 Replicas: 1,2,3 Isr: 1
> >> Topic: my_topic Partition: 3 Leader: 1 Replicas: 2,1,3 Isr: 1
> >> Topic: my_topic Partition: 4 Leader: 1 Replicas: 3,2,1 Isr: 1
> >> Topic: my_topic Partition: 5 Leader: 1 Replicas: 1,3,2 Isr: 1
> >> Topic: my_topic Partition: 6 Leader: 1 Replicas: 2,3,1 Isr: 1
> >> Topic: my_topic Partition: 7 Leader: 1 Replicas: 3,1,2 Isr: 1
> >> Topic: my_topic Partition: 8 Leader: 1 Replicas: 1,2,3 Isr: 1
> >> Topic: my_topic Partition: 9 Leader: 1 Replicas: 2,1,3 Isr: 1
> >> Topic: my_topic Partition: 10 Leader: 1 Replicas: 3,2,1 Isr: 1
> >> Topic: my_topic Partition: 11 Leader: 1 Replicas: 1,3,2 Isr: 1
> >> Topic: my_topic Partition: 12 Leader: 1 Replicas: 2,3,1 Isr: 1,3,2
> >> Topic: my_topic Partition: 13 Leader: 2 Replicas: 3,1,2 Isr: 1,3,2
> >> Topic: my_topic Partition: 14 Leader: 3 Replicas: 1,2,3 Isr: 1,3,2
> >> Topic: my_topic Partition: 15 Leader: 1 Replicas: 2,1,3 Isr: 1,3,2
> >>
> >> root@kafka_node_1:/opt/kafka_2.12-2.3.0/bin# ./kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --zookeeper atazkafkp01.aucklandtransport.govt.nz:2181 --reassignment-json-file move2.json --verify
> >> Status of partition reassignment:
> >> Reassignment of partition my_topic-7 is still in progress
> >> Reassignment of partition my_topic-14 completed successfully
> >> Reassignment of partition my_topic-8 is still in progress
> >> Reassignment of partition my_topic-4 is still in progress
> >> Reassignment of partition my_topic-3 is still in progress
> >> Reassignment of partition my_topic-13 completed successfully
> >> Reassignment of partition my_topic-1 is still in progress
> >> Reassignment of partition my_topic-15 completed successfully
> >> Reassignment of partition my_topic-6 is still in progress
> >> Reassignment of partition my_topic-11 is still in progress
> >> Reassignment of partition my_topic-0 is still in progress
> >> Reassignment of partition my_topic-12 completed successfully
> >> Reassignment of partition my_topic-10 is still in progress
> >> Reassignment of partition my_topic-2 is still in progress
> >> Reassignment of partition my_topic-9 is still in progress
> >> Reassignment of partition my_topic-5 is still in progress
> >>
> >> I also checked Zookeeper for active brokers:
> >>
> >> root@kafka_node_1:/opt/kafka_2.12-2.3.0/bin# ./zookeeper-shell.sh node1:2181 ls /brokers/ids
> >> Connecting to node1:2181
> >>
> >> WATCHER::
> >>
> >> WatchedEvent state:SyncConnected type:None path:null
> >> [1, 2, 3]
> >>
> >> What's next?
> >>
> >> Thanks
> >>
> >> Sebastian
> >>
> >>
> >> On 26-Sep-19 10:04 PM, M. Manna wrote:
> >>> Hello,
> >>>
> >>> Could you please try to run kafka-reassign-partitions with your topic
> >>> reassignment JSON? That doesn't require any restart, and should tell
> >>> you if there are any issues with the reassignment. The examples are
> >>> provided in the Confluence wiki.
> >>>
> >>> I would recommend that you do a "Describe" on your topic to ensure
> >>> that all partitions and ISR metadata are up to date.
> >>>
> >>> Thanks,
> >>>
> >>>
> >>>
> >>> On Thu, 26 Sep 2019 at 03:28, Sebastian Schmitz <sebastian.schm...@propellerhead.co.nz> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> I have one topic with 12 partitions where partition 0 is
> >>>> missing one node from the ISR... Is there a way I can get it back
> >>>> to work again without having to do some weird stuff like
> >>>> restarting the cluster?
> >>>> Because this missing node in the ISR is causing some problems for the
> >>>> consumers...
> >>>>
> >>>> Thx
> >>>>
> >>>> Sebastian
> >>>>
> >>>>
> >>>> --
> >>>> DISCLAIMER
> >>>> This email contains information that is confidential and which
> >>>> may be legally privileged. If you have received this email in error
> >>>> please notify the sender immediately and delete the email.
> >>>> This email is intended solely for the use of the intended recipient
> >>>> and you may not use or disclose this email in any way.