The followers should be trying to get back into sync on their own. Do you see any errors in the broker logs?
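You can also watch the recovery with the tooling that ships with Kafka. A sketch, assuming the Kafka scripts are on your path and your ZooKeeper ensemble is at localhost:2181 (substitute your own connect string):

```shell
# Show only the partitions of raw-events whose ISR is smaller
# than the full replica set, i.e. the ones still catching up
bin/kafka-topics.sh --describe --topic raw-events \
  --zookeeper localhost:2181 \
  --under-replicated-partitions
```

If that list shrinks over time the replicas are catching up on their own; if it stays the same, the broker logs should say why.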
Gwen
On Tue, Apr 21, 2015 at 10:15 AM, Thomas Kwan wrote:
> We have 5 kafka brokers available, and created a topic with replication
> factor of 3. After a few broker issues (e.g. went out of file descriptors),
> running kafkacat on the producer node shows the following:
>
> Command:
>
> kafkacat-CentOS-6.5-x86_64 -L -b "kafka01-east.manage.com,
> kafka02-east.manage.com,kafka03-east.manage.com,kafka04-east.manage.com,
> kafka05-east.manage.com"
>
> Output:
>
> 5 brokers:
> broker 385 at kafka04-east.manage.com:9092
> broker 389 at kafka03-east.manage.com:9092
> broker 381 at kafka01-east.manage.com:9092
> broker 387 at kafka05-east.manage.com:9092
> broker 383 at kafka02-east.manage.com:9092
> ...
> topic "raw-events" with 32 partitions:
> partition 23, leader 387, replicas: 389,387,381, isrs: 387,389
> partition 8, leader 389, replicas: 381,389,383, isrs: 389,381
> partition 17, leader 389, replicas: 383,389,381, isrs: 389,381
> partition 26, leader 387, replicas: 387,389,381, isrs: 387,389
> partition 11, leader 387, replicas: 389,387,381, isrs: 387,389
> partition 29, leader 389, replicas: 383,389,381, isrs: 389,381
> partition 20, leader 389, replicas: 381,389,383, isrs: 389,381
> partition 2, leader 387, replicas: 387,389,381, isrs: 387
> partition 5, leader 389, replicas: 383,389,381, isrs: 389,381
> partition 14, leader 387, replicas: 387,389,381, isrs: 387,389
> partition 4, leader 387, replicas: 381,387,389, isrs: 387,389
> partition 13, leader 387, replicas: 383,387,389, isrs: 387,389
> partition 22, leader 389, replicas: 387,383,389, isrs: 389,387
> partition 31, leader 387, replicas: 389,383,387, isrs: 387,389
> partition 7, leader 387, replicas: 389,383,387, isrs: 387,389
> partition 16, leader 387, replicas: 381,387,389, isrs: 387
> partition 25, leader 387, replicas: 383,387,389, isrs: 387,389
> partition 10, leader 387, replicas: 387,383,389, isrs: 387,389
> partition 1, leader 387, replicas: 383,387,389, isrs: 387,389
> partition 28, leader 387, replicas: 381,387,389, isrs: 387
> partition 19, leader 387, replicas: 389,383,387, isrs: 387,389
> partition 18, leader 387, replicas: 387,381,383, isrs: 387,381
> partition 9, leader 387, replicas: 383,381,387, isrs: 387,381
> partition 27, leader 389, replicas: 389,381,383, isrs: 389,381
> partition 12, leader 387, replicas: 381,383,387, isrs: 387,381
> partition 21, leader 387, replicas: 383,381,387, isrs: 387,381
> partition 3, leader 389, replicas: 389,381,383, isrs: 389,381
> partition 30, leader 387, replicas: 387,381,383, isrs: 387,381
> partition 15, leader 389, replicas: 389,381,383, isrs: 389,381
> partition 6, leader 387, replicas: 387,381,383, isrs: 387,381
> partition 24, leader 387, replicas: 381,383,387, isrs: 387,381
> partition 0, leader 387, replicas: 381,383,387, isrs: 387,381
>
> I notice that some partitions (partition #2, for example) only have one
> broker listed under isrs. From what I read, isrs lists the brokers whose
> replicas are in sync with the leader.
>
> My question is: now that some partitions are out of sync, what do I do to
> get them back in sync again?
>
> thanks
> thomas