[ https://issues.apache.org/jira/browse/KAFKA-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698170#comment-16698170 ]
Desmond Sindatry commented on KAFKA-7417:
-----------------------------------------

I am seeing the same issue, and it's not possible to add a new broker: out-of-sync replicas never come back into sync. This is the error in the log:

2018-11-24 22:33:18,008 INFO kafka.server.ReplicaFetcherThread: [ReplicaFetcher replicaId=99, leaderId=98, fetcherId=0] Based on follower's leader epoch, leader replied with an offset 128406503 >= the follower's log end offset 127527919 in prod-raw-events-11. No truncation needed.
2018-11-24 22:33:18,008 INFO kafka.log.Log: [Log partition=prod-raw-events-11, dir=/kafka/data/sdh] Truncating to 127527919 has no effect as the largest offset in the log is 127527918

> Some topics lost / cannot recover their ISR status following broker crash
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-7417
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7417
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 1.1.1, 2.0.0
>            Reporter: Mikhail Khomenko
>            Priority: Major
>
> Hi,
> we have run into the following issue: some replicas cannot become in-sync. Which replicas are out of the ISR appears random across topics. For instance:
> {code:java}
> $ kafka-topics --zookeeper 1.2.3.4:8181 --describe --topic TEST
> Topic:TEST PartitionCount:8 ReplicationFactor:3 Configs:
> Topic: TEST Partition: 0 Leader: 2 Replicas: 0,2,1 Isr: 0,1,2
> Topic: TEST Partition: 1 Leader: 1 Replicas: 1,0,2 Isr: 0,1,2
> Topic: TEST Partition: 2 Leader: 2 Replicas: 2,1,0 Isr: 0,1,2
> Topic: TEST Partition: 3 Leader: 2 Replicas: 0,1,2 Isr: 0,1,2
> Topic: TEST Partition: 4 Leader: 1 Replicas: 1,2,0 Isr: 0,1,2
> Topic: TEST Partition: 5 Leader: 2 Replicas: 2,0,1 Isr: 0,1,2
> Topic: TEST Partition: 6 Leader: 0 Replicas: 0,2,1 Isr: 0,1,2
> Topic: TEST Partition: 7 Leader: 0 Replicas: 1,0,2 Isr: 0,2{code}
> The files in the TEST-7 partition directory are identical (same md5sum) on all 3 brokers. They were also checked with kafka.tools.DumpLogSegments - the messages are the same.
> We have a 3-broker cluster running Confluent Kafka 5.0.0 (i.e. Apache Kafka 2.0.0).
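(For reference, the segment comparison described above can be reproduced roughly as follows. This is an illustrative sketch: the data directory comes from log.dirs in the config below, and the segment file name 00000000000000000000.log is a hypothetical example that would differ per partition.)

{code:java}
# On each broker, checksum the segment files of the affected partition
md5sum /var/lib/kafka/data/TEST-7/*.log

# Dump the messages of one segment so their contents can be compared byte for byte
kafka-run-class kafka.tools.DumpLogSegments \
  --files /var/lib/kafka/data/TEST-7/00000000000000000000.log \
  --print-data-log
{code}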
> Each broker has the following configuration:
> {code:java}
> advertised.host.name = null
> advertised.listeners = PLAINTEXT://1.2.3.4:9200
> advertised.port = null
> alter.config.policy.class.name = null
> alter.log.dirs.replication.quota.window.num = 11
> alter.log.dirs.replication.quota.window.size.seconds = 1
> authorizer.class.name =
> auto.create.topics.enable = true
> auto.leader.rebalance.enable = true
> background.threads = 10
> broker.id = 1
> broker.id.generation.enable = true
> broker.interceptor.class = class org.apache.kafka.server.interceptor.DefaultBrokerInterceptor
> broker.rack = null
> client.quota.callback.class = null
> compression.type = producer
> connections.max.idle.ms = 600000
> controlled.shutdown.enable = true
> controlled.shutdown.max.retries = 3
> controlled.shutdown.retry.backoff.ms = 5000
> controller.socket.timeout.ms = 30000
> create.topic.policy.class.name = null
> default.replication.factor = 3
> delegation.token.expiry.check.interval.ms = 3600000
> delegation.token.expiry.time.ms = 86400000
> delegation.token.master.key = null
> delegation.token.max.lifetime.ms = 604800000
> delete.records.purgatory.purge.interval.requests = 1
> delete.topic.enable = true
> fetch.purgatory.purge.interval.requests = 1000
> group.initial.rebalance.delay.ms = 3000
> group.max.session.timeout.ms = 300000
> group.min.session.timeout.ms = 6000
> host.name =
> inter.broker.listener.name = null
> inter.broker.protocol.version = 2.0
> leader.imbalance.check.interval.seconds = 300
> leader.imbalance.per.broker.percentage = 10
> listener.security.protocol.map = PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
> listeners = PLAINTEXT://0.0.0.0:9200
> log.cleaner.backoff.ms = 15000
> log.cleaner.dedupe.buffer.size = 134217728
> log.cleaner.delete.retention.ms = 86400000
> log.cleaner.enable = true
> log.cleaner.io.buffer.load.factor = 0.9
> log.cleaner.io.buffer.size = 524288
> log.cleaner.io.max.bytes.per.second = 1.7976931348623157E308
> log.cleaner.min.cleanable.ratio = 0.5
> log.cleaner.min.compaction.lag.ms = 0
> log.cleaner.threads = 1
> log.cleanup.policy = [delete]
> log.dir = /tmp/kafka-logs
> log.dirs = /var/lib/kafka/data
> log.flush.interval.messages = 9223372036854775807
> log.flush.interval.ms = null
> log.flush.offset.checkpoint.interval.ms = 60000
> log.flush.scheduler.interval.ms = 9223372036854775807
> log.flush.start.offset.checkpoint.interval.ms = 60000
> log.index.interval.bytes = 4096
> log.index.size.max.bytes = 10485760
> log.message.downconversion.enable = true
> log.message.format.version = 2.0
> log.message.timestamp.difference.max.ms = 9223372036854775807
> log.message.timestamp.type = CreateTime
> log.preallocate = false
> log.retention.bytes = -1
> log.retention.check.interval.ms = 300000
> log.retention.hours = 8760
> log.retention.minutes = null
> log.retention.ms = null
> log.roll.hours = 168
> log.roll.jitter.hours = 0
> log.roll.jitter.ms = null
> log.roll.ms = null
> log.segment.bytes = 1073741824
> log.segment.delete.delay.ms = 60000
> max.connections.per.ip = 2147483647
> max.connections.per.ip.overrides =
> max.incremental.fetch.session.cache.slots = 1000
> message.max.bytes = 1000012
> metric.reporters = []
> metrics.num.samples = 2
> metrics.recording.level = INFO
> metrics.sample.window.ms = 30000
> min.insync.replicas = 2
> num.io.threads = 8
> num.network.threads = 8
> num.partitions = 8
> num.recovery.threads.per.data.dir = 1
> num.replica.alter.log.dirs.threads = null
> num.replica.fetchers = 4
> offset.metadata.max.bytes = 4096
> offsets.commit.required.acks = -1
> offsets.commit.timeout.ms = 5000
> offsets.load.buffer.size = 5242880
> offsets.retention.check.interval.ms = 600000
> offsets.retention.minutes = 525600
> offsets.topic.compression.codec = 0
> offsets.topic.num.partitions = 50
> offsets.topic.replication.factor = 3
> offsets.topic.segment.bytes = 104857600
> password.encoder.cipher.algorithm = AES/CBC/PKCS5Padding
> password.encoder.iterations = 4096
> password.encoder.key.length = 128
> password.encoder.keyfactory.algorithm = null
> password.encoder.old.secret = null
> password.encoder.secret = null
> port = 9092
> principal.builder.class = null
> producer.purgatory.purge.interval.requests = 1000
> queued.max.request.bytes = -1
> queued.max.requests = 500
> quota.consumer.default = 9223372036854775807
> quota.producer.default = 9223372036854775807
> quota.window.num = 11
> quota.window.size.seconds = 1
> replica.fetch.backoff.ms = 1000
> replica.fetch.max.bytes = 1048576
> replica.fetch.min.bytes = 1
> replica.fetch.response.max.bytes = 10485760
> replica.fetch.wait.max.ms = 5000
> replica.high.watermark.checkpoint.interval.ms = 5000
> replica.lag.time.max.ms = 30000
> replica.socket.receive.buffer.bytes = 65536
> replica.socket.timeout.ms = 30000
> replication.quota.window.num = 11
> replication.quota.window.size.seconds = 1
> request.timeout.ms = 30000
> reserved.broker.max.id = 1000
> sasl.client.callback.handler.class = null
> sasl.enabled.mechanisms = [GSSAPI]
> sasl.jaas.config = null
> sasl.kerberos.kinit.cmd = /usr/bin/kinit
> sasl.kerberos.min.time.before.relogin = 60000
> sasl.kerberos.principal.to.local.rules = [DEFAULT]
> sasl.kerberos.service.name = null
> sasl.kerberos.ticket.renew.jitter = 0.05
> sasl.kerberos.ticket.renew.window.factor = 0.8
> sasl.login.callback.handler.class = null
> sasl.login.class = null
> sasl.login.refresh.buffer.seconds = 300
> sasl.login.refresh.min.period.seconds = 60
> sasl.login.refresh.window.factor = 0.8
> sasl.login.refresh.window.jitter = 0.05
> sasl.mechanism.inter.broker.protocol = GSSAPI
> sasl.server.callback.handler.class = null
> security.inter.broker.protocol = PLAINTEXT
> socket.receive.buffer.bytes = 102400
> socket.request.max.bytes = 104857600
> socket.send.buffer.bytes = 102400
> ssl.cipher.suites = []
> ssl.client.auth = none
> ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
> ssl.endpoint.identification.algorithm = https
> ssl.key.password = null
> ssl.keymanager.algorithm = SunX509
> ssl.keystore.location = null
> ssl.keystore.password = null
> ssl.keystore.type = JKS
> ssl.protocol = TLS
> ssl.provider = null
> ssl.secure.random.implementation = null
> ssl.trustmanager.algorithm = PKIX
> ssl.truststore.location = null
> ssl.truststore.password = null
> ssl.truststore.type = JKS
> transaction.abort.timed.out.transaction.cleanup.interval.ms = 60000
> transaction.max.timeout.ms = 900000
> transaction.remove.expired.transaction.cleanup.interval.ms = 3600000
> transaction.state.log.load.buffer.size = 5242880
> transaction.state.log.min.isr = 2
> transaction.state.log.num.partitions = 50
> transaction.state.log.replication.factor = 3
> transaction.state.log.segment.bytes = 104857600
> transactional.id.expiration.ms = 604800000
> unclean.leader.election.enable = false
> zookeeper.connect = 1.2.3.4:8181,1.2.3.5:8181,1.2.3.6:8181
> zookeeper.connection.timeout.ms = null
> zookeeper.max.in.flight.requests = 10
> zookeeper.session.timeout.ms = 60000
> zookeeper.set.acl = false
> zookeeper.sync.time.ms = 2000
> {code}
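(As an aside, a quick way to list only the affected partitions across all topics is the --under-replicated-partitions filter of kafka-topics; this sketch reuses the ZooKeeper endpoint from the describe output above.)

{code:java}
# Show every partition whose ISR is smaller than its replica set
kafka-topics --zookeeper 1.2.3.4:8181 --describe --under-replicated-partitions
{code}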
> *History*:
> - initially we were running Confluent version 3.2.1 (Kafka 0.10.2)
> - we upgraded the Confluent image to 4.1.1 (Kafka 1.1.1) following https://docs.confluent.io/4.1.1/installation/upgrade.html
> - after a few days one of the Kafka brokers was restarted. Since then the cluster has behaved strangely - broker 0 was often absent from the ISR.
> We have RF=3 for all topics; most topics had only 2 in-sync replicas, while some had all 3.
> Unfortunately, we cannot pinpoint the exact moment after which this started happening.
> *Steps taken trying to fix this issue*:
> - restarted all 3 brokers in a rolling manner. Each time the cluster controller was restarted. After that the issue moved from broker 0 to broker 1
> - changed replica.lag.time.max.ms: 10s -> 30s
> - changed num.replica.fetchers: 1 -> 4
> - changed num.network.threads: 3 -> 8
> - because the preferred replica was often not the leader, kafka-preferred-replica-election was run for all topics. This was done a few times (invocation sketched below)
> - upgraded CP to 5.0.0 (Kafka 2.0.0)
> - changed zookeeper.session.timeout.ms: 6000 -> 60000
> - changed replica.fetch.wait.max.ms: 500 -> 5000
>
> Any ideas how to fix it (excluding restarts of brokers)?
> Many thanks in advance!
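(For reference, the preferred-replica-election step mentioned above is typically invoked as below. This is an illustrative sketch using the ZooKeeper endpoints from the broker config; without a JSON file the tool triggers an election for all partitions, and the partitions.json file name is a hypothetical example.)

{code:java}
# Trigger preferred leader election for all topic partitions
kafka-preferred-replica-election --zookeeper 1.2.3.4:8181,1.2.3.5:8181,1.2.3.6:8181

# Or restrict it to specific partitions via a JSON file, e.g.
# {"partitions": [{"topic": "TEST", "partition": 7}]}
kafka-preferred-replica-election --zookeeper 1.2.3.4:8181,1.2.3.5:8181,1.2.3.6:8181 \
  --path-to-json-file partitions.json
{code}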