[ https://issues.apache.org/jira/browse/KAFKA-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717277#comment-15717277 ]
ASF GitHub Bot commented on KAFKA-4485:
---------------------------------------

GitHub user lindong28 opened a pull request:

    https://github.com/apache/kafka/pull/2208

    KAFKA-4485; Follower should be in the isr if its FetchRequest has fetched up to the logEndOffset of leader

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lindong28/kafka KAFKA-4485

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2208.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2208

----
commit c2c24839708ac0e60214f69606813d7aeff6ed21
Author: Dong Lin <lindon...@gmail.com>
Date:   2016-12-03T03:32:59Z

    KAFKA-4485; Follower should be in the isr if its FetchRequest has fetched up to the logEndOffset of leader

----

> Follower should be in the isr if its FetchRequest has fetched up to the
> logEndOffset of leader
> ----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4485
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4485
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>
> In the current implementation, we exclude a follower from the ISR if the
> begin offset of the FetchRequest from this follower is always smaller than
> the logEndOffset of the leader for more than replicaLagTimeMaxMs.
> Also, we add a follower to the ISR if the begin offset of the FetchRequest
> from this follower is equal to or larger than the high watermark of this
> partition.
> This is problematic for the following reasons:
> 1) The criteria for ISR membership are inconsistent between
> maybeExpandIsr() and maybeShrinkIsr(). A follower may be repeatedly removed
> from and added to the ISR (e.g. in the scenario described below).
> 2) A follower may be removed from the ISR even if its fetch rate can keep up
> with the produce rate. Suppose a producer keeps sending a lot of small
> requests at a high request rate but a low byte rate. Each fetch request is
> then always able to read all the data available at the time the leader
> receives it. However, the begin offset of each fetch request will always be
> smaller than the logEndOffset of the leader, because new data has arrived
> since the previous fetch. Thus the follower will be removed from the ISR.
> The solution to the problem is the following:
> A follower should be in the ISR if the begin offset of its FetchRequest >=
> max(high watermark of partition, log end offset of leader at the time the
> leader receives the previous FetchRequest). The follower should be removed
> from the ISR if this criterion is not met for more than replicaLagTimeMaxMs.
> This solution makes the following guarantees:
> 1) If a follower is in the ISR, then its log end offset >= high watermark of
> the partition at least at some time in the last replicaLagTimeMaxMs.
> 2) If a follower is not in the ISR, then the end offset of its FetchRequest
> has not caught up with the log end offset of the leader for more than
> replicaLagTimeMaxMs. Either the follower is in its bootstrap phase, or the
> follower's average fetch rate < produce rate into the partition for more
> than replicaLagTimeMaxMs.
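
The criterion proposed in the issue description can be illustrated with a
small Java sketch. This is not the code from the pull request and not
Kafka's actual replica bookkeeping: the class IsrCriterionSketch, the
nested FollowerState, and the methods onFetch() and shouldBeInIsr() are
hypothetical names chosen for illustration. The simulation in main() also
assumes a single follower, so the high watermark is taken to be that
follower's log end offset.

```java
// Illustrative sketch only; all names here are hypothetical, not Kafka APIs.
class IsrCriterionSketch {

    /** Per-follower state that a leader could keep to apply the proposed rule. */
    static final class FollowerState {
        // Leader's log end offset when the previous FetchRequest arrived (-1: none yet).
        private long leaderLeoAtPreviousFetch = -1L;
        // Last time a FetchRequest from this follower met the criterion.
        private long lastCaughtUpTimeMs;

        FollowerState(long nowMs) {
            this.lastCaughtUpTimeMs = nowMs;
        }

        /** Records one FetchRequest and returns true if it meets the criterion. */
        boolean onFetch(long fetchBeginOffset, long highWatermark,
                        long leaderLogEndOffset, long nowMs) {
            // begin offset >= max(high watermark, leader LEO at the previous fetch)
            long threshold = Math.max(highWatermark, leaderLeoAtPreviousFetch);
            boolean caughtUp = fetchBeginOffset >= threshold;
            if (caughtUp) {
                lastCaughtUpTimeMs = nowMs;
            }
            leaderLeoAtPreviousFetch = leaderLogEndOffset;
            return caughtUp;
        }

        /** The follower leaves the ISR only after missing the criterion for more than the limit. */
        boolean shouldBeInIsr(long nowMs, long replicaLagTimeMaxMs) {
            return nowMs - lastCaughtUpTimeMs <= replicaLagTimeMaxMs;
        }
    }

    public static void main(String[] args) {
        long replicaLagTimeMaxMs = 10_000L;
        long nowMs = 0L;
        long leaderLeo = 100L;
        long followerLeo = 100L;
        FollowerState state = new FollowerState(nowMs);

        // Scenario 2) from the issue: small produce requests keep arriving, so the
        // begin offset of every fetch is behind the leader's current log end offset.
        for (int i = 0; i < 200; i++) {
            nowMs += 100L;
            leaderLeo += 5L;                  // new data arrives before the fetch does
            long highWatermark = followerLeo; // single-follower assumption
            state.onFetch(followerLeo, highWatermark, leaderLeo, nowMs);
            followerLeo = leaderLeo;          // the fetch reads everything available
        }
        System.out.println("in ISR after 20s: "
                + state.shouldBeInIsr(nowMs, replicaLagTimeMaxMs));
    }
}
```

In this simulation the begin offset of every fetch is smaller than the
leader's current log end offset, so a rule comparing against the current
log end offset would never count the follower as caught up. Comparing
against the log end offset recorded at the previous fetch does count it,
which is the behavior the issue asks for.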