[ https://issues.apache.org/jira/browse/KAFKA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065806#comment-14065806 ]
Jay Kreps commented on KAFKA-1546:
----------------------------------

I think this is actually a really important thing to get right to make replication reliable.

There are some subtleties. It would be good to work out the basics of how this could work on this JIRA. For example, the throughput on a partition might be 1 msg/sec, but that may only be because the producer is writing 1 msg/sec. If someone then writes a batch of 1000 messages, that doesn't mean we are necessarily 1000 seconds behind.

We already track the time since the last fetch request, so if the fetcher stops entirely for too long it will be caught. I think the other condition we want to be able to catch is one where the fetcher is still fetching but is behind and likely won't catch up. One way to make "caught-up" concrete is to say that the last fetch went to the end of the log.

We could potentially reduce this to one config and just have replica.lag.time.ms, which would be both the maximum time since a fetch and the maximum amount of time without catching up to the leader. The implementation would be that every time a fetch didn't go to the logEndOffset we would start the lag clock, and it would only reset when a fetch request finally went all the way to the logEndOffset.

> Automate replica lag tuning
> ---------------------------
>
>                 Key: KAFKA-1546
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1546
>             Project: Kafka
>          Issue Type: Improvement
>          Components: replication
>    Affects Versions: 0.8.0, 0.8.1, 0.8.1.1
>            Reporter: Neha Narkhede
>              Labels: newbie++
>
> Currently, there is no good way to tune the replica lag configs to
> automatically account for high and low volume topics on the same cluster.
> For the low-volume topic it will take a very long time to detect a lagging
> replica, and for the high-volume topic it will have false-positives.
> One approach to making this easier would be to have the configuration
> be something like replica.lag.max.ms and translate this into a number
> of messages dynamically based on the throughput of the partition.
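
To make the single-config proposal in the comment above concrete, here is a rough sketch of the lag clock in Python. It is not Kafka code: the class, method, and field names are hypothetical, and it assumes "the fetch went to the end of the log" can be checked by comparing the offset a fetch read through against the leader's logEndOffset at that moment. It also assumes the lag clock keeps its original start time while the follower stays behind.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicaLagTracker:
    """Hypothetical per-follower state kept by the leader (illustration only)."""

    replica_lag_time_ms: int               # the single proposed config
    last_fetch_ms: int = 0                 # time of the most recent fetch request
    lag_started_ms: Optional[int] = None   # when the follower first fell short of the log end

    def on_fetch(self, fetch_end_offset: int, log_end_offset: int, now_ms: int) -> None:
        """Record one fetch request from the follower."""
        self.last_fetch_ms = now_ms
        if fetch_end_offset >= log_end_offset:
            # The fetch went all the way to the end of the log: reset the lag clock.
            self.lag_started_ms = None
        elif self.lag_started_ms is None:
            # First fetch that fell short: start the lag clock.
            # (Assumption: later short fetches do not restart it.)
            self.lag_started_ms = now_ms

    def is_lagging(self, now_ms: int) -> bool:
        """True if the follower stopped fetching or has been behind for too long."""
        stopped = now_ms - self.last_fetch_ms > self.replica_lag_time_ms
        behind_too_long = (
            self.lag_started_ms is not None
            and now_ms - self.lag_started_ms > self.replica_lag_time_ms
        )
        return stopped or behind_too_long
```

With this shape, a burst of 1000 messages on a 1 msg/sec partition does not flag the follower by itself. The follower is only reported as lagging if it fails to reach the end of the log within replica.lag.time.ms, or if it stops fetching for that long.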