[ 
https://issues.apache.org/jira/browse/KAFKA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328413#comment-14328413
 ] 

Jiangjie Qin commented on KAFKA-1546:
-------------------------------------

I'm not sure if my concern is valid. If we have many producers producing 
messages to a partition, it's possible that after we fulfill a fetch request 
from the replica fetcher, but before we check whether the follower's log has 
caught up to the log end, some new messages are appended. In this case, the 
follower will never be able to truly catch up to the log end. 
Maybe I understood it wrong, but I think what [~nehanarkhede] proposed before 
seems to work, which is:
1. Have a time criterion: a fetch request must be received from the follower 
within 10 secs.
2. Instead of a fixed max message lag, say 4000, use 
(message-in-rate * maxLagMs) as the max message lag threshold.
This way we can handle both busy topics and low-volume topics.
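
To make the two criteria concrete, here is a rough sketch in Python. It is not 
Kafka code: FollowerState, max_message_lag and is_in_sync are hypothetical 
names, and I am assuming the message-in-rate is measured in messages per 
second, so maxLagMs is divided by 1000 before multiplying.

```python
from dataclasses import dataclass


@dataclass
class FollowerState:
    """Hypothetical view of what the leader tracks for one follower."""
    last_fetch_ms: int    # time the leader last received a fetch request
    log_end_offset: int   # follower's log end offset as of that fetch


def max_message_lag(messages_in_per_sec: float, max_lag_ms: int) -> int:
    """Criterion 2: threshold = messages appended during max_lag_ms.

    Assumes the in-rate is in messages per second, hence the / 1000.
    """
    return int(messages_in_per_sec * max_lag_ms / 1000)


def is_in_sync(follower: FollowerState,
               leader_log_end_offset: int,
               messages_in_per_sec: float,
               now_ms: int,
               max_lag_ms: int = 10_000) -> bool:
    """Return True if the follower passes both the time and message checks."""
    # Criterion 1: a fetch request must have arrived within max_lag_ms.
    fetched_recently = now_ms - follower.last_fetch_ms <= max_lag_ms
    # Criterion 2: message lag must stay under the rate-scaled threshold.
    message_lag = leader_log_end_offset - follower.log_end_offset
    threshold = max_message_lag(messages_in_per_sec, max_lag_ms)
    return fetched_recently and message_lag <= threshold
```

With a 10 sec window, a partition taking in 5000 messages/sec gets a threshold 
of 50000 messages, so a follower that is 20000 messages behind stays in sync 
even though it would fail a fixed 4000 limit. A partition taking in 1 
message/sec gets a threshold of 10, so a follower 50 messages behind is 
flagged quickly.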

> Automate replica lag tuning
> ---------------------------
>
>                 Key: KAFKA-1546
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1546
>             Project: Kafka
>          Issue Type: Improvement
>          Components: replication
>    Affects Versions: 0.8.0, 0.8.1, 0.8.1.1
>            Reporter: Neha Narkhede
>            Assignee: Aditya Auradkar
>              Labels: newbie++
>
> Currently, there is no good way to tune the replica lag configs to 
> automatically account for high and low volume topics on the same cluster. 
> For the low-volume topic it will take a very long time to detect a lagging
> replica, and for the high-volume topic it will have false-positives.
> One approach to making this easier would be to have the configuration
> be something like replica.lag.max.ms and translate this into a number
> of messages dynamically based on the throughput of the partition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
