Hi all,
        We have a cluster of 10 brokers running Kafka 0.9.0.1. To monitor the 
cluster, we poll metrics such as the offlinePartition metric from each broker 
every 2 minutes. Occasionally the poll times out on some of the brokers, which 
leads to false alarms.
        For example, we may get an exception like the one below:
        error: Failed to retrieve RMIServer stub: 
javax.naming.ServiceUnavailableException [Root exception is 
java.rmi.ConnectException: Connection refused to host: 10.11.12.13; nested 
exception is: 
        java.net.ConnectException: Connection timed out]
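
        A minimal sketch of this kind of poll, with a retry so that a single 
timed-out connection does not raise an alarm on its own. This is not the actual 
monitoring code: the class name, the method names, the JMX port 9999, the 3 
attempts and the 5 second backoff are all assumptions for illustration. The 
MBean name is the one Kafka uses for the controller's offline partition count.

```java
import java.util.concurrent.Callable;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical poller; all names here are illustrative.
public class BrokerMetricPoller {

    /** Runs the call up to maxAttempts times, sleeping between failed attempts. */
    public static <T> T withRetries(Callable<T> call, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e; // remember the failure and retry if attempts remain
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis);
                }
            }
        }
        throw last; // every attempt failed: only now is an alarm justified
    }

    /** Reads the offline partition count from one broker over JMX. */
    public static Object readOfflinePartitions(String host, int jmxPort) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + jmxPort + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            ObjectName name = new ObjectName(
                    "kafka.controller:type=KafkaController,name=OfflinePartitionsCount");
            return mbeans.getAttribute(name, "Value");
        }
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost"; // assumed default
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9999; // assumed JMX port
        Object value = withRetries(() -> readOfflinePartitions(host, port), 3, 5000L);
        System.out.println("OfflinePartitionsCount=" + value);
    }
}
```

        With this shape, an alarm would fire only after all attempts against a 
broker fail, which hides isolated timeouts but not a broker that is really 
unreachable.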
        
        We see that our TcpExt.TCPBacklogDrop value fluctuates repeatedly, 
which may be the root cause. If that is the problem, how can we optimize it?
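
        To check whether the timeouts line up with backlog drops, the counter 
can be sampled directly on a broker host. The sketch below assumes a Linux host 
where /proc/net/netstat lists each group as a header line of names followed by 
a line of values; the class name, method name and 10 second sampling interval 
are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper; names are illustrative.
public class TcpBacklogDropReader {

    /**
     * Finds a counter in /proc/net/netstat style content, where each group
     * (for example "TcpExt") is a line of names followed by a line of values.
     * Returns -1 if the group or counter is not present.
     */
    public static long parseCounter(List<String> lines, String prefix, String counter) {
        for (int i = 0; i + 1 < lines.size(); i++) {
            String header = lines.get(i);
            String values = lines.get(i + 1);
            if (header.startsWith(prefix + ":") && values.startsWith(prefix + ":")) {
                String[] names = header.trim().split("\\s+");
                String[] nums = values.trim().split("\\s+");
                // Index 0 is the "TcpExt:" label on both lines, so start at 1.
                for (int j = 1; j < names.length && j < nums.length; j++) {
                    if (names[j].equals(counter)) {
                        return Long.parseLong(nums[j]);
                    }
                }
                i++; // skip the values line that was just consumed
            }
        }
        return -1;
    }

    private static long readBacklogDrop() throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/net/netstat"));
        return parseCounter(lines, "TcpExt", "TCPBacklogDrop");
    }

    public static void main(String[] args) throws Exception {
        long before = readBacklogDrop();
        Thread.sleep(10_000L); // assumed sampling interval
        long after = readBacklogDrop();
        System.out.println("TCPBacklogDrop delta over 10s: " + (after - before));
    }
}
```

        The value is cumulative since boot, so the delta between two samples 
is what matters. Passing ListenOverflows or ListenDrops as the counter name 
reads the neighbouring values from the same TcpExt line.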

        Any suggestions are appreciated. Thanks in advance.

