[ https://issues.apache.org/jira/browse/ZOOKEEPER-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mathieu Gaudin resolved ZOOKEEPER-4358.
---------------------------------------
    Resolution: Not A Problem

> Latency metrics showing surprising results for a Kerberos-enabled cluster
> -------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4358
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4358
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: metric system
>    Affects Versions: 3.6.2
>            Reporter: Mathieu Gaudin
>            Priority: Minor
>         Attachments: image-2021-08-27-16-10-28-783.png,
> image-2021-08-27-16-37-50-112.png
>
>
> Hi,
> I'm trying to understand why the min/avg/max latency values are showing
> surprising results. The graph below shows the max latency value of a
> particular node for the last 7 days. The value increases gradually over time
> and only ever decreases when the node is restarted, as if the metric value
> gets reset.
> [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/ServerStats.java#L226]
> !image-2021-08-27-16-10-28-783.png|width=984,height=204!
> * 3 nodes
> * Kerberos enabled
> * TGT ticket cache enabled.
> I believe the min/avg/max latency values should show more realistic
> variation. It is very unlikely that the max latency value is expected to
> only ever increase while the node is running.
> [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/ServerStats.java#L142]
> _public void updateLatency(Request request, long currentTime) {_
> _long latency = currentTime - request.createTime;_
> _if (latency < 0) {_
> _return;_
> _}_
> _*{color:#FF0000}requestLatency.addDataPoint(latency);{color}*_
> _if (request.getHdr() != null) {_
> _// Only quorum request should have header_
> _ServerMetrics.getMetrics().UPDATE_LATENCY.add(latency);_
> _} else {_
> _// All read request should goes here_
> _ServerMetrics.getMetrics().READ_LATENCY.add(latency);_
> _}_
> _}_
> The method it calls leads me to think that the max latency metric is only
> updated when the currently stored maximum is lower than the new value.
> [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/metric/AvgMinMaxCounter.java#L51]
> _private void setMax(long value) {_
> *{color:#FF0000}_long current;_{color}*
> *{color:#FF0000}_while (value > (current = max.get()) &&
> !max.compareAndSet(current, value)) {_{color}*
> _// no op_
> _}_
> _}_
> Below is a graph for a particular node from a totally different cluster over
> the last 2 days. The node has not been restarted and all the data is from the
> same process. We can see more realistic variation in the max latency metric,
> as would normally be expected.
> !image-2021-08-27-16-37-50-112.png|width=1084,height=222!
> Thanks for your time in advance,
> Math

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
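
The quoted setMax loop only ever writes a value that is larger than the one already stored, so a maximum tracked this way cannot go down until the counter is reset or the process restarts. The following is a minimal Java sketch of that behaviour, not the ZooKeeper implementation: the class MaxLatencySketch and its record, getMax and reset methods are hypothetical names, and the initial value of 0 is an assumption made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a max-only latency counter; not ZooKeeper code.
public class MaxLatencySketch {
    // Assumption: start at 0; the real counter may use a different initial value.
    private final AtomicLong max = new AtomicLong(0);

    // Same compare-and-set pattern as the quoted setMax: the stored value is
    // replaced only when the new sample is strictly larger.
    public void record(long latency) {
        long current;
        while (latency > (current = max.get())
                && !max.compareAndSet(current, latency)) {
            // another thread changed max in between; re-read and retry
        }
    }

    public long getMax() {
        return max.get();
    }

    // Hypothetical reset, standing in for a restart or an explicit stats reset.
    public void reset() {
        max.set(0);
    }

    public static void main(String[] args) {
        MaxLatencySketch stats = new MaxLatencySketch();
        long[] samples = {5, 120, 7, 3, 42};
        for (long sample : samples) {
            stats.record(sample);
        }
        // The single 120 ms spike stays as the maximum despite later low samples.
        System.out.println("max after samples: " + stats.getMax()); // 120
        stats.reset();
        stats.record(9);
        System.out.println("max after reset: " + stats.getMax()); // 9
    }
}
```

Under these assumptions the reported graph is what one would expect: the value is a maximum since the last reset, so it forms a staircase that steps up on each new worst-case sample and drops only on restart. A graph that rises and falls would need a windowed maximum or a counter that is reset on every scrape.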