[ https://issues.apache.org/jira/browse/ZOOKEEPER-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088755#comment-16088755 ]
ASF GitHub Bot commented on ZOOKEEPER-2770:
-------------------------------------------

Github user karanmehta93 commented on the issue:

    https://github.com/apache/zookeeper/pull/307

    @hanm @eribeiro
    Thank you for your comments. I have added a rebased patch with a test in the `ZooKeeperServerMainTest` class. Please review whenever convenient.

    > It's odd to have two (or in future even more) types of latency checks scattered around which creates fragmentation w.r.t. the definition of what a request latency means.

    We can decide based on opinions from other people, and I can update the patch accordingly.
    I will create new JIRAs based on @tdunning's suggestions after this one gets approved. I am eager to take them up.

    Documentation is still pending. I will add a patch for it soon. I am not sure whether I have to modify the `ZooKeeperAdmin.xml` file or the `ZooKeeperAdmin.html` file. Can you point me to anything relevant?

> ZooKeeper slow operation log
> ----------------------------
>
>                 Key: ZOOKEEPER-2770
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2770
>             Project: ZooKeeper
>          Issue Type: Improvement
>            Reporter: Karan Mehta
>            Assignee: Karan Mehta
>         Attachments: ZOOKEEPER-2770.001.patch, ZOOKEEPER-2770.002.patch, ZOOKEEPER-2770.003.patch
>
>
> ZooKeeper is a complex distributed application. There are many reasons why
> any given read or write operation may become slow: a software bug, a protocol
> problem, a hardware issue with the commit log(s), a network issue. If the
> problem is constant it is trivial to come to an understanding of the cause.
> However in order to diagnose intermittent problems we often don't know where,
> or when, to begin looking. We need some sort of timestamped indication of the
> problem.
> Although ZooKeeper is not a datastore, it does persist data, and can
> suffer intermittent performance degradation, and should consider implementing
> a 'slow query' log, a feature very common to services which persist
> information on behalf of clients which may be sensitive to latency while
> waiting for confirmation of successful persistence.
> Log the client and request details if the server discovers, when finally
> processing the request, that the current time minus arrival time of the
> request is beyond a configured threshold.
> Look at the HBase {{responseTooSlow}} feature for inspiration.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
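
For readers following the thread, the check described in the issue text (log the client and request details when the current time minus the request's arrival time is beyond a configured threshold) might look roughly like the sketch below. This is only an illustration and is not taken from any of the attached patches: the class name `SlowRequestCheck`, its method, its parameters, and the message format are all hypothetical, and a real patch would hook into the server's request processing and logging rather than return a string.

```java
import java.util.Optional;

// Hypothetical sketch; none of these names come from the ZOOKEEPER-2770 patches.
final class SlowRequestCheck {
    // Assumed configuration value: latency threshold in milliseconds.
    private final long thresholdMs;

    SlowRequestCheck(long thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    /**
     * Returns a log message when (now - arrival) is strictly greater than the
     * threshold, or an empty Optional when the request was fast enough.
     */
    Optional<String> check(String clientAddress, long sessionId, String opName,
                           long arrivalTimeMs, long nowMs) {
        long elapsedMs = nowMs - arrivalTimeMs;
        if (elapsedMs <= thresholdMs) {
            return Optional.empty();
        }
        // Include the client and request details so an intermittent slowdown
        // can be correlated with a time, a client, and an operation.
        return Optional.of(String.format(
            "Slow request: client=%s session=0x%x op=%s elapsed=%dms threshold=%dms",
            clientAddress, sessionId, opName, elapsedMs, thresholdMs));
    }
}
```

Passing the timestamps in as arguments, instead of reading the clock inside the method, is a choice made here only to keep the sketch easy to test; where the arrival time is actually recorded is exactly the "what does request latency mean" question raised in the review comment quoted above.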