[jira] [Commented] (ZOOKEEPER-2770) ZooKeeper slow operation log

ASF GitHub Bot (JIRA) Wed, 12 Jul 2017 21:46:52 -0700

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085171#comment-16085171 ]


ASF GitHub Bot commented on ZOOKEEPER-2770:
-------------------------------------------

Github user karanmehta93 commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/307#discussion_r127127351
  
    --- Diff: src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java ---
    @@ -430,6 +432,7 @@ public void processRequest(Request request) {
                 // the client and leader disagree on where the client is most
                 // recently attached (and therefore invalid SESSION MOVED generated)
                 cnxn.sendCloseSession();
    +            request.checkLatency();
    --- End diff --
    
    @eribeiro 
    That is exactly why I added my own check after `cnxn.sendCloseSession()` 
    instead of using `zks.serverStats().updateLatency(request.createTime);`: 
    I was interested in end-to-end latency. I would also like to hear what 
    others think about this. 
    Should we limit this JIRA to adding a general latency threshold, and open 
    a new JIRA to extend the scope to percentiles, backed by a new data 
    structure inside the `ServerStats` class?


> ZooKeeper slow operation log
> ----------------------------
>
>                 Key: ZOOKEEPER-2770
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2770
>             Project: ZooKeeper
>          Issue Type: Improvement
>            Reporter: Karan Mehta
>         Attachments: ZOOKEEPER-2770.001.patch, ZOOKEEPER-2770.002.patch, 
> ZOOKEEPER-2770.003.patch
>
>
> ZooKeeper is a complex distributed application. There are many reasons why 
> any given read or write operation may become slow: a software bug, a protocol 
> problem, a hardware issue with the commit log(s), a network issue. If the 
> problem is constant, it is trivial to understand the cause. However, when 
> diagnosing intermittent problems we often don't know where, or when, to 
> begin looking. We need some sort of timestamped indication of the problem. 
> Although ZooKeeper is not a datastore, it does persist data and can suffer 
> intermittent performance degradation, so it should consider implementing a 
> 'slow query' log. This feature is very common in services that persist 
> information on behalf of clients, which may be sensitive to latency while 
> waiting for confirmation of successful persistence.
> Log the client and request details if the server discovers, when finally 
> processing the request, that the current time minus arrival time of the 
> request is beyond a configured threshold. 
> Look at the HBase {{responseTooSlow}} feature for inspiration. 
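
The threshold check described above could look roughly like the sketch below. It is not the code from the attached patches: the class `SlowOpCheck`, its method names, and the message format are all hypothetical, and the only idea taken from the description is comparing "current time minus arrival time" against a configured threshold.

```java
import java.util.Optional;

// Hypothetical helper, not part of ZooKeeper: all names and the message
// format are assumptions made for illustration.
final class SlowOpCheck {
    private final long thresholdMs;

    SlowOpCheck(long thresholdMs) {
        if (thresholdMs < 0) {
            throw new IllegalArgumentException("threshold must be >= 0");
        }
        this.thresholdMs = thresholdMs;
    }

    // Elapsed time between request arrival and "now", in milliseconds.
    // Clamped at 0 in case the wall clock moved backwards in between.
    static long elapsedMs(long createTimeMs, long nowMs) {
        return Math.max(0L, nowMs - createTimeMs);
    }

    // Returns a log message if the request exceeded the threshold,
    // or an empty Optional if it completed within it.
    Optional<String> check(long sessionId, String opName,
                           long createTimeMs, long nowMs) {
        long elapsed = elapsedMs(createTimeMs, nowMs);
        if (elapsed <= thresholdMs) {
            return Optional.empty();
        }
        return Optional.of(String.format(
            "Slow operation: session=0x%x op=%s latency=%dms threshold=%dms",
            sessionId, opName, elapsed, thresholdMs));
    }
}
```

In a server, the caller would presumably pass the request's arrival time and the current time once processing finishes, and hand any returned message to the logger at WARN level. Passing `nowMs` explicitly instead of reading the clock inside the method keeps the check easy to unit test.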



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
