[ 
https://issues.apache.org/jira/browse/KAFKA-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121935#comment-14121935
 ] 

Guozhang Wang commented on KAFKA-1616:
--------------------------------------

Updated reviewboard https://reviews.apache.org/r/25155/diff/
 against branch origin/trunk

> Purgatory Size and Num.Delayed.Request metrics are incorrect
> ------------------------------------------------------------
>
>                 Key: KAFKA-1616
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1616
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>             Fix For: 0.9.0
>
>         Attachments: KAFKA-1616.patch, KAFKA-1616_2014-08-28_10:12:17.patch, 
> KAFKA-1616_2014-09-01_14:41:56.patch, KAFKA-1616_2014-09-02_12:58:07.patch, 
> KAFKA-1616_2014-09-02_13:23:13.patch, KAFKA-1616_2014-09-03_12:53:09.patch, 
> KAFKA-1616_2014-09-04_13:26:02.patch
>
>
> The request purgatory used two atomic integers, "watched" and "unsatisfied", to 
> record the purgatory size ( = watched + unsatisfied) and the number of delayed 
> requests ( = unsatisfied). Due to race conditions, these two atomic 
> integers are not updated correctly, resulting in incorrect metrics.
> Proposed solution: for cleaner semantics, we can define the "purgatory 
> size" to be just the number of elements in the watched lists, and the "number 
> of delayed requests" to be just the length of the expiry queue. Instead 
> of maintaining two atomic integers, we compute the size of the lists / queue 
> on the fly each time the metrics are pulled. This may use a few more CPU 
> cycles for these two metrics, but the overhead should be minor and 
> correctness is guaranteed.
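
To illustrate the proposal in the issue description, here is a minimal Java sketch of computing both metrics on the fly instead of maintaining separate counters. It is not the code from the attached patches: the class `PurgatorySketch`, its methods, and the use of plain strings for requests and keys are all hypothetical, and a `ConcurrentLinkedQueue` stands in for the real expiry queue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified purgatory: names and types are assumptions for illustration.
final class PurgatorySketch {
    // key -> list of requests watching that key (the "watched lists")
    private final ConcurrentMap<String, Queue<String>> watchers = new ConcurrentHashMap<>();
    // stand-in for the expiry queue; one entry per delayed request
    private final Queue<String> expiryQueue = new ConcurrentLinkedQueue<>();

    // Register a delayed request under every key it watches.
    void delay(String request, String... keys) {
        for (String key : keys) {
            watchers.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(request);
        }
        expiryQueue.add(request);
    }

    // Remove a satisfied or expired request from all structures.
    void complete(String request) {
        expiryQueue.remove(request);
        for (Queue<String> watchList : watchers.values()) {
            watchList.remove(request);
        }
    }

    // "Purgatory size": total number of elements across all watched lists,
    // recomputed every time the metric is read.
    int purgatorySize() {
        int total = 0;
        for (Queue<String> watchList : watchers.values()) {
            total += watchList.size();
        }
        return total;
    }

    // "Number of delayed requests": current length of the expiry queue.
    int numDelayedRequests() {
        return expiryQueue.size();
    }

    public static void main(String[] args) {
        PurgatorySketch purgatory = new PurgatorySketch();
        purgatory.delay("req1", "topicA-0", "topicB-0");
        purgatory.delay("req2", "topicA-0");
        System.out.println(purgatory.purgatorySize());      // 3
        System.out.println(purgatory.numDelayedRequests()); // 2
    }
}
```

Because the values are derived from the data structures themselves, there is no separate counter to drift out of sync. The trade-off is that each read walks the lists, and under concurrent modification the result is a point-in-time approximation rather than an exact snapshot.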



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
