nicmichael opened a new issue #2053: FastCodahale Timers miscalculate 
percentiles
URL: https://github.com/apache/bookkeeper/issues/2053
 
 
   **BUG REPORT**
   
   ***Describe the bug***
   
   The FastCodahale Timer implementation may miscalculate percentiles if snapshots 
of values are slightly out of sync and only a few events have been recorded.
   
   FastCodahale Timers use fine-grained locking and are meant to tolerate 
(some) values changing while they are being recorded or while snapshots are created. 
Currently, the total count of requests is not synchronized with the number of 
requests recorded in percentile buckets. If a snapshot is created while the 
total count of the timer has been incremented beyond the sum of values in the 
percentile buckets, the percentile calculation may produce wrong values.
   
   For example, if 3 values have been recorded in the percentile buckets, but 
the overall count is 4, then the percentile calculation would be based on 4 
values. This becomes most obvious if a percentile > .75 (e.g. p95) is being 
calculated. For this, the implementation will try to find 0.95 * 4 = 3.8 values, 
which is more than the 3 values recorded in the buckets. Since no bucket fulfills 
the criteria, the bound of the last (overflow) bucket will be returned, i.e. Long.MAX_VALUE.
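
   The arithmetic in this example can be reproduced with a minimal sketch. This 
is not the actual FastCodahale code: the class `StaleCountSketch`, the method 
`percentileBound`, and the bucket bounds are hypothetical, and the lookup is 
assumed to walk the buckets until the cumulative count reaches quantile * count.

```java
// Hypothetical simplification of a bucketed timer snapshot; not BookKeeper's actual code.
final class StaleCountSketch {

    /**
     * Returns the upper bound of the first bucket whose cumulative count
     * reaches quantile * count, or Long.MAX_VALUE if no bucket does.
     * The count is supplied by the caller and may be out of sync with the buckets.
     */
    static long percentileBound(long[] bucketBounds, long[] bucketCounts,
                                long count, double quantile) {
        double target = quantile * count;
        long cumulative = 0;
        for (int i = 0; i < bucketCounts.length; i++) {
            cumulative += bucketCounts[i];
            if (cumulative >= target) {
                return bucketBounds[i];
            }
        }
        // No bucket satisfied the target: fall through to the overflow bound.
        return Long.MAX_VALUE;
    }

    public static void main(String[] args) {
        long[] bounds = {10, 20, 30}; // assumed bucket upper bounds
        long[] counts = {1, 1, 1};    // 3 values recorded in the buckets

        // Count in sync with the buckets: p95 falls into the last populated bucket.
        System.out.println(percentileBound(bounds, counts, 3, 0.95));
        // Count already incremented to 4: the target of 3.8 is never reached.
        System.out.println(percentileBound(bounds, counts, 4, 0.95));
    }
}
```

   With a count of 4, p75 still resolves (the target is exactly 3), but any 
higher percentile runs past the recorded values.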
   
   ***To Reproduce***
   
   The pull request that fixes this bug includes a test case which demonstrates 
the behavior.
   
   ***Expected behavior***
   
   Snapshots should base the percentile calculation on the sum of all values 
contained in the buckets, not on any value provided by a caller (which might be 
wrong or out of sync).
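
   A minimal sketch of this expected behavior, assuming the same simplified 
bucket layout as in the description. The class `BucketSumSketch` and its method 
are hypothetical and do not reflect the actual fix; returning 0 for an empty 
snapshot is also an assumption.

```java
// Hypothetical sketch: derive the total from the buckets instead of trusting a caller.
final class BucketSumSketch {

    /**
     * Returns the upper bound of the first bucket whose cumulative count
     * reaches quantile * (sum of all bucket counts).
     */
    static long percentileBound(long[] bucketBounds, long[] bucketCounts, double quantile) {
        long total = 0;
        for (long c : bucketCounts) {
            total += c;
        }
        if (total == 0) {
            return 0; // assumption: an empty snapshot reports 0
        }
        double target = quantile * total;
        long cumulative = 0;
        for (int i = 0; i < bucketCounts.length; i++) {
            cumulative += bucketCounts[i];
            if (cumulative >= target) {
                return bucketBounds[i];
            }
        }
        // Not reached for quantile <= 1, since cumulative ends at total.
        return bucketBounds[bucketBounds.length - 1];
    }

    public static void main(String[] args) {
        long[] bounds = {10, 20, 30}; // assumed bucket upper bounds
        long[] counts = {1, 1, 1};    // 3 values recorded in the buckets
        System.out.println(percentileBound(bounds, counts, 0.95));
    }
}
```

   Because the total and the per-bucket counts come from the same snapshot of 
the buckets, the target can never exceed the number of values actually recorded.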
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
