[ 
https://issues.apache.org/jira/browse/KAFKA-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434271#comment-13434271
 ] 

Joel Koshy commented on KAFKA-385:
----------------------------------

Just want to add a comment on the memory overhead of using histograms,
meters and gauges.

Meters use an exponentially weighted moving average and don't need to
maintain past data.  Histograms are implemented using reservoir sampling and
need to maintain a sampling array. There is only one per-key histogram - the
"follower catch-up time" histogram in DelayedProducerRequestMetrics. There
are four more global histograms.
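
To make the difference in state concrete, here is a minimal sketch of the
two techniques. It is not the metrics library's code: the class names, the
5-second tick and the constructor parameters are assumptions made for
illustration, and the reservoir shown is the textbook "Algorithm R".

```python
import math
import random


class Ewma:
    """Exponentially weighted moving average of a rate.

    State is a couple of numbers, however many events are marked.
    """

    def __init__(self, window_seconds, tick_seconds=5.0):
        # Smoothing factor for the chosen window (assumed parameters).
        self.alpha = 1.0 - math.exp(-tick_seconds / window_seconds)
        self.tick_seconds = tick_seconds
        self.rate = None      # events per second, None until first tick
        self.uncounted = 0    # events seen since the last tick

    def mark(self, n=1):
        self.uncounted += n

    def tick(self):
        instant = self.uncounted / self.tick_seconds
        self.uncounted = 0
        if self.rate is None:
            self.rate = instant
        else:
            self.rate += self.alpha * (instant - self.rate)


class UniformReservoir:
    """Algorithm R: a uniform random sample of every value seen so far.

    State is a fixed-size array, which is where the memory goes.
    """

    def __init__(self, size=1028, rng=None):
        self.size = size
        self.rng = rng or random.Random()
        self.values = []
        self.count = 0

    def update(self, value):
        self.count += 1
        if len(self.values) < self.size:
            self.values.append(value)
        else:
            # Replace a random slot with probability size / count.
            slot = self.rng.randrange(self.count)
            if slot < self.size:
                self.values[slot] = value
```

The meter keeps the same small state after a million events, while the
reservoir always holds up to `size` values per histogram.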

The sampling array is 1028 atomic longs by default. So with, say, 500 topics
of four partitions each and four brokers, and assuming that leadership is
evenly spread out, each broker leads 500 partitions. A *lower* bound on
memory (if we ignore object overheads and the global histograms) would then
be 500 x 1028 x 8 bytes, or ~ 4MB per broker.
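
The estimate can be reproduced with a few lines of arithmetic. This is only
a back-of-envelope sketch: the function name and parameters are made up for
illustration, it assumes one per-key histogram per partition led by the
broker, and it counts 8 bytes per long with no object overhead.

```python
SAMPLE_SIZE = 1028      # default sampling array length, from the comment
BYTES_PER_LONG = 8      # payload only; ignores AtomicLong object overhead


def histogram_bytes_per_broker(topics, partitions_per_topic, brokers,
                               per_key_histograms=1):
    """Lower bound in bytes, assuming leadership is spread evenly."""
    total_partitions = topics * partitions_per_topic
    leaders_per_broker = total_partitions // brokers
    return (leaders_per_broker * per_key_histograms
            * SAMPLE_SIZE * BYTES_PER_LONG)


estimate = histogram_bytes_per_broker(topics=500, partitions_per_topic=4,
                                      brokers=4)
print(f"{estimate} bytes (~{estimate / 2**20:.1f} MiB)")
# prints: 4112000 bytes (~3.9 MiB)
```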

Also, after looking at the metrics code a little more, I think we should use
the exponentially decaying sampling option. By default, the histogram uses a
uniform sample - i.e., it effectively takes a uniform sample from all the
data points seen so far. Exponentially decaying sampling gives more weight
to the past five minutes of data - but it would use at least double the
memory of uniform sampling, which would push memory usage to over 8MB.
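
As a rough illustration of why a decaying sample costs more per entry, here
is a simplified priority-based sketch. It is not the metrics library's
implementation: the class name, the heap-based storage and the alpha value
are assumptions, and it skips the periodic rescaling a long-running version
would need to stop the weights from overflowing. Each retained entry
carries a priority as well as the value, which is in line with the "at
least double" estimate.

```python
import heapq
import math
import random


class DecayingSample:
    """Simplified exponentially decaying sample (no rescaling)."""

    def __init__(self, size=1028, alpha=0.015, rng=None):
        self.size = size
        self.alpha = alpha        # decay factor per second (assumed)
        self.rng = rng or random.Random()
        self.heap = []            # min-heap of (priority, value) pairs
        self.start = None

    def update(self, value, timestamp):
        if self.start is None:
            self.start = timestamp
        # Newer points get exponentially larger weights.
        weight = math.exp(self.alpha * (timestamp - self.start))
        # 1 - random() lies in (0, 1], so the division is always safe.
        priority = weight / (1.0 - self.rng.random())
        if len(self.heap) < self.size:
            heapq.heappush(self.heap, (priority, value))
        elif priority > self.heap[0][0]:
            # Evict the lowest-priority entry in favour of the new one.
            heapq.heapreplace(self.heap, (priority, value))

    def values(self):
        return [v for _, v in self.heap]


sample = DecayingSample(size=100, rng=random.Random(42))
for i in range(10000):                 # old data: value 0, t = 0..100s
    sample.update(0, timestamp=i * 0.01)
for i in range(10000):                 # recent data: value 1, t = 600..700s
    sample.update(1, timestamp=600 + i * 0.01)
recent_fraction = sum(sample.values()) / len(sample.values())
```

With these inputs the sample should end up holding almost only the recent
values, whereas a uniform sample would hold roughly half of each.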

                
> RequestPurgatory enhancements - expire/checkSatisfy issue; add jmx beans
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-385
>                 URL: https://issues.apache.org/jira/browse/KAFKA-385
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>             Fix For: 0.8
>
>         Attachments: example_dashboard.jpg, graphite_explorer.jpg, 
> KAFKA-385-v1.patch, KAFKA-385-v2.patch
>
>
> As discussed in KAFKA-353:
> 1 - There is potential for a client-side race condition in the 
> implementations of expire and checkSatisfied. We can just synchronize on the 
> DelayedItem.
> 2 - Would be good to add jmx beans to facilitate monitoring RequestPurgatory 
> stats.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
