I think there are really two angles to look at this from...

1) What is 'important' to monitor? Meaning: which subset of these is important/critical for telling system health (things you want to set alerts on), which subset is nice to have for overall health and capacity planning (things you want to create pretty graphs on), and the rest (not immediately useful in general, but can really help in a debugging/triage situation).
2) How do you get the data? This is mostly independent of the above, though somewhat related.

As for the second one, you need to look at the collection mechanics. As you mentioned below, large-scale polling (especially with a non-trivial number of beans) is expensive and problematic no matter how you do it (JMX or HTTP), given enough scale. I don't have much experience with the codahale metrics route directly, but I have messed with Jolokia, which is likely in the same boat: they expose the metrics for you to grab. In both cases, given enough data points (and Kafka, depending on the number of topics involved, has a /lot/ of them), either can be slow if not implemented carefully, meaning you may overrun your desired polling interval.

In very large environments, I've found it very scalable to have either a local poller on the box (which could be reading via JMX or HTTP) that then emits the data to something, or some kind of wrapper around the application that does the collection/emission (launching the broker as a thread, with the parent process doing some JMX magic to connect to the data points). Both of these routes depend a lot on your monitoring infrastructure, but they will help you get around the general wide polling problem...

Semi-shameless plug for how it is done at LinkedIn:
http://engineering.linkedin.com/52/autometrics-self-service-metrics-collection

--
Dave DeMaagd
ddema...@linkedin.com | 818 262 7958

(dragos.manole...@servicenow.com - Wed, May 08, 2013 at 09:27:21PM +0000)
> From the JmxReporter section of the metrics manual:
>
> Warning
> We don't recommend that you try to gather metrics from your production
> environment. JMX's RPC API is fragile and bonkers. For development
> purposes and browsing, though, it can be very useful.
>
> -Dragos
>
> On 5/8/13 2:10 PM, "Otis Gospodnetic" <otis_gospodne...@yahoo.com> wrote:
>
> >Also, do you recommend getting metrics via JMX or via HTTP?
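
For what it's worth, the "local poller on the box" idea from my reply above could look roughly like the sketch below. This is only an illustration, not how LinkedIn's setup or any particular tool does it. The names read_metric (a stand-in for whatever does the actual JMX or HTTP read) and emit (a stand-in for whatever ships data to your monitoring system) are hypothetical, as are the function names themselves. The one thing it tries to show is keeping an eye on the polling interval, so that a slow read pass gets reported instead of silently overrunning.

```python
import time
from typing import Callable, Iterable, List, Tuple

# (metric name, value, unix timestamp)
Sample = Tuple[str, float, float]


def poll_once(
    read_metric: Callable[[str], float],
    names: Iterable[str],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Sample], List[str]]:
    """Read each metric until the time budget is used up.

    read_metric is a hypothetical callable wrapping a JMX or HTTP read.
    Returns (samples collected, names skipped because the budget ran out).
    """
    start = clock()
    samples: List[Sample] = []
    skipped: List[str] = []
    for name in names:
        if clock() - start >= budget_seconds:
            # Out of time for this cycle: record the miss rather than block.
            skipped.append(name)
            continue
        samples.append((name, float(read_metric(name)), time.time()))
    return samples, skipped


def run_local_poller(
    read_metric: Callable[[str], float],
    names: Iterable[str],
    emit: Callable[[List[Sample]], None],
    interval_seconds: float,
    cycles: int,
) -> None:
    """Poll locally for a fixed number of cycles, emitting one batch per cycle."""
    names = list(names)
    for _ in range(cycles):
        cycle_start = time.monotonic()
        samples, skipped = poll_once(read_metric, names, interval_seconds)
        emit(samples)  # hypothetical sink: push to your monitoring system
        if skipped:
            print(f"overran {interval_seconds}s interval; "
                  f"skipped {len(skipped)} metrics")
        # Sleep only for whatever is left of the interval.
        remaining = interval_seconds - (time.monotonic() - cycle_start)
        if remaining > 0:
            time.sleep(remaining)
```

Because the poller runs next to the broker, the central system only receives pushed batches and never has to fan out reads across every host and bean, which is the wide polling problem mentioned above.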