Hi,

I'm for publishing all performance metrics in JMX (in addition to exposing them wherever else you guys decide). JMX is probably the easiest way for our SPM for HBase [1] to get at HBase performance metrics, and I suspect we are not alone.
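
To make the JMX suggestion concrete, here is a minimal sketch of exposing a few counters as a standard MBean on the platform MBean server. It is not HBase code: the class names, the counters, and the "example.metrics" object name are all made up for illustration, and a real implementation would presumably hang off whatever metrics classes HBase already has.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: none of these names come from HBase.
class JmxMetricsSketch {

    // Standard MBean convention: the management interface is named after the
    // implementation class plus the "MBean" suffix, and it must be public.
    public interface BlockCacheStatsMBean {
        long getHitCount();
        long getMissCount();
        double getHitRatio();
    }

    // Holds the counters; each getter becomes a read-only JMX attribute
    // (HitCount, MissCount, HitRatio).
    public static class BlockCacheStats implements BlockCacheStatsMBean {
        private final AtomicLong hits = new AtomicLong();
        private final AtomicLong misses = new AtomicLong();

        public void recordHit() { hits.incrementAndGet(); }
        public void recordMiss() { misses.incrementAndGet(); }

        @Override public long getHitCount() { return hits.get(); }
        @Override public long getMissCount() { return misses.get(); }

        @Override public double getHitRatio() {
            long h = hits.get();
            long total = h + misses.get();
            return total == 0 ? 0.0 : (double) h / total;
        }
    }

    // Registers the stats object under a made-up object name so that any
    // JMX client (jconsole, a monitoring agent) can read the attributes.
    static ObjectName register(BlockCacheStats stats) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example.metrics:type=BlockCacheStats");
        server.registerMBean(stats, name);
        return name;
    }
}
```

Once registered, a monitoring agent only needs the object name and the attribute names to poll the values, which is why JMX is convenient for external tools.
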
Otis

[1] http://sematext.com/spm/hbase-performance-monitoring/index.html

----
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/

>________________________________
>From: Andrew Purtell <[email protected]>
>To: Doug Meil <[email protected]>; "[email protected]" <[email protected]>
>Sent: Friday, July 29, 2011 4:34 PM
>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>
>> I'd rather see this output being able to be captured by something like the
>> sink that Todd suggested, rather than focusing on shell access.
>
>I don't agree.
>
>Look at what we have existing and proposed:
>
> - Java API access to server and region load information, that the shell uses
>
> - A proposal to dump some stats into log files, that then has to be scraped
>
> - A proposal (by the FB guys) to export some JSON via a HTTP servlet
>
>This is not good design, this is a bunch of random shit stuck together.
>
>Note that what Todd proposed does not preclude adding Java client API support
>for retrieving it.
>
>At a minimum all of this information must be accessible via the Java client
>API, to enable programmatic monitoring and analysis use cases. I'll add the
>shell support if nobody else cares about it, that is a relatively small
>detail, but one I think is important.
>
>Best regards,
>
> - Andy
>
>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via
>Tom White)
>
>>________________________________
>>From: Doug Meil <[email protected]>
>>To: "[email protected]" <[email protected]>; "[email protected]" <[email protected]>
>>Sent: Friday, July 29, 2011 11:39 AM
>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>
>>I'd rather see this output being able to be captured by something like the
>>sink that Todd suggested, rather than focusing on shell access. HServerLoad is
>>super-summary at the RS level, and both the items in 4089 and 4147 are
>>proposed to be "summarized" but still have reasonable detail (e.g., even
>>table/CF summary there could be dozens of entries given a reasonably
>>complex system).
>>
>>On 7/29/11 1:15 PM, "Andrew Purtell" <[email protected]> wrote:
>>
>>>There is also the matter of HServerLoad and how that is used by the shell
>>>and master UI to report on cluster status.
>>>
>>>I'd like the shell to be able to let the user explore all of these
>>>different reports interactively.
>>>
>>>At the very least, they should all be handled the same way.
>>>
>>>And then there is Riley's work over at FB on a slow query log. How does
>>>that fit in?
>>>
>>>Best regards,
>>>
>>> - Andy
>>>
>>>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>(via Tom White)
>>>
>>>>________________________________
>>>>From: Todd Lipcon <[email protected]>
>>>>To: [email protected]
>>>>Sent: Friday, July 29, 2011 9:58 AM
>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>
>>>>What I'd prefer is something like:
>>>>
>>>>interface BlockCacheReportSink {
>>>>  public void reportStats(BlockCacheReport report);
>>>>}
>>>>
>>>>class LoggingBlockCacheReportSink {
>>>>  ... {
>>>>    log it with whatever formatting you want
>>>>  }
>>>>}
>>>>
>>>>then a configuration which could default to the logging implementation, but
>>>>orgs could easily substitute their own implementation. For example, I could
>>>>see wanting to do an implementation where it keeps local RRD graphs of some
>>>>stats, or pushes them to a central management server.
>>>>
>>>>The assumption is that BlockCacheReport is a fairly straightforward "struct"
>>>>with the non-formatted information available.
>>>>
>>>>-Todd
>>>>
>>>>On Fri, Jul 29, 2011 at 4:15 AM, Doug Meil <[email protected]> wrote:
>>>>
>>>>> Hi Folks-
>>>>>
>>>>> You probably already saw my email yesterday on this...
>>>>> https://issues.apache.org/jira/browse/HBASE-4089 (block cache report)
>>>>>
>>>>> ...and I just created this one...
>>>>> https://issues.apache.org/jira/browse/HBASE-4147 (StoreFile query
>>>>> report)
>>>>>
>>>>> What I'd like to run past the dev-list is this: if HBase had periodic
>>>>> summary usage statistics, where should they go? What I'd like to throw
>>>>> out for discussion is that I'm suggesting that it should simply go to the
>>>>> log files and users can slice and dice this on their own. No UI (i.e.,
>>>>> JSPs), no JMX, etc.
>>>>>
>>>>> The summary of the output is this:
>>>>> BlockCacheReport: on configured interval, print out summary of blockcache
>>>>> (at table/CF level) to log file. This one is point-in-time, not delta.
>>>>>
>>>>> StoreFile read report: on configured interval, print out summary of
>>>>> StoreFile accesses and how much time was spent reading each StoreFile to
>>>>> log file.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Doug
>>>>>
>>>>
>>>>--
>>>>Todd Lipcon
>>>>Software Engineer, Cloudera
