Hi,

I'm for publishing all performance metrics in JMX (in addition to exposing them 
wherever else you guys decide).  That's because JMX is probably the easiest way 
for our SPM for HBase [1] to get at HBase performance metrics, and I suspect we 
are not alone.
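
To make that concrete, here is a rough sketch of how a sink shaped like the 
one Todd proposed (quoted further down) could also publish its numbers over 
JMX. Everything in it is an assumption for illustration: the fields of 
BlockCacheReport, the JmxBlockCacheSink class, and the "example.hbase" object 
name are made up and are not existing HBase code.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Illustrative sketch only: none of these types exist in HBase.
final class JmxSinkSketch {

  // Plain "struct" of unformatted block cache numbers (fields are assumed).
  static final class BlockCacheReport {
    final long blockCount;
    final long sizeBytes;

    BlockCacheReport(long blockCount, long sizeBytes) {
      this.blockCount = blockCount;
      this.sizeBytes = sizeBytes;
    }
  }

  // Sink interface in the shape proposed in the thread.
  interface BlockCacheReportSink {
    void reportStats(BlockCacheReport report);
  }

  // Standard MBean convention: a public interface named <ImplClass>MBean.
  public interface JmxBlockCacheSinkMBean {
    long getBlockCount();
    long getSizeBytes();
  }

  // Keeps the most recent report and serves it as JMX attributes.
  public static final class JmxBlockCacheSink
      implements BlockCacheReportSink, JmxBlockCacheSinkMBean {
    private volatile BlockCacheReport latest = new BlockCacheReport(0L, 0L);

    @Override public void reportStats(BlockCacheReport report) { latest = report; }
    @Override public long getBlockCount() { return latest.blockCount; }
    @Override public long getSizeBytes() { return latest.sizeBytes; }
  }

  // Registers the sink on the platform MBean server under a made-up name.
  static ObjectName register(JmxBlockCacheSink sink) throws Exception {
    ObjectName name = new ObjectName("example.hbase:type=BlockCacheReport");
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    if (server.isRegistered(name)) {
      server.unregisterMBean(name);
    }
    server.registerMBean(sink, name);
    return name;
  }
}
```

With something like this, a monitoring agent would only need to read the 
BlockCount and SizeBytes attributes from the MBean server, and the logging 
sink could still run alongside it.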

Otis
[1] http://sematext.com/spm/hbase-performance-monitoring/index.html
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/



>________________________________
>From: Andrew Purtell <[email protected]>
>To: Doug Meil <[email protected]>; "[email protected]" 
><[email protected]>
>Sent: Friday, July 29, 2011 4:34 PM
>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>
>> I'd rather see this output being able to be captured by something like the 
>> sink that Todd suggested, rather than focusing on shell access.
>
>
>I don't agree.
>
>
>Look at what we have existing and proposed:
>
>    - Java API access to server and region load information, which the shell 
>uses
>
>    - A proposal to dump some stats into log files, which then have to be 
>scraped
>
>    - A proposal (by the FB guys) to export some JSON via an HTTP servlet
>
>This is not good design, this is a bunch of random shit stuck together. 
>
>Note that what Todd proposed does not preclude adding Java client API support 
>for retrieving it.
>
>At a minimum all of this information must be accessible via the Java client 
>API, to enable programmatic monitoring and analysis use cases. I'll add the 
>shell support if nobody else cares about it, that is a relatively small 
>detail, but one I think is important. 
>
>Best regards,
>
>
>    - Andy
>
>
>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via 
>Tom White)
>
>
>>________________________________
>>From: Doug Meil <[email protected]>
>>To: "[email protected]" <[email protected]>; "[email protected]" 
>><[email protected]>
>>Sent: Friday, July 29, 2011 11:39 AM
>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>
>>
>>I'd rather see this output being able to be captured by something like the
>>sink that Todd suggested, rather than focusing on shell access.  HServerLoad
>>is super-summary at the RS level, and both the items in 4089 and 4147 are
>>proposed to be "summarized" but still have reasonable detail (e.g., even
>>with a table/CF summary there could be dozens of entries given a reasonably
>>complex system).
>>
>>
>>
>>
>>On 7/29/11 1:15 PM, "Andrew Purtell" <[email protected]> wrote:
>>
>>>There is also the matter of HServerLoad and how that is used by the shell
>>>and master UI to report on cluster status.
>>>
>>>I'd like the shell to be able to let the user explore all of these
>>>different reports interactively.
>>>
>>>At the very least, they should all be handled the same way.
>>>
>>>And then there is Riley's work over at FB on a slow query log. How does
>>>that fit in? 
>>> 
>>>Best regards,
>>>
>>>
>>>   - Andy
>>>
>>>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>(via Tom White)
>>>
>>>
>>>>________________________________
>>>>From: Todd Lipcon <[email protected]>
>>>>To: [email protected]
>>>>Sent: Friday, July 29, 2011 9:58 AM
>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>
>>>>What I'd prefer is something like:
>>>>
>>>>interface BlockCacheReportSink {
>>>>  void reportStats(BlockCacheReport report);
>>>>}
>>>>
>>>>class LoggingBlockCacheReportSink implements BlockCacheReportSink {
>>>>  @Override
>>>>  public void reportStats(BlockCacheReport report) {
>>>>    // log it with whatever formatting you want
>>>>  }
>>>>}
>>>>
>>>>then a configuration which could default to the logging implementation,
>>>>but orgs could easily substitute their own implementation. For example, I
>>>>could see wanting to do an implementation where it keeps local RRD graphs
>>>>of some stats, or pushes them to a central management server.
>>>>
>>>>The assumption is that BlockCacheReport is a fairly straightforward
>>>>"struct" with the non-formatted information available.
>>>>
>>>>-Todd
>>>>
>>>>On Fri, Jul 29, 2011 at 4:15 AM, Doug Meil
>>>><[email protected]> wrote:
>>>>
>>>>>
>>>>> Hi Folks-
>>>>>
>>>>> You probably already saw my email yesterday on this...
>>>>>  https://issues.apache.org/jira/browse/HBASE-4089 (block cache report)
>>>>>
>>>>> ...and I just created this one...
>>>>>  https://issues.apache.org/jira/browse/HBASE-4147 (StoreFile query
>>>>> report)
>>>>>
>>>>> What I'd like to run past the dev-list is this:  if HBase had periodic
>>>>> summary usage statistics, where should they go?  What I'd like to throw
>>>>> out for discussion is that I'm suggesting that they should simply go to
>>>>> the log files and users can slice and dice this on their own.  No UI
>>>>> (i.e., JSPs), no JMX, etc.
>>>>>
>>>>>
>>>>> The summary of the output is this:
>>>>> BlockCacheReport:  on configured interval, print out summary of
>>>>> blockcache (at table/CF level) to log file. This one is point-in-time,
>>>>> not delta.
>>>>>
>>>>> StoreFile read report:  on configured interval, print out summary of
>>>>> StoreFile accesses and how much time was spent reading each StoreFile
>>>>> to log file.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Doug
>>>>>
>>>>> >
>>>>>
>>>>>
>>>>
>>>>
>>>>-- 
>>>>Todd Lipcon
>>>>Software Engineer, Cloudera
>>>>
>>>>
>>
>>
>>
>>
>
>
