[ https://issues.apache.org/jira/browse/HBASE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454716#comment-13454716 ]
Elliott Clark commented on HBASE-4366:
--------------------------------------

Seems like this has been addressed in 0.94+: we now have per-region metrics, per-CF metrics, and per-block-type metrics. Are there other requirements, or has this been completed?

> dynamic metrics logging
> -----------------------
>
>                 Key: HBASE-4366
>                 URL: https://issues.apache.org/jira/browse/HBASE-4366
>             Project: HBase
>          Issue Type: New Feature
>          Components: metrics
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>
> First, if there is an existing solution for this, I would close this jira. I also realize we already have various overlapping solutions; creating another solution isn't necessarily the best approach. However, I couldn't find anything that meets the need, so I am opening this jira for discussion.
> We have some scenarios in HBase/MapReduce/HDFS that require logging a large number of dynamic metrics. They can be used for troubleshooting, better measurement of the system, and scorecards. For example:
> 1. HBase: get metrics, such as requests per second, that are specific to a table or column family.
> 2. MapReduce job history analysis: find all the job IDs that were submitted, completed, etc. in a specific time window.
> For troubleshooting, what people usually do today is: 1) use the current machine-level metrics to find out which machine has the issue; 2) go to that machine and analyze the local log.
> The characteristics of this kind of metrics:
> 1. They can't be predefined; the key, such as a table name or job ID, is dynamic.
> 2. The number of such metrics could be much larger than what the current metrics framework can handle.
> 3. We don't have a scenario that requires near-real-time query support; the delay from when a metric is generated to when it is available for query can be as long as an hour.
> 4. How the data is consumed is highly application specific.
> Some ideas:
> 1. Provide some interface for any application to log data.
> 2. The metrics can be written to log files. The log files or log entries will be loaded into HBase or HDFS asynchronously. That could go to a separate cluster.
> 3. To consume such data, an application could run a MapReduce job on the log files for aggregation, or do random reads directly from HBase.
> Comments?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
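To make the "Some ideas" proposal concrete, here is a minimal sketch of ideas #1 and #2: an interface that accepts metrics under keys chosen at runtime (table names, job IDs) and periodically flushes them as log lines for an asynchronous loader to ship into HBase or HDFS. The class name `DynamicMetricsLog` and the key/log-line format are hypothetical illustrations, not anything specified in the issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: counters are keyed by arbitrary, runtime-composed
// strings, so nothing needs to be predefined in a metrics registry.
public final class DynamicMetricsLog {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Increment a counter under a dynamic key (e.g. a table name or job ID).
    public void incr(String key, long delta) {
        counters.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
    }

    // Emit one timestamped, tab-separated line per counter and reset it.
    // In a real system this would append to a log file that an async
    // process later loads into HBase/HDFS (idea #2 above).
    public String flush() {
        StringBuilder sb = new StringBuilder();
        long ts = System.currentTimeMillis();
        for (Map.Entry<String, AtomicLong> e : counters.entrySet()) {
            sb.append(ts).append('\t')
              .append(e.getKey()).append('\t')
              .append(e.getValue().getAndSet(0)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        DynamicMetricsLog log = new DynamicMetricsLog();
        // Keys are composed at runtime -- none of these are predefined.
        log.incr("table=usertable,cf=info,requests", 3);
        log.incr("jobid=job_201109_0042,state=completed", 1);
        System.out.print(log.flush());
    }
}
```

Because consumption is application specific (point 4 above), the flat timestamped key/value lines deliberately carry no schema; a MapReduce job can aggregate them however each application needs (idea #3).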