[ https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087418#comment-13087418 ]
Ming Ma commented on HBASE-4145:
--------------------------------

Ah, thanks for pointing this out, Stack. We can use this for #3. The ClientScanner will call scan.setAttribute with well-defined metrics property names. TableInputFormat will call scan.getAttribute to read the metrics values and pass them on to the MapReduce framework as counters.

> Provide metrics for hbase client
> --------------------------------
>
>                 Key: HBASE-4145
>                 URL: https://issues.apache.org/jira/browse/HBASE-4145
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>
> Sometimes it is useful to get some metrics from the hbase client's point of
> view. This will help in understanding the metrics for the
> scan/TableInputFormat map job scenario.
> What to capture, for example, for each ResultScanner object:
> 1. The number of RPC calls to RSs.
> 2. The delta time between consecutive RPC calls in the current serialized
> scan implementation.
> 3. The number of RPC retries to RSs.
> 4. The number of NotServingRegionExceptions received.
> 5. The number of remote RPC calls. This excludes calls where the hbase
> client calls an RS on the same machine.
> 6. The number of regions accessed.
> How to capture:
> 1. The metrics framework works for a fixed number of metrics. It doesn't fit
> this scenario.
> 2. Use some TBD solution in HBase to capture such dynamic metrics. If we
> assume there is a solution in HBase that the HBase client can use to log
> this kind of metrics, TableInputFormat can pass in the mapreduce task ID as
> an application scan ID to the HBase client, as a small addition to the
> existing scan API, and the HBase client can log metrics under that ID. That
> will allow later query and analysis of the metrics data for a specific map
> reduce job.
> 3. Expose via MapReduce counters. This lacks certain features: for example,
> there is no good way to access the metrics per map instance, and the
> MapReduce framework only sums counter values, so it is tricky to find the
> max of a given metric across all mapper instances. However, it might be good
> enough for now. With this approach, the metrics values will be available via
> MapReduce counters.
> a) Have ResultScanner return a new ResultScannerMetrics interface.
> b) TableInputFormat will access data from ResultScannerMetrics and populate
> MapReduce counters accordingly.
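
The attribute-based handoff described in the comment (the scanner side writes metrics with setAttribute, the input-format side reads them with getAttribute and turns them into counters) might look roughly like the sketch below. This is only an illustration under assumptions, not the actual patch: AttributeHolder is a hypothetical stand-in for the Scan object, the metric property names are invented here, and a plain Map stands in for MapReduce counters.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Sketch: passing scan metrics through per-scan attributes. */
public class ScanMetricsSketch {

    // Hypothetical metric property names; the thread does not define them.
    public static final String RPC_CALLS = "scan.metrics.rpcCalls";
    public static final String RPC_RETRIES = "scan.metrics.rpcRetries";

    /** Hypothetical stand-in for the scan's attribute map (name -> bytes). */
    public static class AttributeHolder {
        private final Map<String, byte[]> attributes = new HashMap<>();

        public void setAttribute(String name, byte[] value) {
            attributes.put(name, value);
        }

        public byte[] getAttribute(String name) {
            return attributes.get(name);
        }
    }

    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static long toLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    /** Scanner side: add delta to the metric stored under the given name. */
    public static void increment(AttributeHolder scan, String name, long delta) {
        byte[] current = scan.getAttribute(name);
        long base = (current == null) ? 0L : toLong(current);
        scan.setAttribute(name, toBytes(base + delta));
    }

    /** Input-format side: copy metrics into counters (a Map stands in here). */
    public static void publish(AttributeHolder scan, Map<String, Long> counters,
                               String... names) {
        for (String name : names) {
            byte[] raw = scan.getAttribute(name);
            if (raw != null) {
                // Counters are summed, matching the sum-only behavior noted above.
                counters.merge(name, toLong(raw), Long::sum);
            }
        }
    }

    /** Simulates three RPC calls and one retry, then publishes the metrics. */
    public static Map<String, Long> demo() {
        AttributeHolder scan = new AttributeHolder();
        for (int i = 0; i < 3; i++) {
            increment(scan, RPC_CALLS, 1);
        }
        increment(scan, RPC_RETRIES, 1);

        Map<String, Long> counters = new HashMap<>();
        publish(scan, counters, RPC_CALLS, RPC_RETRIES);
        return counters;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the counters are only ever summed, a per-mapper maximum cannot be recovered from them, which is the limitation the issue description mentions for approach #3.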