[ https://issues.apache.org/jira/browse/HBASE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056825#comment-13056825 ]
Jason Rutherglen commented on HBASE-4038:
-----------------------------------------

@Nicolas Hot row handling would benefit greatly from a row-level LRU cache, as described in the BigTable paper. With a row cache, the 'cost' of the hotness (seeking into the block) would be reduced to a hash lookup. Agreed, though, that some general diagnosis will (or could) be required before turning on row caching.

> Hot Region : Write Diagnosis
> ----------------------------
>
>                 Key: HBASE-4038
>                 URL: https://issues.apache.org/jira/browse/HBASE-4038
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, regionserver
>    Affects Versions: 0.92.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Riley Patterson
>            Priority: Minor
>
> We should provide a basic way for end users to operationally diagnose hot row
> problems. Thinking about a 2-phase approach:
> 1. Diagnose hot regions
> 2. Inspect those regions/servers to find the hot rows.
> To diagnose hot regions, we could query the master or regionservers for these
> regions + sort. To inspect the regions for hot rows, we could write another
> script to analyze the HLogs on a server and basically do:
> sort log|uniq -n|sort -n|top
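
For illustration only, the second phase in the issue description (counting writes per row and listing the most frequent ones) could be sketched roughly as follows. This assumes the row keys have already been extracted from the HLogs into one key per write; that extraction step is not shown and no HBase API is used. The function name `hottest_rows` and the sample keys are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def hottest_rows(row_keys: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequently written row keys with their write counts.

    row_keys is assumed to hold one entry per write, e.g. row keys pulled
    out of a log dump by some separate (hypothetical) extraction step.
    """
    return Counter(row_keys).most_common(n)


if __name__ == "__main__":
    # Hypothetical sample data: one row key per logged write.
    sample = ["row-17", "row-03", "row-17", "row-42", "row-17", "row-03"]
    for key, count in hottest_rows(sample, n=2):
        print(f"{count}\t{key}")
```

Counting in a hash map avoids the full sort that a sort-then-count shell pipeline needs, which may matter if the logs are large, but it does keep every distinct row key in memory.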