[ https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647849#comment-16647849 ]
Archana Katiyar edited comment on HBASE-21301 at 10/12/18 12:20 PM:
--------------------------------------------------------------------
*Summary of the work done so far*:
* Store data in an HBase table (new system table)
** We will store stats for all the regions belonging to a given table in this table.
** TODO: Decide upon the schema; [~andrew.purt...@gmail.com] suggested taking a reference from the [OpenTSDB schema|http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html].
* Add a ScheduledChore in the HRegionServer class; this chore will wake up every x minutes (configurable) and store the read/write counts for the last x minutes in the table. In the future, the same chore can be used to store other stats as well. There were two options for recording read and write stats for the last x minutes in the HRegion class:
** Introduce new read and write counters that increment on each operation performed by the user. The ScheduledChore would reset these counters once it has recorded the current values.
** Use the existing read and write counters of HRegion. The ScheduledChore must then compute the stats for the last x minutes itself, because the existing counters accumulate from the time the region went live.
I am implementing the second option (existing counters) so that the per-read/write performance impact of this change is minimal.
* Add a new jsp file which reads data from the table and displays it as a heatmap. The logic of this jsp file is simple: use the table name and an epoch time to query stats for all the regions which were live at that time.

Also, [~apurtell] suggested eventually storing information per store file (maybe not in v1 of this feature, but it's a good goal to have). In his own words - _"Regarding what granularity to use for statistics collection, you are definitely on the right track to start with the region as the smallest unit to consider.
I believe Google's design of Key Visualizer can drill down to narrower, sub-region, scopes, so I have been thinking about how to achieve that, if we want. I would not recommend doing it for the first cut because we already have support for region level metrics that you can build on. However, imagine during compaction we collect statistics over all K-Vs in every HFile, then write the statistics into the hfile file trailer, then retrieve those statistics later using a new API. This will let us do things like alter compaction strategy decisions with greater awareness of the characteristics of the data in the store (see W-5473921 Enhance compaction upgrade decision to consider file statistics) or potentially generate heatmaps of key access rates at a store file granularity. Each store file will give you a key-range and a read and write access count that you can aggregate. The start and end keys of those ranges will be different from region start and end keys because store files only have a subset of all keys in the region. This lets us find hot regions that are narrower in scope than the region, which will be more precise information on how to, potentially, split the keyspace to better distribute the load, or to narrow down what aspect of application data model or implementation is responsible for the hotspot. I don't know _how_ to track key access stats with sub region granularity, though. We would need this information on hand to write into the hfile during compaction. Maybe we could sample reads and writes at the HRegion level and keep the derived stats in an in-memory data structure in the region. (Much lower overhead to keep it in-memory and local than attempt to persist to a real table.) 
We would persist relevant stats from this datastructure into the store files written during flushes and compactions."_
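The second option above — keep HRegion's existing cumulative counters and let the chore derive per-interval deltas — can be sketched in plain Java. This is only a sketch: {{RegionStatsChore}} and {{recordInterval}} are hypothetical names, and a HashMap stands in for the real HRegion counter and system-table APIs.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal sketch of the periodic stats chore's bookkeeping. The key
 * point: the chore never resets the region's cumulative read/write
 * counters; it remembers the value it saw on the previous run and
 * records only the delta for the last interval.
 */
public class RegionStatsChore {
    // Cumulative count seen per region on the previous chore run.
    private final Map<String, Long> lastSeenReads = new HashMap<>();

    /**
     * Given the region's current cumulative read count, return the number
     * of reads since the previous run (the value that would be persisted
     * to the stats table), and remember the new snapshot.
     */
    public long recordInterval(String regionName, long cumulativeReads) {
        long previous = lastSeenReads.getOrDefault(regionName, 0L);
        lastSeenReads.put(regionName, cumulativeReads);
        return cumulativeReads - previous;
    }
}
```

Because the chore only reads the counters that HRegion already maintains, the per-operation read/write path is untouched, which is the rationale for preferring this option.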
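The jsp page's lookup (table name plus epoch time, returning stats for all regions live at that time) might be sketched like this. The {{table,timeBucket,region}} row-key layout is only an assumption for illustration — the actual schema is still a TODO above — and a sorted map stands in for an HBase Scan over the stats table.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Minimal sketch of the heatmap page's query logic. Row keys in the
 * hypothetical stats table are modeled as "table,timeBucket,region"
 * strings in a sorted map, so fetching one heatmap column is a sub-map
 * scan over the "table,timeBucket," prefix -- analogous to an HBase
 * Scan with start/stop rows. (A real schema would use fixed-width or
 * binary time buckets so keys sort correctly.)
 */
public class HeatmapQuery {
    private final TreeMap<String, Long> statsTable = new TreeMap<>();

    public void put(String table, long bucket, String region, long count) {
        statsTable.put(table + "," + bucket + "," + region, count);
    }

    /** All per-region counts for one table at one time bucket. */
    public Map<String, Long> regionCountsAt(String table, long bucket) {
        String prefix = table + "," + bucket + ",";
        // Range scan over [prefix, prefix + max char), i.e. all rows
        // for this table and time bucket, one per live region.
        return statsTable.subMap(prefix, prefix + "\uffff");
    }
}
```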
> Heatmap for key access patterns
> -------------------------------
>
> Key: HBASE-21301
> URL: https://issues.apache.org/jira/browse/HBASE-21301
> Project: HBase
> Issue Type: Improvement
> Reporter: Archana Katiyar
> Assignee: Archana Katiyar
> Priority: Major
>
> Google recently released a beta feature for Cloud Bigtable which presents a heat map of the keyspace. *Given how hotspotting comes up now and again here, this is a good idea for giving HBase ops a tool to be proactive about it.*
> >>>
> Additionally, we are announcing the beta version of Key Visualizer, a visualization tool for Cloud Bigtable key access patterns. Key Visualizer helps debug performance issues due to unbalanced access patterns across the key space, or single rows that are too large or receiving too much read or write activity. With Key Visualizer, you get a heat map visualization of access patterns over time, along with the ability to zoom into specific key or time ranges, or select a specific row to find the full row key ID that's responsible for a hotspot. Key Visualizer is automatically enabled for Cloud Bigtable clusters with sufficient data or activity, and does not affect Cloud Bigtable cluster performance.
> <<<
> From [https://cloudplatform.googleblog.com/2018/07/on-gcp-your-database-your-way.html]
> (Copied this description from the write-up by [~apurtell], thanks Andrew.)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)