[jira] [Comment Edited] (HBASE-21301) Heatmap for key access patterns

2018-10-31 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671045#comment-16671045
 ] 

Allan Yang edited comment on HBASE-21301 at 11/1/18 2:49 AM:
-

[~archana.katiyar] do you start a brand new cluster (no zk node) or start from 
an existing one?



> Heatmap for key access patterns
> ---
>
> Key: HBASE-21301
> URL: https://issues.apache.org/jira/browse/HBASE-21301
> Project: HBase
>  Issue Type: Improvement
>Reporter: Archana Katiyar
>Assignee: Archana Katiyar
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21301.v1.patch
>
>
> Google recently released a beta feature for Cloud Bigtable which presents a 
> heat map of the keyspace. *Given how hotspotting comes up now and again here, 
> this is a good idea for giving HBase ops a tool to be proactive about it.* 
> >>>
> Additionally, we are announcing the beta version of Key Visualizer, a 
> visualization tool for Cloud Bigtable key access patterns. Key Visualizer 
> helps debug performance issues due to unbalanced access patterns across the 
> key space, or single rows that are too large or receiving too much read or 
> write activity. With Key Visualizer, you get a heat map visualization of 
> access patterns over time, along with the ability to zoom into specific key 
> or time ranges, or select a specific row to find the full row key ID that's 
> responsible for a hotspot. Key Visualizer is automatically enabled for Cloud 
> Bigtable clusters with sufficient data or activity, and does not affect Cloud 
> Bigtable cluster performance. 
> <<<
> From 
> [https://cloudplatform.googleblog.com/2018/07/on-gcp-your-database-your-way.html]
> (Copied this description from the write-up by [~apurtell], thanks Andrew.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21301) Heatmap for key access patterns

2018-10-17 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653869#comment-16653869
 ] 

Andrew Purtell edited comment on HBASE-21301 at 10/17/18 5:05 PM:
--

Me
{quote}
Maybe we could sample reads and writes at the HRegion level and keep the 
derived stats in an in-memory data structure in the region. (Much lower 
overhead to keep it in-memory and local than attempt to persist to a real 
table.) We would persist relevant stats from this datastructure into the store 
files written during flushes and compactions.
{quote}

[~allan163] 
bq.  For example, we can record the hit count for a certain data block and keep 
the data in a memory structure. So that we can generate a heatmap for data 
block. I think it can narrow down the hot key to a smaller granularity than an 
hfile range, which is too big.

I agree it can be done at the block granularity. We could store hit counts per 
block in meta blocks. 

Overall, with the approach that records low-level, fine-grained statistics into 
hfiles, it's easy to see how reads can be tracked this way; it's less clear what 
to do about writes. 
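The per-block idea can be sketched with a small in-memory structure. The class and method names here are hypothetical, not HBase internals: each block read bumps a counter keyed by block offset, and a snapshot of the counts could later be serialized into an hfile meta block at compaction time.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical per-block read hit counter, keyed by block offset within an hfile.
// At compaction time, a snapshot of these counts could be written into a meta block.
class BlockHitCounter {
    private final ConcurrentHashMap<Long, LongAdder> hitsByBlockOffset = new ConcurrentHashMap<>();

    // Called on every block read that hits this file (or on a sampled subset of reads).
    public void recordHit(long blockOffset) {
        hitsByBlockOffset.computeIfAbsent(blockOffset, k -> new LongAdder()).increment();
    }

    // Ordered snapshot, suitable for serialization into an hfile meta block.
    public Map<Long, Long> snapshot() {
        Map<Long, Long> out = new TreeMap<>();
        hitsByBlockOffset.forEach((off, adder) -> out.put(off, adder.sum()));
        return out;
    }
}
```

`LongAdder` keeps the hot read path cheap under contention, which matters if this runs per block read rather than on a sample.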

I advised [~archana.katiyar] to start with region granularity, building on the 
region level metrics for reads and writes that are already available, to lower 
the implementation effort for the first version of this. I also advised using 
the OpenTSDB schema as inspiration for efficient storage and extensibility. At 
this time this table would only store region read and write metrics to support 
this use case, but going forward the stats table will be available and 
potentially very useful for other use cases. I think this is another point in 
favor of using a table here. The suggestions above are great, especially enabling 
the date-tiered compaction policy on the table by default. 

Also, we don't need to auto-create the table if it doesn't exist, if that is 
going to be a problem. This is expected to be a one-time operation over the 
lifetime of a cluster. An admin can do it when setting up the cluster. We can 
document how to execute a small hbase shell script that creates the table where 
we also document how to enable the feature. 




[jira] [Comment Edited] (HBASE-21301) Heatmap for key access patterns

2018-10-12 Thread Archana Katiyar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647873#comment-16647873
 ] 

Archana Katiyar edited comment on HBASE-21301 at 10/12/18 12:48 PM:


[~apurtell] I was working on the schema to store access stats, based on the 
OpenTSDB schema. OpenTSDB is very good at minimizing the row key size by using 
binary encoding. _But one overhead I see is having a separate table to store 
UID mappings_. For our use case, we need to store UIDs corresponding to the table 
name, metric name (like read count / write count), region name, etc.

We can have a row key like -



I have started the key with the table name instead of the timestamp to better 
distribute the load.

In the future, when we enable stats per store file, we can also append the store 
file name UID to the row key.
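The concrete row key example from this comment did not survive the archive. An OpenTSDB-style encoding along the lines described (fixed-width binary UIDs, table UID first, timestamp last) might look like the sketch below; the field widths and names are assumptions for illustration only.

```java
import java.nio.ByteBuffer;

// Hypothetical OpenTSDB-style row key: fixed-width binary UIDs instead of names.
// Layout: [tableUID:2][metricUID:2][regionUID:2][epochSeconds:4] = 10 bytes.
// Leading with the table UID (not the timestamp) spreads write load across
// regions of the stats table, as described in the comment above.
class StatsRowKey {
    static final int KEY_LENGTH = 10;

    static byte[] encode(short tableUid, short metricUid, short regionUid, int epochSeconds) {
        return ByteBuffer.allocate(KEY_LENGTH)
                .putShort(tableUid)
                .putShort(metricUid)
                .putShort(regionUid)
                .putInt(epochSeconds)
                .array();
    }
}
```

Because the UIDs are fixed-width, a scan over one table's stats is just a prefix scan on its 2-byte table UID.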








[jira] [Comment Edited] (HBASE-21301) Heatmap for key access patterns

2018-10-12 Thread Archana Katiyar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647849#comment-16647849
 ] 

Archana Katiyar edited comment on HBASE-21301 at 10/12/18 12:20 PM:


*Summary of the work done till now*: 
 * Store data in an HBase table (new system table)
 ** We will store stats for all the regions corresponding to a given table in 
this table.
 ** TODO: Decide upon the schema; [~andrew.purt...@gmail.com] suggested taking 
the [OpenTSDB 
schema|http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html] as a 
reference.

 * Add a ScheduledChore in the HRegionServer class; this chore will wake up 
every x minutes (configurable) and store the read/write counts for the last x 
minutes in the table. In the future, the same chore can be utilized to store 
other stats as well. There were two options to record read and write stats for 
the last x minutes in the HRegion class:
 ** Introduce new read and write counters which increment based on the 
operations performed by the user. The ScheduledChore would reset the new 
counters once it has recorded the current values.
 ** Use the existing read and write counters of HRegion. The ScheduledChore 
would take care of computing the stats for the last x minutes (because the 
existing counters track stats from the time the region went live).

I am implementing this using the existing counters, to make sure the performance 
impact of this change per read/write operation is minimal.
 * Add a new jsp file which reads data from the table and displays it in the 
form of a heatmap. The logic of this jsp file is simple: use the table name and 
an epoch time to query stats for all the regions which were live at that time.
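The existing-counters option above amounts to the chore keeping a snapshot of the monotonic lifetime counters and reporting per-interval deltas. A minimal sketch, with class and method names that are illustrative rather than HBase's actual API:

```java
// Illustrative chore logic: derive per-interval read/write counts from
// monotonically increasing lifetime counters, so the hot read/write path
// pays nothing extra (no new counters, no resets on the request path).
class RegionStatsChore {
    private long lastReadCount;
    private long lastWriteCount;

    // The arguments are cumulative counts since the region came online
    // (like HRegion's existing counters). Returns {readsInInterval,
    // writesInInterval} and advances the snapshot for the next call.
    public long[] pollDeltas(long currentReadCount, long currentWriteCount) {
        long readDelta = currentReadCount - lastReadCount;
        long writeDelta = currentWriteCount - lastWriteCount;
        lastReadCount = currentReadCount;
        lastWriteCount = currentWriteCount;
        return new long[] { readDelta, writeDelta };
    }
}
```

With x-minute scheduling, the chore would call pollDeltas once per period per region and write the result into the stats table.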

Also, [~apurtell] suggested eventually storing information per store file (maybe 
not in v1 of this feature, but it's a good goal to have). In his own words -

_"Regarding what granularity to use for statistics collection, you are 
definitely on the right track to start with the region as the smallest unit to 
consider. I believe Google's design of Key Visualizer can drill down to 
narrower, sub-region, scopes, so I have been thinking about how to achieve 
that, if we want. I would not recommend doing it for the first cut because we 
already have support for region level metrics that you can build on. However, 
imagine during compaction we collect statistics over all K-Vs in every HFile, 
then write the statistics into the hfile file trailer, then retrieve those 
statistics later using a new API. This will let us do things like alter 
compaction strategy decisions with greater awareness of the characteristics of 
the data in the store (see W-5473921 Enhance compaction upgrade decision to 
consider file statistics) or potentially generate heatmaps of key access rates 
at a store file granularity. Each store file will give you a key-range and a 
read and write access count that you can aggregate. The start and end keys of 
those ranges will be different from region start and end keys because store 
files only have a subset of all keys in the region. This lets us find hot 
regions that are narrower in scope than the region, which will be more precise 
information on how to, potentially, split the keyspace to better distribute the 
load, or to narrow down what aspect of application data model or implementation 
is responsible for the hotspot. I don't know _how_ to track key access stats 
with sub region granularity, though. We would need this information on hand to 
write into the hfile during compaction. Maybe we could sample reads and writes 
at the HRegion level and keep the derived stats in an in-memory data structure 
in the region. (Much lower overhead to keep it in-memory and local than attempt 
to persist to a real table.) We would persist relevant stats from this 
datastructure into the store files written during flushes and compactions."_
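The sampling idea at the end of the quote can be sketched as a 1-in-N sampler that buckets sampled accesses by a short row-key prefix; the resulting histogram is what would be persisted into store files at flush/compaction time. All names and parameters here are hypothetical, not from the patch:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative in-memory stats for sub-region granularity: sample one in every
// N accesses and bucket sampled row keys by a short prefix. The snapshot would
// be flushed into store files during flush/compaction, per the quote above.
class SampledKeyStats {
    private final int sampleEveryN;
    private final int prefixLen;
    private long opCount;
    private final Map<String, Long> hitsByPrefix = new TreeMap<>();

    SampledKeyStats(int sampleEveryN, int prefixLen) {
        this.sampleEveryN = sampleEveryN;
        this.prefixLen = prefixLen;
    }

    // Called from the read/write path; only every Nth call does any real work,
    // keeping the overhead on the hot path small.
    public void maybeRecord(String rowKey) {
        if (++opCount % sampleEveryN != 0) {
            return;
        }
        String prefix = rowKey.substring(0, Math.min(prefixLen, rowKey.length()));
        hitsByPrefix.merge(prefix, 1L, Long::sum);
    }

    // Ordered histogram of sampled access counts by key prefix.
    public Map<String, Long> snapshot() {
        return new TreeMap<>(hitsByPrefix);
    }
}
```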

 

