[ 
https://issues.apache.org/jira/browse/HBASE-21991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786224#comment-16786224
 ] 

Sakthi commented on HBASE-21991:
--------------------------------

Oh yes! Thanks [~xucang]. Somehow I overlooked that it's a ConcurrentHashMap, so 
I can now remove the synchronization. I still think there could be data 
consistency problems, as we maintain two separate data members (requestsMap and 
registry) to represent the state of the class. At any point in time we want a 
value to be either put/registered in both or removed from both. I feel 
operations such as the ones below should be done while holding a lock, so that 
the two operations are atomic. What do you think?
{code:java}
...
registry.meter(requestMeter);
requestsMap.put(requestMeter, registry.get(requestMeter));
...
requestsMap.remove(meter);
registry.remove(meter);
...
{code}
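
A minimal sketch of the locking idea, under stated assumptions: the class name PairedMeterState, its methods, and the lock field are hypothetical, and the registry is modelled as a plain map rather than the real metrics registry API. It only illustrates guarding both updates with one lock so that no thread observes a meter present in one structure but not the other.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the coprocessor's state; not the actual MetaMetrics code.
class PairedMeterState {
  // Assumption: a plain map models the metric registry for illustration only.
  private final Map<String, Object> registry = new ConcurrentHashMap<>();
  private final Map<String, Object> requestsMap = new ConcurrentHashMap<>();
  // A single lock guards every compound update that touches both maps.
  private final Object lock = new Object();

  void register(String meterName) {
    synchronized (lock) {
      // Create (or reuse) the meter in the registry, then mirror it in requestsMap.
      Object meter = registry.computeIfAbsent(meterName, k -> new Object());
      requestsMap.put(meterName, meter);
    }
  }

  void unregister(String meterName) {
    synchronized (lock) {
      // Remove from both structures inside the same critical section.
      requestsMap.remove(meterName);
      registry.remove(meterName);
    }
  }

  boolean isTracked(String meterName) {
    synchronized (lock) {
      return registry.containsKey(meterName) && requestsMap.containsKey(meterName);
    }
  }

  // True when both maps hold exactly the same set of meter names.
  boolean isConsistent() {
    synchronized (lock) {
      return registry.keySet().equals(requestsMap.keySet());
    }
  }
}
```

Each map is thread-safe on its own, but the pair of calls is not; the shared lock is what makes the two-step update appear atomic to other threads that also take the lock.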

> Fix MetaMetrics issues - [Race condition, Faulty remove logic], few 
> improvements
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-21991
>                 URL: https://issues.apache.org/jira/browse/HBASE-21991
>             Project: HBase
>          Issue Type: Bug
>          Components: Coprocessors, metrics
>            Reporter: Sakthi
>            Assignee: Sakthi
>            Priority: Major
>         Attachments: hbase-21991.master.001.patch
>
>
> Here is a list of the issues related to the MetaMetrics implementation:
> +*Bugs*+:
>  # [_Lossy counting for top-k_] *Faulty remove logic of non-eligible meters*: 
> Under certain conditions, we might end up storing/exposing all the meters 
> rather than only the approximate top-k.
>  # MetaMetrics can throw an NPE because of a *Race Condition*, which results 
> in the RS aborting.
> +*Improvements*+:
>  # With a high number of regions in the cluster, exposing metrics for each 
> region blows up the JMX output from ~140 KB to 100+ MB, depending on the 
> number of regions. It's better to use *lossy counting to maintain top-k for 
> region metrics* as well.
>  # As the lossy meters do not represent actual counts, I think it would be 
> better to *rename the meters to include "lossy" in the name*. That would be 
> more informative while monitoring the metrics, and there would be less 
> confusion between actual counts and lossy counts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
