What are the graphs about? Can you say what the metric is, put it in context with others such as number of ops and something like GC, and then graph it over a longer period?

On the start/stop of the metrics system: it is not pretty, but it is the only recourse in Hadoop metrics for clearing out metrics that are no longer being updated. In particular, when a region goes away, either because it split or was removed, the stop/start operation is how the associated metrics are removed.

St.Ack

On Fri, Aug 12, 2016 at 1:19 AM, Sterfield <[email protected]> wrote:

> Hello,
>
> I have a small issue on Region Servers. I'm gathering all metrics using JMX
> but from time to time, I can see some huge spikes in my graphs [1].
>
> At first, I was thinking that the counter had wrapped, but by displaying
> the "real" value (not the rate), you can see that two different values has
> been sent to OpenTSDB [2] [3]. So when doing the derivative, the result is
> a big number, far bigger than the others, hence the spike.
>
> By watching the log at the same time, I can see that the Hbase metrics
> system has been stopped then restarted few seconds before. However this
> restart of the metric systems happen every 5 minutes on the region servers,
> and I don't see spikes in my metric systems so often.
>
> 2016-08-11 05:40:08,174 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl:
> Stopping HBase metrics system...
>
> Does anyone has some clue on what's happening ?
> I'm using HBase 1.2.2.
>
> Thanks,
>
> Guillaume
>
> [1] : https://www.dropbox.com/s/d6v2rm7slvg27kk/JMX%20spike.png?dl=0
> [2] :
> https://www.dropbox.com/s/ehktdebnzmu2ac7/JMX%20real%20value%201.png?dl=0
> [3] :
> https://www.dropbox.com/s/75cy4gde8ifsfqy/JMX%20real%20value%202.png?dl=0
>
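
To illustrate the derivative problem described in the quoted message, here is a minimal sketch of how a rate computed from raw counter samples reacts when one sample is out of line with its neighbours. It assumes, purely for illustration, that a counter restarts from a lower value; the thread does not confirm that this is what happens on the region servers. The sample values and both function names are hypothetical and are not part of HBase, Hadoop metrics, or OpenTSDB.

```python
from typing import List, Optional, Tuple

# (timestamp in seconds, raw counter value); hypothetical data, not taken from the graphs
Sample = Tuple[float, float]


def naive_rates(samples: List[Sample]) -> List[float]:
    """Per-second rate between consecutive samples, with no reset handling."""
    return [
        (v1 - v0) / (t1 - t0)
        for (t0, v0), (t1, v1) in zip(samples, samples[1:])
    ]


def reset_aware_rates(samples: List[Sample]) -> List[Optional[float]]:
    """Same rate, but an interval where the counter went down is reported as None.

    Assumption: a decrease means the counter restarted, so that interval
    carries no usable rate and should not be plotted.
    """
    rates: List[Optional[float]] = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        delta = v1 - v0
        rates.append(None if delta < 0 else delta / (t1 - t0))
    return rates


# Counter grows by 100 every 10 s, then restarts from a low value at t=20.
samples = [(0, 1000), (10, 1100), (20, 50), (30, 150)]

print(naive_rates(samples))        # [10.0, -105.0, 10.0]
print(reset_aware_rates(samples))  # [10.0, None, 10.0]
```

The naive derivative turns the single out-of-line sample into a value far from the steady 10/s, which is the kind of spike a graph would show. Dropping the affected interval hides it, at the cost of a gap in the series.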
