Re: Help with Read Latency Metrics Interpretation

Jeremy Carroll Tue, 21 May 2013 09:31:19 -0700

There are more items than just data in the BlockCache. There are bloom
filters, indexes, and data. You should look at your blockCacheHitCount for
each of these item types. You may have a high hit rate on blooms but a low
one on data, for example.
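
As a rough illustration of how an aggregate ratio can hide this, here is a
small Python sketch. The per-type counts are made up (chosen only so that
the total comes out at 87%), and the names are hypothetical, not actual
HBase metric names.

```python
# Hypothetical per-block-type (hits, misses) counts. These numbers are
# invented for illustration and are NOT taken from any real cluster.
counts = {
    "bloom": (500, 10),
    "index": (300, 20),
    "data": (70, 100),
}


def hit_ratio(hits, misses):
    """Fraction of lookups served from cache; 0.0 if there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0


# Hit ratio for each block type on its own.
per_type = {name: hit_ratio(h, m) for name, (h, m) in counts.items()}

# Aggregate hit ratio across all block types.
overall = hit_ratio(
    sum(h for h, _ in counts.values()),
    sum(m for _, m in counts.values()),
)

for name, ratio in per_type.items():
    print(f"{name}: {ratio:.0%}")
print(f"overall: {overall:.0%}")
```

With these made-up counts the overall ratio is 87% while the data blocks
alone are well under 50%.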


The fsRead<xxx> metrics cover reads that go to disk, not reads that are
served from cache (I will look at the source code, but I'm fairly certain
that's the case).
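
To make that concrete, here is a rough Python sketch using the numbers from
the quoted message below. It assumes the histogram values are in
nanoseconds (which matches the 28.1ms reading), that the fsRead latencies
cover only cache misses, and that every cache hit is faster than every disk
read. Note that blockCacheHitRatio counts block requests rather than client
reads, so the mapping is only approximate. The function names are mine, not
HBase's.

```python
NS_PER_MS = 1_000_000

# Values copied from the quoted message; assumed to be nanoseconds.
fs_read_latency_ns = {
    "mean": 5886103.47,
    "median": 6280445,
    "75th": 28117916.5,
    "95th": 53674180.05,
}
block_cache_hit_ratio = 0.87


def to_ms(ns):
    """Convert nanoseconds to milliseconds."""
    return ns / NS_PER_MS


def overall_percentile(disk_percentile, hit_ratio):
    """Map a percentile among disk reads onto all block reads.

    Assumes every cache hit is faster than every disk read, so the
    cache hits occupy the bottom `hit_ratio` share of the distribution.
    """
    return 100 * hit_ratio + disk_percentile * (1 - hit_ratio)


for name, ns in fs_read_latency_ns.items():
    print(f"{name}: {to_ms(ns):.1f} ms")

print(overall_percentile(75, block_cache_hit_ratio))
```

Under those assumptions, the 75th percentile of disk reads sits at roughly
the 96.75th percentile of all block reads, so a 28ms value there is not in
conflict with an 87% hit ratio.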

Running MR jobs on a cluster serving live HBase traffic will have an
impact. The MR jobs compete with HBase for disk IOPS, so expect your read
latency to suffer while they are running.


On Tue, May 21, 2013 at 8:20 AM, amit.mor.m...@gmail.com <
amit.mor.m...@gmail.com> wrote:

> I am on EC2, with HBase 0.94.2 and I can't find an explanation for this
> phenomenon:
> I'm having these metrics on my RegionServers:
>
> blockCacheHitRatio=87%
> fsReadLatencyHistogramMean=  5886103.47
> fsReadLatencyHistogramMedian=6280445
> fsReadLatencyHistogram75th= 28117916.5
> fsReadLatencyHistogram95th= 53674180.05
>
> Now, I am wondering, if 87% of my reads are cached, what could be the
> reason for 75% of the reads falling on 28.1ms - that is, I expect that the
> 75th percentile reads would be from cache - and with cache performance, not
> to mention the 95th percentile. As far as I am concerned, the mean and
> median are great for our use case but the tails are too wide and thick.
>
> Secondly, I've noticed that when we're running mapreduce jobs, the number
> of requestsPerSecond and all fs* and *requestsCount are peaking - afaik the
> mapper (running cascading-maple) just reads directly from HDFS and should
> not have any influence on the RPC's (for example) to the RegionServer and
> the associated metrics, what is going on with the RS metrics ?
>
> Thanks
> Amit
>
