What's your blockCacheHitCachingRatio? It tells you the ratio of scans
actually served from the block cache to the scans that requested caching
(the default). You can get that from the RS web UI. What you are seeing
can map to almost anything, for example: is scanner caching (client side)
What is it that you are observing now?
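To make the ratio concrete, here is a minimal sketch of how such a hit ratio is derived from two counters. The function and parameter names are made up for illustration and are not HBase metric names; the counter values are invented.

```python
def hit_caching_ratio(hit_caching_count: int, miss_caching_count: int) -> float:
    """Fraction of caching-enabled requests that were served from the cache."""
    requests = hit_caching_count + miss_caching_count
    if requests == 0:
        # No caching-enabled requests yet, so there is nothing to report.
        return 0.0
    return hit_caching_count / requests


# Hypothetical counter values, for illustration only.
ratio = hit_caching_ratio(hit_caching_count=9_500, miss_caching_count=500)
print(f"hit caching ratio: {ratio:.1%}")
```

A ratio well below 100% would suggest that part of the data is still being read from disk rather than from the block cache.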
Regards
Ram
On Mon, Jun 3, 2013 at 2:00 PM, Liu, Raymond raymond@intel.com wrote:
Hi
If all the data is already in the RS block cache, then what's the
typical scan latency for scanning a few rows from a,
say, several-GB table ( with dozens of
HBase doesn't know that all the data is in the block cache. It first has
to look at HTable to get the block_id (tablename + offset), then find that
block in the block cache.
So if all the data is in the block cache, you only avoid reading it from the
HFile directly, which saves some I/O time. But it depends on your data size.
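The lookup described above can be sketched as a small read-through cache. This is an illustration only, not HBase's implementation: the `BlockCache` class, the `load_block` callback, the (name, offset) key, and the LRU eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable, Tuple

BlockKey = Tuple[str, int]  # (name, offset), following the block_id above


class BlockCache:
    """Toy read-through cache with LRU eviction (hypothetical, not HBase code)."""

    def __init__(self, capacity: int, load_block: Callable[[str, int], bytes]):
        self.capacity = capacity
        self.load_block = load_block  # stand-in for reading from the file
        self.blocks: "OrderedDict[BlockKey, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, name: str, offset: int) -> bytes:
        key = (name, offset)
        if key in self.blocks:
            # Cache hit: no file I/O, but the key lookup still happens.
            self.hits += 1
            self.blocks.move_to_end(key)
            return self.blocks[key]
        # Cache miss: pay the I/O cost, then keep the block for next time.
        self.misses += 1
        data = self.load_block(name, offset)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        return data
```

Even with a 100% hit rate, every read still goes through the key lookup; the cache only removes the file read.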
if
It depends on many environment-related variables, and on the data as well.
But to give you a number after all:
One of our clusters is on EC2: 6 RS on m1.xlarge machines (network
performance 'high' according to AWS). 90% of the time we do reads; our
avg data size is 2K, block cache at 20K, 100
Thanks, Amit.
In my environment, I run dozens of clients concurrently, each reading about
5-20K of data per scan, and the average read latency for cached data is
around 5-20 ms.
So it seems there must be something wrong with my cluster env or application.
Or did you run that with multiple clients?
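For comparing numbers like these, it helps to measure per-scan latency the same way with one client and with many. Below is a rough sketch of such a harness; `scan_once` is a hypothetical callable standing in for one scan against the cluster, and Python threads are used only for illustration (a real HBase client would normally be Java).

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable, List


def timed_call(scan_once: Callable[[], object]) -> float:
    """Run one scan and return its latency in milliseconds."""
    start = time.perf_counter()
    scan_once()
    return (time.perf_counter() - start) * 1000.0


def measure_scan_latency(
    scan_once: Callable[[], object], clients: int, scans_per_client: int
) -> List[float]:
    """Run `clients` concurrent loops and collect every per-scan latency."""

    def client_loop(client_id: int) -> List[float]:
        return [timed_call(scan_once) for _ in range(scans_per_client)]

    with ThreadPoolExecutor(max_workers=clients) as pool:
        per_client = list(pool.map(client_loop, range(clients)))
    return [ms for latencies in per_client for ms in latencies]


if __name__ == "__main__":
    # Stand-in workload: sleep about 1 ms instead of talking to a cluster.
    latencies = measure_scan_latency(lambda: time.sleep(0.001), 12, 50)
    print(f"scans: {len(latencies)}, avg: {mean(latencies):.2f} ms")
```

Running it once with `clients=1` and once with a dozen clients shows whether the latency comes from the scan itself or from contention between clients.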
Depends