Question regarding scalability of regionservers

Brad McCarty Tue, 16 Feb 2010 19:29:24 -0800

Hello all, we're looking at using HBase as the backend datastore for a 
large-scale site where many Tomcat app servers would access HBase in real time. 
Our data access pattern is not completely random: some common rows would be 
read from many app servers.


I read in another post that if one has a "hot" row in a table, meaning very 
heavy read access to the same row, that the regionserver managing the region 
with that row can become a single bottleneck.  
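
As a rough illustration of that concern, here is a toy model, not HBase code. 
It assumes only that each row key belongs to exactly one region and each region 
is served by one regionserver. The region boundaries, the server names 
(rs1..rs3), and the class and method names are all made up for the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class HotRowSketch {
    // Hypothetical layout: region start key -> regionserver hosting that region.
    static final TreeMap<String, String> REGIONS = new TreeMap<>();
    static {
        REGIONS.put("", "rs1");   // rows from "" up to (not including) "g"
        REGIONS.put("g", "rs2");  // rows from "g" up to (not including) "p"
        REGIONS.put("p", "rs3");  // rows from "p" onward
    }

    // A row is served by the region with the greatest start key <= the row key.
    static String serverFor(String rowKey) {
        return REGIONS.floorEntry(rowKey).getValue();
    }

    // Count how many of the requested reads land on each server.
    static Map<String, Integer> loadPerServer(String[] requestedRows) {
        Map<String, Integer> load = new HashMap<>();
        for (String row : requestedRows) {
            load.merge(serverFor(row), 1, Integer::sum);
        }
        return load;
    }

    public static void main(String[] args) {
        // Four reads of the same "hot" row plus two reads of other rows.
        String[] reads = {"home", "home", "home", "home", "alice", "zoe"};
        System.out.println(loadPerServer(reads));
    }
}
```

In this model every read of "home" goes to the same server no matter how many 
servers exist, so adding regionservers spreads the other rows but not the hot 
one.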

Is my understanding accurate?  If so, then assuming I can cache the data in the 
memstore, is CPU utilization likely to become the limiting resource on that 
regionserver?  Also, if I'm hitting the regionserver from many client servers 
(Tomcat app servers), will the overhead of managing socket connections 
overwhelm that regionserver?

If that's true, are there any other steps that can be taken to mitigate that 
risk, other than buying bigger hardware?

Thanks very much,
Brad McCarty
