Hello all, we're looking at using HBase as the backend datastore for a large-scale site where many Tomcat app servers would access HBase in real time. Our data access pattern is not completely random: many app servers would need to read some of the same common rows.
I read in another post that if a table has a "hot" row, meaning very heavy read access to the same row, the regionserver managing the region that contains that row can become a bottleneck. Is my understanding accurate? If so, then assuming I can cache the data in the memstore, will CPU utilization become the likely limiting resource on that regionserver? Also, if I'm hitting the regionserver from many client servers (Tomcat app servers), will the socket connection management overhead overwhelm it? If so, are there any steps that can be taken to mitigate that risk, other than buying bigger hardware?

Thanks very much,
Brad McCarty