Hello,

I am designing an architecture for a website that shows analytics over a huge
quantity of data. The data is stored in one HBase table and needs to be
accessed in a semi-random manner. Typically, a big block of contiguous rowkeys
will be read at once (say a few thousand rows) and some data displayed based
on them. Where these blocks fall within the table is the random aspect.
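Concretely, each page load would translate into a single bounded scan over one block. Roughly the sketch below (Python; the key format, block size, and `build_block_bounds` helper are just placeholders for illustration, and the happybase calls in the comment assume a live cluster):

```python
# Sketch of the access pattern: one contiguous scan per page load.
# build_block_bounds is a hypothetical helper; real rowkeys would
# follow whatever scheme the table actually uses.

def build_block_bounds(block_id: int, block_size: int = 5000):
    """Return (start, stop) rowkeys covering one contiguous block."""
    start = block_id * block_size
    stop = start + block_size
    # Zero-padded keys so lexicographic order matches numeric order.
    return f"{start:012d}".encode(), f"{stop:012d}".encode()

if __name__ == "__main__":
    start, stop = build_block_bounds(block_id=42)
    # Against a live cluster this would be something like (untested here):
    # import happybase
    # conn = happybase.Connection("hbase-host")
    # rows = list(conn.table("analytics").scan(row_start=start, row_stop=stop))
    print(start, stop)
```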

I am trying to figure out how fast I can expect HBase to be. Can I serve these
reads to the webpage directly from HBase and expect real-time page loads
(<1 sec), or do I need a distributed cache like Redis in front of it, so that
when a user requests the same data repeatedly I don't waste time pulling it
from HBase again?

In other words, generally speaking, are HBase and Redis/Memcached redundant,
or is there a strong case for using HBase as the on-disk store and Redis or
Memcached as an in-memory cache to improve performance?
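To make the second option concrete, what I have in mind is a plain cache-aside read: check the cache first, fall back to HBase on a miss, then populate the cache. A minimal sketch (the fetch function and key scheme are made up, and a dict stands in for Redis so it runs standalone; the redis-py equivalent is noted in a comment):

```python
import json

def get_block(block_id, cache, fetch_from_hbase, ttl=300):
    """Cache-aside read: try the cache, fall back to the slow store."""
    key = f"block:{block_id}"          # hypothetical key scheme
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    rows = fetch_from_hbase(block_id)  # the expensive HBase scan
    # A dict ignores ttl; with redis-py it would be cache.set(key, ..., ex=ttl)
    cache[key] = json.dumps(rows)
    return rows

if __name__ == "__main__":
    calls = []
    def fake_fetch(block_id):
        calls.append(block_id)          # record each "HBase" round trip
        return [{"row": block_id, "value": 1}]
    cache = {}
    get_block(7, cache, fake_fetch)
    get_block(7, cache, fake_fetch)
    # Second call is served from the cache; HBase is touched only once.
    print(calls)
```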

Thanks,
Scott
