Hi guys,

On what factors does HBase read latency primarily depend? What would be the
approximate theoretical limit for read latency in v0.90.1 on a cluster of 7
nodes (16 cores/16 GB RAM on 5 machines and 36 GB RAM on the other two)? I
have an
application that generates around 1000 rows/s to be written to HBase. I then
have to read this data and process it at regular intervals. Write speed is
not a problem, as the cluster seems able to write at the required rate. But
to process this data I also need a read speed of at least 1000 rows/s, since
processing has to keep up with data generation. So far I seem to be getting
only around 200-300 rows/s. I have LZO compression on the tables, and I
haven't tried in-memory yet because my RAM usage is already too high while
running jobs. Is it possible to achieve this read speed, and what can I do to
improve it? How much would adding more nodes or more RAM help? Please let me
know if this question is too broad to answer or if you need more details.

Thanks,
Hari
