Re: Improving hbase read performance

stack Tue, 17 Feb 2009 21:14:04 -0800

On Tue, Feb 17, 2009 at 11:29 AM, shourabh rawat <[email protected]> wrote:


> Thanks for replying.
> The problem is this:
> I have a distributed setup of HBase over Hadoop (a cluster of 3).
> I have loaded around 4 million entries into HBase.
> Now I want to read from it (read a set of entries).
> Reading sequentially adds on the performance.


What do you mean above by reading sequentially? Are you scanning (getting
a scanner and then calling next through your HBase table)?


>
> I want really good performance (I mean retrieval should average well
> within 10 ms per entry).


You will have to wait for hbase 0.20.0 or do as Erik suggests and put a
cache in front of hbase.  What are you trying to do with hbase?  Serve a
website?
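
To make the cache suggestion concrete, here is a minimal sketch of a read-through LRU cache. It is only an illustration under assumptions: the class name ReadThroughCache and the loader function are hypothetical, and the loader stands in for whatever code does the real HBase lookup. No actual HBase client calls are shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical read-through LRU cache; the loader stands in for a slow store lookup. */
class ReadThroughCache<K, V> {
    private final Function<K, V> loader; // assumption: wraps the real get against the store
    private final Map<K, V> lru;
    private int misses = 0;

    ReadThroughCache(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true keeps entries ordered from least to most recently used
        this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict the least recently used entry
            }
        };
    }

    /** Returns the cached value, loading it from the backing store on a miss. */
    synchronized V get(K key) {
        V value = lru.get(key);
        if (value == null) {
            misses++;
            value = loader.apply(key);
            if (value != null) {
                lru.put(key, value);
            }
        }
        return value;
    }

    /** Number of lookups that had to go to the backing store. */
    synchronized int misses() {
        return misses;
    }
}
```

Only misses pay the cost of a round trip to the store. Whether this helps depends on how often the same rows are read again, which is why the use case matters.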



>
> So I thought of trying a bulk read (but there is no such function in the
> HBase API), so I resorted to threads: I created one HTable instance per
> thread and did a get on the same table in parallel.
> But the performance still doesn't seem to be affected.
> Are you sure that HBase handles them in parallel, or does it handle them
> sequentially even when there are parallel requests?


> Anyway, what is good performance for HBase? Is there any other way to
> improve on this performance?
> Can multiple instances of HBase be created (and not HTable, as all the
> HTables seem to be using the same connection,
> I mean the HConnection object)?



Yeah, the RPC keeps a single connection per remote server, but the channel
is shared by requests and responses.  In past testing, the more remote
servers the better, but even with only a few, concurrent HTables got better
throughput than a single HTable running requests in series (the single
connection is not fully occupied by requests and responses).
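
The one-client-per-thread pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not HBase client code: ParallelReader and clientFactory are hypothetical names, and each "client" is just a function from row key to value standing in for a get on a per-thread HTable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.function.Supplier;

class ParallelReader {
    /**
     * Fetches every key on a fixed pool of worker threads.
     * clientFactory is a hypothetical stand-in for "open one table handle per thread".
     * Results are returned in the same order as the keys.
     */
    static <K, V> List<V> readAll(List<K> keys, int threads,
                                  Supplier<Function<K, V>> clientFactory) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // Each worker thread lazily creates its own client on first use.
        ThreadLocal<Function<K, V>> client = ThreadLocal.withInitial(clientFactory);
        try {
            List<Future<V>> futures = new ArrayList<>();
            for (K key : keys) {
                Callable<V> task = () -> client.get().apply(key);
                futures.add(pool.submit(task));
            }
            List<V> results = new ArrayList<>();
            for (Future<V> future : futures) {
                results.add(future.get()); // blocks until that lookup finishes
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Submitting all lookups before waiting on any of them is what keeps several requests in flight at once. The useful thread count is something to measure, not assume.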

St.Ack


>
>
> It would be great if you could help me with this and clear up my concepts.
>
