Thanks. That was an obvious problem, now that I look at it. I can now read
about 10K records in 6.3 seconds.

What other things could I try to make it even faster? Any pointers are
appreciated.

-Avani

-----Original Message-----
From: Jonathan Gray [mailto:[email protected]] 
Sent: Tuesday, June 29, 2010 4:35 PM
To: [email protected]
Subject: RE: speed up reads in hBase

Avani,

Are you including the time to instantiate the HTable?  Are you instantiating a 
new one each time?  Creating an HBaseConfiguration object will actually parse 
up the XML and all that.  You should just reuse the same HTable instance 
(within a thread) or if you have a multi-threaded program, then use HTablePool.
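
Roughly something like this (untested sketch, just to show the reuse pattern;
the table name, key list and loop are placeholders, same classes as in your
snippet):

       // Build the configuration and HTable once, up front (not once per lookup)
       HBaseConfiguration conf = new HBaseConfiguration();
       HTable table = new HTable(conf, "mytable");

       // Reuse that one HTable instance for every Get issued by this thread
       // (keys is whatever collection of row keys you are looking up)
       for (String lkp_key : keys) {
           Get get = new Get(Bytes.toBytes(lkp_key));
           Result r = table.get(get);
           // ... use r ...
       }

If the lookups come from multiple threads, hand each thread a table from an
HTablePool instead of sharing a single HTable.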

Are there multiple families on this table?

Obviously 1 second per Get is extraordinarily slow so there may be several 
things at play here.

JG

> -----Original Message-----
> From: Sharma, Avani [mailto:[email protected]]
> Sent: Tuesday, June 29, 2010 4:05 PM
> To: [email protected]
> Subject: RE: speed up reads in hBase
> 
> Max 6 versions. The rows are very small. I am currently running a
> prototype to see if HBase will work for our application in a real-time
> environment.
> The initial results show performance that is slower than what we need.
> 
> I am not querying older versions. I am querying the latest in the
> experiment results I sent across.
> I ran experiments, and the timing doesn't change whether I query data
> that is 6 versions old or drop versioning entirely (using key_timestamp
> as the row key instead).
> 
> 
> I was hoping there would be ways to optimize querying the latest
> version (setting older versions aside).
> 
> -Avani
> 
> -----Original Message-----
> From: Todd Lipcon [mailto:[email protected]]
> Sent: Tuesday, June 29, 2010 3:21 PM
> To: [email protected]
> Subject: Re: speed up reads in hBase
> 
> Hi Avani,
> 
> There are currently some optimizations that Jonathan Gray is working on
> to make selection of specific time ranges more efficient.
> 
> How many versions are retained for the rows in this column family?
> 
> -Todd
> 
> On Tue, Jun 29, 2010 at 1:08 PM, Sharma, Avani <[email protected]>
> wrote:
> 
> > Rows are very small (like 50 bytes max). I am accessing the latest
> > version after setting timerange.
> >
> >        HTable table = new HTable(new HBaseConfiguration(), table_name);
> >
> >        Get getRes = new Get(Bytes.toBytes(lkp_key));
> >
> >        long maxStamp = new SimpleDateFormat("yyyyMMdd")
> >                            .parse(date_for_ts, new ParsePosition(0)).getTime();
> >        getRes.setTimeRange(0, maxStamp);
> >
> >        Result r = table.get(getRes);
> >        NavigableMap<byte[], byte[]> kvMap = r.getFamilyMap(Bytes.toBytes("data"));
> >
> > -----Original Message-----
> > From: Michael Segel [mailto:[email protected]]
> > Sent: Tuesday, June 29, 2010 12:46 PM
> > To: [email protected]
> > Subject: RE: speed up reads in hBase
> >
> >
> >
> > How wide are your rows?
> > Are you accessing the last version or pulling back all of the
> versions per
> > row?
> > > From: [email protected]
> > > To: [email protected]
> > > Date: Tue, 29 Jun 2010 12:11:46 -0700
> > > Subject: speed up reads in hBase
> > >
> > >
> > > I have about 2.8M rows in my HBase table with multiple versions (max 6).
> > >
> > > When I try to look up 1000 records, it takes a total of 20 minutes!
> > > Each read takes about a second or more.
> > > I would appreciate any pointers on speeding these reads up.
> > >
> > > Thanks,
> > > -Avani
> >
> 
> 
> 
> --
> Todd Lipcon
> Software Engineer, Cloudera
