RE: Improving HBase scanner

Michelan Arendse Wed, 05 May 2010 00:21:45 -0700

I don't know what the row start and end keys are - they are GUID keys (this 
improves writes across the cluster - I had help with this from this user group 
before).
I need to export data written between "startDate" and "endDate" into a 
relational database so I can interrogate the data (SUM/AVG, etc).

That is why I am using: scan.setTimeRange(fromDate.getTime(), 
toDate.getTime());
In my test with live data, I only took the range between 2010-03-26 00:00:00 
and 2010-03-26 01:00:00 - there should only be a few thousand rows between 
those dates.
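
For reference, a minimal sketch of how the two bounds for that one-hour window could be computed. The class and method names are hypothetical, and UTC is an assumption (the time zone of the timestamps is not stated above). The resulting values are what fromDate.getTime() and toDate.getTime() would supply; Scan.setTimeRange treats the range as half-open, [minStamp, maxStamp), so a cell stamped exactly 01:00:00.000 would not be returned.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: turns the window boundaries into epoch milliseconds.
final class ExportWindow {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Assumption: the timestamps are in UTC. Use the cluster's real zone if it differs.
    public static long toMillis(String timestamp) {
        return LocalDateTime.parse(timestamp, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    public static void main(String[] args) {
        long minStamp = toMillis("2010-03-26 00:00:00"); // inclusive lower bound
        long maxStamp = toMillis("2010-03-26 01:00:00"); // exclusive upper bound
        // These are the two arguments for scan.setTimeRange(minStamp, maxStamp).
        System.out.println("minStamp=" + minStamp + " maxStamp=" + maxStamp);
        System.out.println("window length in ms: " + (maxStamp - minStamp));
    }
}
```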

Will HBase still take forever to find the data I am looking for unless I use 
startRow/endRow?

-----Original Message-----
From: TuX RaceR [mailto:tuxrace...@gmail.com] 
Sent: 04 May 2010 05:52 PM
To: hbase-user@hadoop.apache.org
Subject: Re: Improving HBase scanner

Michelan Arendse wrote:
> Is there a way to speed up the fetching of data from HBase?
>
>   

Divide your key space in smaller chunks?
using closer startRow and stopRow?
cf: Scan(byte[] startRow, byte[] stopRow)
<http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/client/Scan.html#Scan%28byte%5B%5D,%20byte%5B%5D%29>
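
A rough sketch of what dividing the key space could look like when the actual keys are unknown GUIDs. It assumes the row keys are stored as lowercase hexadecimal strings, which the thread does not confirm; the class and method names are hypothetical. It only computes the 16 [startRow, stopRow) pairs, one per leading hex digit. Each pair could then be passed to Scan(byte[] startRow, byte[] stopRow), with the same time range set on every scan, and the scans run in parallel. An empty stopRow is used for the last chunk, which HBase interprets as scanning to the end of the table.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a hex-string key space into 16 prefix ranges.
final class KeyRangeChunks {

    // One half-open [startRow, stopRow) range of row keys.
    static final class Range {
        final byte[] startRow;
        final byte[] stopRow; // empty array means "no upper bound"

        Range(String start, String stop) {
            this.startRow = start.getBytes(StandardCharsets.UTF_8);
            this.stopRow = stop.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Assumption: keys start with a lowercase hex digit (0-9, a-f).
    public static List<Range> hexPrefixRanges() {
        String hex = "0123456789abcdef";
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < hex.length(); i++) {
            String start = String.valueOf(hex.charAt(i));
            // The next digit is the exclusive stop; the last chunk is open-ended.
            String stop = (i + 1 < hex.length()) ? String.valueOf(hex.charAt(i + 1)) : "";
            ranges.add(new Range(start, stop));
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (Range r : hexPrefixRanges()) {
            System.out.println("[" + new String(r.startRow, StandardCharsets.UTF_8)
                    + ", " + new String(r.stopRow, StandardCharsets.UTF_8) + ")");
        }
    }
}
```

Splitting this way does not reduce the total number of rows examined; it only lets the work be spread over several concurrent scanners.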

TuX
