Unless the row is read from disk, how can one know it's not the one you want?
This is true for any DB system; relational DBs can just hide the extra reads
better.

HBase doesn't provide any query language, so the full cost is realized and
apparent. Server-side filters can help reduce network I/O, but ultimately
you'll need to build secondary indexes if this becomes a primary use case
with high volume. If it's analysis, typically people just throw a MapReduce
job at it and call it a day.
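
To make the secondary index idea concrete, here is a minimal sketch in plain
Python. It assumes the index is just a second table keyed by "column=value"
that points back at row keys, and it uses in-memory dicts as stand-ins for the
two tables. The names data_table, index_table, put and find_by_value are made
up for illustration and are not part of the HBase client API.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two HBase tables (not the HBase API).
data_table = {}                 # row key -> {column: value}
index_table = defaultdict(set)  # "column=value" -> set of row keys


def put(row_key, columns):
    """Write a row and keep the index table in step with it."""
    # Drop index entries that pointed at the previous version of the row.
    for col, val in data_table.get(row_key, {}).items():
        index_table[f"{col}={val}"].discard(row_key)
    data_table[row_key] = dict(columns)
    for col, val in columns.items():
        index_table[f"{col}={val}"].add(row_key)


def find_by_value(column, value):
    """Find rows through the index instead of reading every row."""
    keys = sorted(index_table.get(f"{column}={value}", ()))
    return [(key, data_table[key]) for key in keys]
```

The trade-off is that every write now touches two tables, and in a real
cluster those two writes are not atomic unless you arrange for that yourself,
so the index can briefly disagree with the data.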

Good luck!

On Apr 11, 2009 9:34 AM, "Lars George" <l...@worldlingo.com> wrote:

Hi Vincent,

What I did was also add a custom getSplits() implementation to the
TableInputFormat. When the splits are determined, I mask out those regions
that have no key of interest. Since the regions' start and end keys form a
totally ordered list, I can safely assume that if I only scan the last few
thousand entries, I can skip the regions that cannot contain them up front.
Of course, if you have a completely random key or the rows are spread across
every region, then this is futile.
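
A rough sketch of the region-masking idea, not the actual getSplits() code:
it assumes each region is a (start_key, end_key) pair with an exclusive end
key, that an empty key means "unbounded" (as for the first region's start and
the last region's end), and that the keys of interest form one contiguous
range. The function names overlaps and prune_splits are hypothetical.

```python
def overlaps(region_start, region_end, scan_start, scan_stop):
    """True if region [region_start, region_end) meets [scan_start, scan_stop).

    An empty bytes value stands for an unbounded end of either range.
    """
    starts_before_stop = scan_stop == b"" or region_start < scan_stop
    ends_after_start = region_end == b"" or scan_start < region_end
    return starts_before_stop and ends_after_start


def prune_splits(regions, scan_start, scan_stop):
    """Keep only the regions that can hold a key in the requested range."""
    return [
        (start, end)
        for start, end in regions
        if overlaps(start, end, scan_start, scan_stop)
    ]


# Example: four regions covering the whole key space.
regions = [(b"", b"g"), (b"g", b"n"), (b"n", b"t"), (b"t", b"")]
tail_only = prune_splits(regions, b"p", b"")  # keys from "p" to the end
```

With keys from b"p" onwards, only the last two regions survive, so the job
gets two map tasks instead of four. With random keys every region overlaps
and nothing is pruned.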

Lars

Vincent Poon (vinpoon) wrote:
> > Thanks for the reply. I have been using ColumnValueFilter, but ...
