Unless the row is read from disk, how can one know it's not the one you want? This is true for any database system; relational databases can just hide the extra reads better.
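To make that concrete, here is a minimal sketch in plain Java. It does not touch the HBase API at all: the class ScanCostSketch, its in-memory sorted map, and the value-to-row-key index are all made up for illustration. It only counts how many rows get read when you filter during a scan versus when you look rows up through a separately maintained index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ScanCostSketch {
    // Hypothetical stand-in for a table: row key -> one column value, sorted by key.
    final TreeMap<String, String> table = new TreeMap<>();
    // Hypothetical secondary index: column value -> row keys holding that value.
    final Map<String, List<String>> index = new HashMap<>();
    // Counts every row fetched from the "table", whether or not it matched.
    int rowsRead = 0;

    void put(String rowKey, String value) {
        table.put(rowKey, value);
        index.computeIfAbsent(value, v -> new ArrayList<>()).add(rowKey);
    }

    // Filtered scan: every row is read first, then the predicate decides
    // whether it is returned to the caller.
    List<String> filteredScan(String wanted) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            rowsRead++;
            if (row.getValue().equals(wanted)) {
                hits.add(row.getKey());
            }
        }
        return hits;
    }

    // Index lookup: only the rows the index points at are read.
    List<String> indexedLookup(String wanted) {
        List<String> hits = new ArrayList<>();
        for (String rowKey : index.getOrDefault(wanted, Collections.emptyList())) {
            rowsRead++;
            if (table.containsKey(rowKey)) {
                hits.add(rowKey);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ScanCostSketch s = new ScanCostSketch();
        for (int i = 0; i < 1000; i++) {
            s.put(String.format("row%04d", i), i % 100 == 0 ? "rare" : "common");
        }
        s.rowsRead = 0;
        int scanHits = s.filteredScan("rare").size();
        System.out.println("filtered scan: " + scanHits + " hits, " + s.rowsRead + " rows read");
        s.rowsRead = 0;
        int indexHits = s.indexedLookup("rare").size();
        System.out.println("index lookup: " + indexHits + " hits, " + s.rowsRead + " rows read");
    }
}
```

With 1000 rows and 10 matches, the filtered scan reads all 1000 rows while the index lookup reads 10. The price is that the index has to be written and kept in sync on every put.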
HBase doesn't provide any query language, so the full cost is realized and apparent. Server-side filters can help reduce network I/O, but ultimately you'll need to build secondary indexes if this becomes a primary use case with high volume. If it's analysis, people typically just throw a MapReduce job at it and call it a day.

Good luck!

On Apr 11, 2009 9:34 AM, "Lars George" <l...@worldlingo.com> wrote:

Hi Vincent,

What I did is also have a custom getSplits() implementation in the TableInputFormat. When the splits are determined, I mask out those regions that have no key of interest. Since the start and end keys are ordered as a total list, I can safely assume that if I scan the last few thousand entries, I can skip the regions beforehand. Of course, if you have a completely random key or the rows are spread across every region, then this is futile.

Lars

Vincent Poon (vinpoon) wrote:
>
> Thanks for the reply. I have been using ColumnValueFilter, but ...