Also, HBase is not a key-value store. With an ordered index you can retrieve successive rows, whereas true key-value stores (e.g. Berkeley DB) don't promise any relation to the 'next' key, if there is even such an option. And yes, I know about OrderPreservingPartitioner, but I've also heard it's a bad idea to use it :-)
-ryan

On Tue, May 4, 2010 at 10:18 AM, TuX RaceR <tuxrace...@gmail.com> wrote:
> Thanks a lot Gary: I had missed this one
> cheers
> TuX
>
> Gary Helmling wrote:
>> You can always add a PageFilter to your Scan instance to achieve this:
>>
>> http://hadoop.apache.org/hbase/docs/r0.20.3/api/org/apache/hadoop/hbase/filter/PageFilter.html
>>
>> Just be aware that you should still count on the client side if you want
>> to strictly limit to a given size. Since the filter is applied independently
>> on each regionserver, the client can still receive back more than the page
>> size # of items.
>>
>> --gh
>>
>> On Tue, May 4, 2010 at 12:02 PM, TuX RaceR <tuxrace...@gmail.com> wrote:
>>> Hi Hbase users,
>>>
>>> question related to the previous one, if we want to limit the amount of
>>> data retrieved by a a scanner, can we tell to not scan after a number of
>>> rows is reached?
>>> If I look at another KV store (cassandra) the equivalent of the scan API
>>> uses there a KeyRange object, see
>>> http://wiki.apache.org/cassandra/API
>>>
>>> Attribute    Type    Default  Required  Description
>>> start_key    string  n/a      N         The first key in the inclusive KeyRange.
>>> end_key      string  n/a      N         The last key in the inclusive KeyRange.
>>> start_token  string  n/a      N         The first token in the exclusive KeyRange.
>>> end_token    string  n/a      N         The last token in the exclusive KeyRange.
>>> count        i32     100      Y         The total number of keys to permit in the KeyRange.
>>>
>>> Would it be useful (performance wise) to have a 'count' parameter
>>> too, or would it be useless as equivalent to end the scan loop
>>> application side, when the desired number of row is reached?
>>>
>>> Thanks
>>> TuX
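
To illustrate Gary's point about counting on the client side, here is a minimal sketch in plain Java. It does not use the HBase client API at all: each "region" is simulated with a TreeMap (an ordered index), and the per-region page limit is only a stand-in for what a PageFilter does on each regionserver. The class name PageLimitSketch, its methods, and the row keys are all made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Simulation only: no HBase classes are used. All names here are hypothetical.
class PageLimitSketch {

    // Build one simulated region: a sorted slice of the row-key space.
    static TreeMap<String, String> region(String... keys) {
        TreeMap<String, String> r = new TreeMap<>();
        for (String k : keys) {
            r.put(k, "value-of-" + k);
        }
        return r;
    }

    // Two adjacent regions covering row01..row08.
    static List<TreeMap<String, String>> demoRegions() {
        return Arrays.asList(
            region("row01", "row02", "row03", "row04"),
            region("row05", "row06", "row07", "row08"));
    }

    // Page limit applied to ONE region, with no knowledge of the other regions.
    static List<String> scanRegion(TreeMap<String, String> region,
                                   String startRow, int pageSize) {
        List<String> rows = new ArrayList<>();
        // tailMap walks keys in sorted order, starting at startRow (inclusive).
        for (String key : region.tailMap(startRow, true).keySet()) {
            if (rows.size() >= pageSize) {
                break;
            }
            rows.add(key);
        }
        return rows;
    }

    // Visit the regions in key order; optionally enforce the limit on the client.
    static List<String> scan(List<TreeMap<String, String>> regions,
                             String startRow, int pageSize, boolean countOnClient) {
        List<String> out = new ArrayList<>();
        for (TreeMap<String, String> region : regions) {
            for (String key : scanRegion(region, startRow, pageSize)) {
                if (countOnClient && out.size() >= pageSize) {
                    return out; // stop the scan loop on the application side
                }
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<TreeMap<String, String>> regions = demoRegions();
        // Per-region limit only: each region contributes up to pageSize rows.
        System.out.println(scan(regions, "row02", 3, false));
        // Per-region limit plus a client-side count: at most pageSize rows.
        System.out.println(scan(regions, "row02", 3, true));
    }
}
```

In this toy setup the first call returns more rows than the page size because every region applies the limit on its own, while the second call stops as soon as the client has counted enough rows. Against a real cluster the same idea would be a PageFilter on the Scan plus a counter in the loop over the scanner's results.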