Hey Ted:

See bin/HBase.rb and look at the count method.  It uses a filter called
FirstKeyOnlyFilter, which returns only the first key/value it finds on
each row.
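
To make the effect of such a filter concrete, here is a rough sketch in plain Ruby. It is a toy model, not HBase code: the table is an ordinary hash, the helper names (first_key_only, row_keys, count_rows) are made up for illustration, and every row key and cell except 'com.onsoft.www:http/' is invented sample data.

```ruby
# Toy model only: each row key maps to an ordered list of [column, value] cells.
# The data below is invented for illustration.
TABLE = {
  "com.onsoft.www:http/"  => [["info:title", "Home"], ["info:lang", "en"]],
  "org.example.www:http/" => [["info:title", "Example"]],
}

# Mimics the effect of a first-key-only filter: keep just the first cell
# of each row and drop the rest.
def first_key_only(table)
  table.each_with_object({}) do |(row, cells), out|
    out[row] = cells.first(1) unless cells.empty?
  end
end

# Row keys alone, which is what a "list rows only" command would print.
def row_keys(table)
  first_key_only(table).keys
end

# Counting rows needs only one cell per row, not every column.
def count_rows(table)
  first_key_only(table).size
end
```

Here row_keys(TABLE) yields one entry per row regardless of how many columns the row has, which is presumably why count can get away with fetching a single cell per row.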

Looking at the arguments you can pass to a scan, it looks like you can pass
a FILTER argument, but there is a bug: the FILTER constant is not defined.
Apply this patch to fix it:

Index: bin/HBase.rb
===================================================================
--- bin/HBase.rb        (revision 889094)
+++ bin/HBase.rb        (working copy)
@@ -44,6 +44,7 @@
   METHOD = "METHOD"
   MAXLENGTH = "MAXLENGTH"
   CACHE_BLOCKS = "CACHE_BLOCKS"
+  FILTER = "FILTER"

   # Wrapper for org.apache.hadoop.hbase.client.HBaseAdmin
   class Admin



Now, to scan and get unique rows only, you could do the following in the
shell (after making the above change):

HBase Shell; enter 'help<RETURN>' for list of supported commands.
Version: 0.21.0-dev, r889026, Thu Dec 10 05:27:10 UTC 2009
hbase(main):002:0> f = FirstKeyOnlyFilter.new()
hbase(main):003:0> scan 'TestTable', {FILTER => f}
# i.e. make an instance of this filter and then pass it to the scan
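
The reason the one-line patch matters is that FILTER in {FILTER => f} is an ordinary Ruby constant used as a hash key, so it has to be defined before the hash can be built at all. A small sketch in plain Ruby, assuming a hypothetical ShellOptions module that stands in for the constants in bin/HBase.rb (the real shell refers to the constant unqualified):

```ruby
# Hypothetical stand-in for the shell's option constants; this is not bin/HBase.rb.
module ShellOptions
  CACHE_BLOCKS = "CACHE_BLOCKS"
end

# Builds an options hash the way a call like scan 't', {FILTER => f} would.
# Returns the hash, or the error text if the constant is missing.
def scan_options(filter)
  { ShellOptions::FILTER => filter }
rescue NameError => e
  "NameError: #{e.message}"
end

# Before the fix: the constant lookup fails, so no hash is ever built.
BEFORE_FIX = scan_options(:first_key_only)

# The equivalent of the one-line patch: define the constant.
ShellOptions.const_set(:FILTER, "FILTER")

# After the fix: the hash is keyed by the string "FILTER".
AFTER_FIX = scan_options(:first_key_only)
```

BEFORE_FIX holds a NameError message about the uninitialized constant, while AFTER_FIX is a hash keyed by the string "FILTER", which is the kind of key the scan code can then look up.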

Will this work for you?

St.Ack





On Thu, Dec 17, 2009 at 2:44 PM, Ted Yu <[email protected]> wrote:

> Can you outline how such command can be added ?
>
> Thanks
>
> On Thu, Dec 17, 2009 at 11:06 AM, stack <[email protected]> wrote:
>
>> On Tue, Dec 15, 2009 at 12:59 PM, Ted Yu <[email protected]> wrote:
>>
>> > That works.
>> >
>> > scan command gives values for columns.
>> > Is there a shell command which lists unique row values, such as
>> > 'com.onsoft.www:http/' ?
>> >
>> >
>> If you mean a command to list rows only, there is no such command
>> (wouldn't be hard to add).
>> St.Ack
>>
>
>
