[jira] Updated: (HBASE-1481) Add fast row key only scanning

Jonathan Gray (JIRA) Thu, 01 Oct 2009 05:21:52 -0700

     [ https://issues.apache.org/jira/browse/HBASE-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jonathan Gray updated HBASE-1481:
---------------------------------

    Attachment: HBASE-1481-v1.patch

The patch adds a new filter called FirstKeyOnlyFilter.  It is extremely simple, 
but it generally accomplishes what we want.
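
To make the idea concrete, here is a minimal sketch of "keep only the first 
KV of each row" on a toy data model.  It is not the code from the patch and 
uses no HBase classes; the names FirstKeyOnlySketch, Cell, firstCellPerRow and 
countRows are made up for illustration, and the input is assumed to arrive 
sorted by row, as it would from a scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model, not HBase code: names and types here are hypothetical.
final class FirstKeyOnlySketch {

    // Stand-in for a KeyValue: just a row key, a column name and a value.
    static final class Cell {
        final String row;
        final String column;
        final String value;

        Cell(String row, String column, String value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    // Keep the first cell seen for each row and drop every later cell of
    // that row. Assumes the cells are already sorted by row.
    static List<Cell> firstCellPerRow(List<Cell> sortedCells) {
        List<Cell> kept = new ArrayList<Cell>();
        String currentRow = null;
        for (Cell cell : sortedCells) {
            if (!cell.row.equals(currentRow)) {
                kept.add(cell);
                currentRow = cell.row;
            }
        }
        return kept;
    }

    // With one cell per row left, counting rows is counting kept cells.
    static int countRows(List<Cell> sortedCells) {
        return firstCellPerRow(sortedCells).size();
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("row1", "a", "1"),
            new Cell("row1", "b", "2"),
            new Cell("row2", "a", "3"),
            new Cell("row3", "a", "4"),
            new Cell("row3", "c", "5"));
        System.out.println(countRows(cells)); // prints 3
    }
}
```

Note that this sketch still walks every cell of every row, which is exactly 
the cost the second optimization below is about.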

The only further optimizations to row counting I can think of:

- prevent sending back even an entire KV per row (all we really need is the 
count, but this breaks the API)
- once we work on issues like HBASE-1517, we should seek to the next row after 
we look at the first KV (if we have a million columns in a row, we don't need 
to iterate over all of them to do a row count)
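
The second point can be sketched on a toy model as follows.  This is only an 
illustration under stated assumptions, not how HBase stores or seeks data: 
the cells are modeled as a sorted array holding the row key of each cell, and 
a binary search stands in for a store-level "seek to next row".  The names 
SeekNextRowSketch, firstIndexAfterRow and countRowsBySeeking are hypothetical.

```java
// Toy model, not HBase code: names and data layout here are hypothetical.
final class SeekNextRowSketch {

    // rowOfCell[i] is the row key of the i-th cell, sorted ascending.
    // Returns the first index >= from whose row key is greater than row,
    // or rowOfCell.length if there is none. The binary search plays the
    // role of a seek: it jumps past the remaining cells of the row
    // instead of stepping through them one by one.
    static int firstIndexAfterRow(String[] rowOfCell, int from, String row) {
        int lo = from;
        int hi = rowOfCell.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (rowOfCell[mid].compareTo(row) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Count rows by looking at the first cell of a row, then seeking
    // directly to the first cell of the next row.
    static int countRowsBySeeking(String[] rowOfCell) {
        int rows = 0;
        int i = 0;
        while (i < rowOfCell.length) {
            rows++;
            i = firstIndexAfterRow(rowOfCell, i + 1, rowOfCell[i]);
        }
        return rows;
    }

    public static void main(String[] args) {
        String[] rowOfCell = {"row1", "row1", "row1", "row2", "row3", "row3"};
        System.out.println(countRowsBySeeking(rowOfCell)); // prints 3
    }
}
```

For a row with a million columns, the loop above handles it in one step plus 
a logarithmic search, rather than a million iterations.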

The latter point gets me thinking about what filters could do to push that kind 
of information to the QueryMatcher...

> Add fast row key only scanning
> ------------------------------
>
>                 Key: HBASE-1481
>                 URL: https://issues.apache.org/jira/browse/HBASE-1481
>             Project: Hadoop HBase
>          Issue Type: Improvement
>    Affects Versions: 0.19.3
>            Reporter: Lars George
>            Priority: Minor
>             Fix For: 0.21.0
>
>         Attachments: HBASE-1481-v1.patch
>
>
> Instead of requiring a user to set up a scanner with any column and scan the 
> table to gather all row keys while ignoring the column value, we should have a 
> fast and lightweight scanner that, for example, takes a "null" for the column 
> list and then simply returns only the matching keys of all non-empty or 
> deleted rows. Filters should still be applicable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
