[
https://issues.apache.org/jira/browse/HBASE-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Gray updated HBASE-1481:
---------------------------------
Attachment: HBASE-1481-v1.patch
The patch adds a new filter called FirstKeyOnlyFilter, which returns only the
first KV of each row. It's extremely simple, but it accomplishes most of what
we want.
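As a rough illustration of the idea (a minimal in-memory sketch, not the code
in the patch and not the HBase API), the filter can be thought of as keeping
the first cell of each row and dropping the rest. The names
FirstKeyOnlySketch, Cell, firstCellPerRow and countRows are hypothetical, and
the sketch assumes the cells arrive sorted by row, as a scanner would return
them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified in-memory model of "first KV per row" filtering.
// All names here are hypothetical; this is not the HBase API.
final class FirstKeyOnlySketch {

    /** Minimal stand-in for a KeyValue: row key, column, value. */
    static final class Cell {
        final String row;
        final String column;
        final String value;

        Cell(String row, String column, String value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    /** Keeps the first cell of every row. Assumes input is sorted by row. */
    static List<Cell> firstCellPerRow(List<Cell> sortedCells) {
        List<Cell> kept = new ArrayList<>();
        String currentRow = null;
        for (Cell cell : sortedCells) {
            if (!cell.row.equals(currentRow)) {
                kept.add(cell);        // first cell seen for this row
                currentRow = cell.row; // later cells of this row are skipped
            }
        }
        return kept;
    }

    /** A row count is then just the number of cells that survive. */
    static int countRows(List<Cell> sortedCells) {
        return firstCellPerRow(sortedCells).size();
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("r1", "a", "1"),
            new Cell("r1", "b", "2"),
            new Cell("r2", "a", "3"),
            new Cell("r3", "a", "4"),
            new Cell("r3", "b", "5"),
            new Cell("r3", "c", "6"));
        System.out.println(countRows(cells)); // prints 3
    }
}
```

Note that this sketch still visits every cell of every row; it only limits
what is returned.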
The only further optimizations to row counting I can think of:
- avoid sending back even a single entire KV per row (all we really need is
the count, but this breaks the API)
- once we work on issues like HBASE-1517, seek to the next row after we look
at the first KV (if we have a million columns in a row, we don't need to
iterate over all of them to do a row count)
The latter issue gets me thinking about what filters could do to push that kind
of information to the QueryMatcher...
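To make the seek-to-next-row idea concrete, here is a minimal sketch under
stated assumptions; it is not how HBase or the QueryMatcher implements
seeking. It models the store as one sorted set of "row + NUL + column" keys
and jumps past the rest of a row with a single ceiling() lookup instead of
iterating its columns. The class and method names are hypothetical, row keys
are assumed not to contain the NUL character, and Java string ordering stands
in for HBase's byte ordering.

```java
import java.util.TreeSet;

// Simplified model of "seek to the next row after the first KV".
// All names are hypothetical; this is not the HBase API.
final class SeekNextRowSketch {

    // Separator between row key and column; assumed absent from row keys.
    private static final char SEP = '\u0000';

    /** Builds the flat sorted key for one cell. */
    static String key(String row, String column) {
        return row + SEP + column;
    }

    /** Counts rows with one lookup per row, not one step per cell. */
    static int countRowsBySeeking(TreeSet<String> sortedKeys) {
        int rows = 0;
        String current = sortedKeys.isEmpty() ? null : sortedKeys.first();
        while (current != null) {
            rows++;
            String row = current.substring(0, current.indexOf(SEP));
            // row + (SEP + 1) sorts after every cell of this row and
            // before any cell of a later row, so ceiling() lands on the
            // first cell of the next row (or null at the end).
            current = sortedKeys.ceiling(row + (char) (SEP + 1));
        }
        return rows;
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>();
        keys.add(key("a", "c0"));
        for (int i = 0; i < 1000; i++) {
            keys.add(key("wide", "c" + i)); // one very wide row
        }
        keys.add(key("z", "c0"));
        System.out.println(countRowsBySeeking(keys)); // prints 3
    }
}
```

With the wide row above, the loop body runs three times rather than 1002
times, which is the saving the second bullet is after.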
> Add fast row key only scanning
> ------------------------------
>
> Key: HBASE-1481
> URL: https://issues.apache.org/jira/browse/HBASE-1481
> Project: Hadoop HBase
> Issue Type: Improvement
> Affects Versions: 0.19.3
> Reporter: Lars George
> Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HBASE-1481-v1.patch
>
>
> Instead of requiring a user to set up a scanner with any column and scan the
> table to gather all row keys while ignoring the column value, we should have
> a fast and lightweight scanner that, for example, takes a "null" for the
> column list and then simply returns only the matching keys of all non-empty
> or deleted rows. Filters should still be applicable.