Todd Lipcon created HBASE-6014:
----------------------------------

             Summary: Support for block-granularity bitmap indexes
                 Key: HBASE-6014
                 URL: https://issues.apache.org/jira/browse/HBASE-6014
             Project: HBase
          Issue Type: New Feature
          Components: regionserver
            Reporter: Todd Lipcon


This came up in a discussion with Kannan today, so I promised to write 
something brief on JIRA -- this was suggested as a potential summer intern 
project. The idea is as follows:

We have several customers who periodically run full-table-scan MR jobs against 
large HBase tables while applying fairly restrictive predicates. The predicates 
are often reasonably simple boolean expressions across known columns, and those 
columns are often enum-typed or otherwise have a fairly restricted range of 
values. For example, a real-time process may mark rows as dirty, and a 
background MR job may scan for dirty rows in order to perform further 
processing, such as rebuilding inverted indexes.

One way to speed up this type of query is to add bitmap indexes. In the context 
of HBase, I would envision this as a new type of metadata block included in the 
HFile, containing a series of tuples: (qualifier, value range, compressed 
bitmap). Each bitmap has one bit per HFile data block. A 1 bit indicates that 
the corresponding block has at least one cell with the given qualifier whose 
value falls within the given range. Queries which have an equality or 
comparison predicate against an indexed qualifier can then use the bitmap index 
to seek directly to those blocks which may contain relevant data, and skip the 
rest.
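
To make the idea concrete, here is a minimal sketch in Python of the index 
structure and lookup described above. It is a toy in-memory model, not HBase 
code: the names IndexEntry, build_index and candidate_blocks are made up for 
illustration, values are plain integers, bitmaps are uncompressed Python ints, 
and it assumes the indexed ranges for a qualifier together cover every value 
that qualifier can take (otherwise a lookup could miss blocks).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One (qualifier, value range, bitmap) tuple; all names are hypothetical."""
    qualifier: str
    lo: int      # inclusive lower bound of the value range
    hi: int      # inclusive upper bound of the value range
    bitmap: int  # bit i set => block i has a matching cell (uncompressed here)


def build_index(blocks, ranges):
    """blocks: list of blocks, each a list of (row, qualifier, value) cells.
    ranges: list of (qualifier, lo, hi) value ranges to index."""
    entries = []
    for qualifier, lo, hi in ranges:
        bitmap = 0
        for i, block in enumerate(blocks):
            if any(q == qualifier and lo <= v <= hi for _, q, v in block):
                bitmap |= 1 << i
        entries.append(IndexEntry(qualifier, lo, hi, bitmap))
    return entries


def candidate_blocks(index, qualifier, lo, hi, num_blocks):
    """Return the numbers of blocks that may hold a cell with lo <= value <= hi.

    Assumes the indexed ranges for this qualifier cover its whole value domain.
    """
    bitmap = 0
    for entry in index:
        # OR together the bitmaps of every indexed range overlapping [lo, hi].
        if entry.qualifier == qualifier and entry.lo <= hi and lo <= entry.hi:
            bitmap |= entry.bitmap
    return [i for i in range(num_blocks) if (bitmap >> i) & 1]


# Four data blocks; "dirty" is an enum-like column with values 0 and 1.
blocks = [
    [("r1", "dirty", 0), ("r2", "dirty", 0)],
    [("r3", "dirty", 1), ("r4", "dirty", 0)],
    [("r5", "dirty", 0)],
    [("r6", "dirty", 1)],
]
index = build_index(blocks, [("dirty", 0, 0), ("dirty", 1, 1)])

# A scan for dirty == 1 only needs to read these blocks.
dirty_blocks = candidate_blocks(index, "dirty", 1, 1, len(blocks))
```

The index only narrows the scan to candidate blocks. The scanner still has to 
apply the real predicate to the cells inside each one, since a set bit means 
"at least one matching cell", not "every cell matches".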

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
