[ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036449#comment-13036449 ]
Marquis Wang commented on HIVE-2036:
------------------------------------
Making notes on how to do this:
One of the difficult/different parts about using bitmap indexes is that they
only become useful when multiple indexes are combined. So you need a query
that joins the various bitmap index tables and returns the blocks that
contain the rows we want.
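To illustrate why the indexes have to be combined, here is a minimal Java sketch. It uses java.util.BitSet as an uncompressed stand-in for the EWAH-compressed bitmaps, and the class name, method names, and row positions are all made up for illustration; none of this is Hive code.

```java
import java.util.BitSet;

// Illustrative only: BitSet stands in for Hive's EWAH-compressed bitmaps.
class BitmapAndSketch {
    // Builds a bitmap with the given row positions set.
    static BitSet rows(int... positions) {
        BitSet bits = new BitSet();
        for (int p : positions) {
            bits.set(p);
        }
        return bits;
    }

    // ANDs the per-column bitmaps for one block; a bit that is still set
    // marks a row that satisfies every predicate.
    static BitSet andAll(BitSet... bitmaps) {
        BitSet result = (BitSet) bitmaps[0].clone();
        for (int i = 1; i < bitmaps.length; i++) {
            result.and(bitmaps[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical rows within one block that match each predicate.
        BitSet quantity = rows(1, 4, 7);
        BitSet discount = rows(4, 5, 7);
        BitSet tax = rows(2, 4);
        BitSet matches = andAll(quantity, discount, tax);
        // The block is only worth scanning if the combined bitmap is non-empty.
        System.out.println(matches + " empty=" + matches.isEmpty());
    }
}
```

Each single-column bitmap on its own still selects several rows; only the combined bitmap narrows the block down to the rows that match all three predicates.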
Thus the two parts to writing the automatic use index handler for bitmap
indexes are:
1. Figuring out what indexes to use:
As mentioned above, you may need to extend the IndexPredicateAnalyzer to
support ORs and possibly to return a tree of predicates (I don't think it
already does this).
2. Building a query that accesses the index tables:
Below is the original query, followed by an index-table query for it that I
know works:
{noformat}
SELECT * FROM lineitem WHERE L_QUANTITY = 50.0 AND L_DISCOUNT = 0.08 AND L_TAX = 0.01;
{noformat}
{noformat}
SELECT bucketname AS `_bucketname`, COLLECT_SET(offset) AS `_offsets`
FROM (
  SELECT `_bucketname` AS bucketname, `_offset` AS offset
  FROM (
    SELECT ab.`_bucketname`, ab.`_offset`,
           EWAH_BITMAP_AND(ab.bitmap, c.`_bitmaps`) AS bitmap
    FROM (
      SELECT a.`_bucketname`, b.`_offset`,
             EWAH_BITMAP_AND(a.`_bitmaps`, b.`_bitmaps`) AS bitmap
      FROM (SELECT * FROM default__lineitem_quantity__ WHERE L_QUANTITY = 50.0) a
      JOIN (SELECT * FROM default__lineitem_discount__ WHERE L_DISCOUNT = 0.08) b
        ON a.`_bucketname` = b.`_bucketname` AND a.`_offset` = b.`_offset`
    ) ab
    JOIN (SELECT * FROM default__lineitem_tax__ WHERE L_TAX = 0.01) c
      ON ab.`_bucketname` = c.`_bucketname` AND ab.`_offset` = c.`_offset`
  ) abc
  WHERE NOT EWAH_BITMAP_EMPTY(abc.bitmap)
) t
GROUP BY bucketname;
{noformat}
This format extends to any number of AND predicates: each additional
predicate adds one more level of JOIN and EWAH_BITMAP_AND. I'm pretty sure
you can figure out how to expand it to include OR predicates and different
groupings of predicates as well. If you make any changes/extensions to the
format, be sure to test them to make sure they have the performance
characteristics you want.
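The nesting in the query above follows a regular pattern, so it could be generated by folding over the predicates. The Java sketch below builds the same shape of query for any number of AND predicates. The class, the IndexPredicate holder, and the generated aliases (l1, r1, j, ...) are hypothetical and not part of Hive's BitmapIndexHandler; only the index table names, column names, and UDFs are taken from the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive: generates the nested-join index
// query shown above for a list of AND-ed predicates.
class BitmapIndexQuerySketch {
    // One predicate together with the index table covering its column.
    static final class IndexPredicate {
        final String indexTable;
        final String condition;

        IndexPredicate(String indexTable, String condition) {
            this.indexTable = indexTable;
            this.condition = condition;
        }

        String subquery() {
            return "(SELECT * FROM " + indexTable + " WHERE " + condition + ")";
        }
    }

    static String build(List<IndexPredicate> preds) {
        if (preds.size() < 2) {
            throw new IllegalArgumentException("need at least two predicates");
        }
        // Start from the first index table; its bitmap column is `_bitmaps`.
        String left = preds.get(0).subquery();
        String leftBitmap = "`_bitmaps`";
        for (int i = 1; i < preds.size(); i++) {
            String l = "l" + i;
            String r = "r" + i;
            // Join the accumulated result with the next index table on
            // (bucket, offset) and AND the two bitmaps together.
            left = "(SELECT " + l + ".`_bucketname`, " + l + ".`_offset`, "
                + "EWAH_BITMAP_AND(" + l + "." + leftBitmap + ", " + r + ".`_bitmaps`) AS bitmap"
                + " FROM " + left + " " + l
                + " JOIN " + preds.get(i).subquery() + " " + r
                + " ON " + l + ".`_bucketname` = " + r + ".`_bucketname`"
                + " AND " + l + ".`_offset` = " + r + ".`_offset`)";
            // After the first join the accumulated column is named `bitmap`.
            leftBitmap = "bitmap";
        }
        return "SELECT bucketname AS `_bucketname`, COLLECT_SET(offset) AS `_offsets`"
            + " FROM (SELECT `_bucketname` AS bucketname, `_offset` AS offset"
            + " FROM " + left + " j"
            + " WHERE NOT EWAH_BITMAP_EMPTY(j.bitmap)) t"
            + " GROUP BY bucketname;";
    }

    public static void main(String[] args) {
        List<IndexPredicate> preds = new ArrayList<>();
        preds.add(new IndexPredicate("default__lineitem_quantity__", "L_QUANTITY = 50.0"));
        preds.add(new IndexPredicate("default__lineitem_discount__", "L_DISCOUNT = 0.08"));
        preds.add(new IndexPredicate("default__lineitem_tax__", "L_TAX = 0.01"));
        System.out.println(build(preds));
    }
}
```

This only covers a flat list of ANDs. Handling ORs or nested groupings would need a predicate tree as input rather than a list, and the generated queries would still need the performance testing mentioned above.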
> Update bitmap indexes for automatic usage
> -----------------------------------------
>
> Key: HIVE-2036
> URL: https://issues.apache.org/jira/browse/HIVE-2036
> Project: Hive
> Issue Type: Improvement
> Components: Indexing
> Affects Versions: 0.8.0
> Reporter: Russell Melick
> Assignee: Jeffrey Lym
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap
> index support. The bitmap code will need to be extended after it is
> committed to enable automatic use of indexing. Most work will be focused in
> the BitmapIndexHandler, which needs to generate the re-entrant QL index
> query. There may also be significant work in the IndexPredicateAnalyzer to
> support predicates with OR's, instead of just AND's as it is currently.