[
https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010285#comment-13010285
]
He Yongqiang commented on HIVE-1644:
------------------------------------
Hi Russell,
FIL%SEL% may not be good enough; how about a TBL%FIL?
Also, I just had an offline talk with Namit. He proposed some very good ideas
for this task:
1. Check whether the index exists. For a query on partitioned tables, the index
optimizer should verify that indexes exist on all partitions that the
original task is scanning. This information can be found in ParseContext's
OpToPartList.
2. Add more parameters to configure whether to use the index (for example, if
the filter is a >, do not use the index; if the size of the inputs is bigger
than some value, do not use the index).
3. In case the index is not good (for example, even after scanning the index,
the query still needs to scan the whole base table), just do not use it, and go
back to scanning the whole base table. This can be done by adding a conditional
task and a backup task. Whether the index is good can be detected by monitoring
the index job's number of input records and number of output records, and
comparing them. Let's say that if the ratio is >50, do not use the index: kill
the index job and go back to scanning the whole base table. Item 3 can be done
in a follow-up JIRA if you want.
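
The three checks above can be sketched roughly as follows. This is only an
illustration in plain Java, not Hive code: the class name, method names, and
parameters are all hypothetical, partitions are represented as plain strings
rather than whatever OpToPartList actually holds, and the ">50" ratio is
assumed here to mean output records as a percentage of input records.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the three checks; none of these names exist in Hive. */
public class IndexUsageSketch {

    /** Item 1: use the index only if every scanned partition has an index. */
    public static boolean allPartitionsIndexed(Set<String> scannedPartitions,
                                               Set<String> indexedPartitions) {
        return indexedPartitions.containsAll(scannedPartitions);
    }

    /**
     * Item 2: configuration gates. Skips the index for a ">" filter and for
     * inputs larger than a configured limit (both parameters are assumptions).
     */
    public static boolean passesConfigChecks(String filterOperator,
                                             long inputSizeBytes,
                                             long maxInputSizeBytes) {
        if (">".equals(filterOperator)) {
            return false;
        }
        return inputSizeBytes <= maxInputSizeBytes;
    }

    /**
     * Item 3: runtime check on the index job's counters. Assumes the ratio is
     * outputRecords / inputRecords expressed as a percentage; returns false
     * when it exceeds the threshold, meaning "fall back to the base table scan".
     */
    public static boolean indexIsEffective(long inputRecords,
                                           long outputRecords,
                                           double maxOutputPercent) {
        if (inputRecords == 0) {
            return true; // the index job read nothing, so there is nothing to compare
        }
        double percent = 100.0 * outputRecords / inputRecords;
        return percent <= maxOutputPercent;
    }

    public static void main(String[] args) {
        Set<String> scanned = new HashSet<>(Arrays.asList("ds=2011-03-01", "ds=2011-03-02"));
        Set<String> indexed = new HashSet<>(Arrays.asList("ds=2011-03-01"));

        // One scanned partition has no index, so the optimizer would skip the index.
        System.out.println(allPartitionsIndexed(scanned, indexed));
        // An "=" filter on a small input passes the configuration gates.
        System.out.println(passesConfigChecks("=", 1024L, 1L << 30));
        // 80 of 100 index records survive the filter: above the 50 threshold.
        System.out.println(indexIsEffective(100L, 80L, 50.0));
    }
}
```

In a real plan the third check would drive the conditional task: when it
returns false, the index job is killed and the backup task scans the base
table.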
> use filter pushdown for automatically accessing indexes
> -------------------------------------------------------
>
> Key: HIVE-1644
> URL: https://issues.apache.org/jira/browse/HIVE-1644
> Project: Hive
> Issue Type: Improvement
> Components: Indexing
> Affects Versions: 0.7.0
> Reporter: John Sichi
> Assignee: Russell Melick
> Attachments: HIVE-1644.1.patch, HIVE-1644.10.patch,
> HIVE-1644.2.patch, HIVE-1644.3.patch, HIVE-1644.4.patch, HIVE-1644.5.patch,
> HIVE-1644.6.patch, HIVE-1644.7.patch, HIVE-1644.8.patch, HIVE-1644.9.patch
>
>
> HIVE-1226 provides utilities for analyzing filters which have been pushed
> down to a table scan. The next step is to use these for selecting available
> indexes and generating access plans for those indexes.