[
https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026813#comment-13026813
]
John Sichi commented on HIVE-1644:
----------------------------------
OK, I dug into this and found out what's going on.
As you mentioned in the conf call, the order of operations in
SemanticAnalyzer.genMapRedTasks is such that physical optimization happens
after GenMRTableScan1. So the code in GenMRTableScan1 is totally irrelevant
and can be removed.
You are setting the input format and intermediate file on the correct work
object already inside of IndexWhereProcessor.
What's going wrong is that the test is using MapRedTask instead of its
superclass ExecDriver, and MapRedTask is missing the code to propagate the
attributes from the work into the job conf. So we need to factor this code out
of ExecDriver into a helper method, setInputAttributes:
{noformat}
if (work.getInputformat() != null) {
  HiveConf.setVar(job, HiveConf.ConfVars.HIVEINPUTFORMAT,
      work.getInputformat());
}
if (work.getIndexIntermediateFile() != null) {
  job.set("hive.index.compact.file", work.getIndexIntermediateFile());
}
{noformat}
and then invoke setInputAttributes from within MapRedTask.execute, just before
the "// enable assertion" comment.
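
To make the shape of that helper concrete, here is a minimal standalone sketch.
It is not the actual Hive code: the Work class and the plain Map standing in
for the job conf are hypothetical stand-ins so the example compiles without
Hive on the classpath, and the "hive.input.format" key is assumed to be the
property behind HiveConf.ConfVars.HIVEINPUTFORMAT. The null checks and the
"hive.index.compact.file" key come from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

final class InputAttributesSketch {

    /** Hypothetical stand-in for the work object; only the two getters used above. */
    static final class Work {
        private final String inputformat;
        private final String indexIntermediateFile;

        Work(String inputformat, String indexIntermediateFile) {
            this.inputformat = inputformat;
            this.indexIntermediateFile = indexIntermediateFile;
        }

        String getInputformat() { return inputformat; }
        String getIndexIntermediateFile() { return indexIntermediateFile; }
    }

    /**
     * Copies the optional input attributes from the work into the job conf.
     * A Map stands in for the real job conf (assumption for this sketch).
     */
    static void setInputAttributes(Work work, Map<String, String> job) {
        if (work.getInputformat() != null) {
            // Assumed property name for HiveConf.ConfVars.HIVEINPUTFORMAT.
            job.put("hive.input.format", work.getInputformat());
        }
        if (work.getIndexIntermediateFile() != null) {
            job.put("hive.index.compact.file", work.getIndexIntermediateFile());
        }
    }

    public static void main(String[] args) {
        Map<String, String> job = new HashMap<>();
        // Both values below are placeholders, not real Hive class names or paths.
        setInputAttributes(new Work("example.IndexInputFormat", "/tmp/example_index_result"), job);
        System.out.println(job);

        // A work with no index attributes leaves the job conf untouched.
        Map<String, String> plainJob = new HashMap<>();
        setInputAttributes(new Work(null, null), plainJob);
        System.out.println(plainJob.isEmpty());
    }
}
```

Having both ExecDriver and MapRedTask call the one helper would keep the two
code paths from drifting apart again.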
With this change, I can see the correct input format and intermediate file
being set on the spawned job. (Speaking of the intermediate file, can we get
rid of /tmp/index_banana? :)
The test passes with or without this change, which suggests there could still
be some other problem (since the point of the test is to demonstrate different
behavior when the index is being used). However, I'm not sure about the test
itself: it now uses a range condition where before it used an equality
condition, and block-level indexing means a block could contain the extra
values as long as a single value (47 in this case) is hit by the index.
But you're using text files for some reason, and I still don't know exactly how
the "blocks" work there.
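
To illustrate the concern about the range condition, here is a toy model. It
is only a sketch under stated assumptions: the block layout, the key values
other than 47, and the method names are all made up for illustration, and it
says nothing about how Hive actually splits text files into blocks. It shows
that an index-driven scan which reads every block containing the hit key and
then applies the range filter can return the same rows as a full scan, so the
test output would not reveal whether the index was used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BlockIndexSketch {

    /** Full scan: read every block and keep keys within [lo, hi]. */
    static List<Integer> fullScan(List<List<Integer>> blocks, int lo, int hi) {
        List<Integer> out = new ArrayList<>();
        for (List<Integer> block : blocks) {
            for (int key : block) {
                if (key >= lo && key <= hi) out.add(key);
            }
        }
        return out;
    }

    /** Index scan: read only blocks containing indexedKey, then apply the same filter. */
    static List<Integer> indexScan(List<List<Integer>> blocks, int indexedKey, int lo, int hi) {
        List<Integer> out = new ArrayList<>();
        for (List<Integer> block : blocks) {
            if (!block.contains(indexedKey)) continue; // block skipped thanks to the index
            for (int key : block) {
                if (key >= lo && key <= hi) out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Layout A: every in-range key shares a block with 47, so both scans agree.
        List<List<Integer>> a = Arrays.asList(
            Arrays.asList(45, 46, 47, 48), Arrays.asList(90, 91, 92, 93));
        System.out.println(fullScan(a, 46, 48).equals(indexScan(a, 47, 46, 48)));

        // Layout B: some in-range keys sit in a block the index never hits.
        List<List<Integer>> b = Arrays.asList(
            Arrays.asList(45, 46, 47), Arrays.asList(48, 49, 90));
        System.out.println(fullScan(b, 46, 49).equals(indexScan(b, 47, 46, 49)));
    }
}
```

In this model the test can only distinguish the two plans when some rows that
satisfy the range fall outside the block hit by the index, as in layout B.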
> use filter pushdown for automatically accessing indexes
> -------------------------------------------------------
>
> Key: HIVE-1644
> URL: https://issues.apache.org/jira/browse/HIVE-1644
> Project: Hive
> Issue Type: Improvement
> Components: Indexing
> Affects Versions: 0.8.0
> Reporter: John Sichi
> Assignee: Russell Melick
> Attachments: HIVE-1644.1.patch, HIVE-1644.10.patch,
> HIVE-1644.11.patch, HIVE-1644.12.patch, HIVE-1644.13.patch,
> HIVE-1644.14.patch, HIVE-1644.15.patch, HIVE-1644.16.patch,
> HIVE-1644.17.patch, HIVE-1644.2.patch, HIVE-1644.3.patch, HIVE-1644.4.patch,
> HIVE-1644.5.patch, HIVE-1644.6.patch, HIVE-1644.7.patch, HIVE-1644.8.patch,
> HIVE-1644.9.patch
>
>
> HIVE-1226 provides utilities for analyzing filters which have been pushed
> down to a table scan. The next step is to use these for selecting available
> indexes and generating access plans for those indexes.