[jira] [Commented] (HIVE-1644) use filter pushdown for automatically accessing indexes

John Sichi (JIRA) Thu, 28 Apr 2011 19:30:46 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026813#comment-13026813
 ]


John Sichi commented on HIVE-1644:
----------------------------------

OK, I dug into this and found out what's going on.

As you mentioned in the conf call, the order of operations in 
SemanticAnalyzer.genMapRedTasks is such that physical optimization happens 
after GenMRTableScan1.  So the code in GenMRTableScan1 is totally irrelevant 
and can be removed.

You are setting the input format and intermediate file on the correct work 
object already inside of IndexWhereProcessor.

What's going wrong is that the test is using MapRedTask instead of its 
superclass ExecDriver, and MapRedTask is missing the code to propagate the 
attributes from the work into the job conf.  So we need to extract this code 
from ExecDriver into a helper method, setInputAttributes:

{noformat}
    if (work.getInputformat() != null) {
      HiveConf.setVar(job, HiveConf.ConfVars.HIVEINPUTFORMAT,
          work.getInputformat());
    }
    if (work.getIndexIntermediateFile() != null) {
      job.set("hive.index.compact.file", work.getIndexIntermediateFile());
    }
{noformat}

and then invoke setInputAttributes from within MapRedTask.execute, just before 
the "// enable assertion" comment.
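
To make the shape of that refactoring concrete, here is a minimal standalone sketch. It is not the actual Hive code: the Work class and the plain Map used as the job conf are hypothetical stand-ins so that it compiles without Hive or Hadoop on the classpath, and "hive.input.format" is assumed to be the property name behind HiveConf.ConfVars.HIVEINPUTFORMAT. The sample input format name and file path are placeholders as well.

```java
import java.util.HashMap;
import java.util.Map;

public class SetInputAttributesSketch {

    /** Hypothetical stand-in for the work object; only the two getters used above. */
    public static class Work {
        private final String inputformat;
        private final String indexIntermediateFile;

        public Work(String inputformat, String indexIntermediateFile) {
            this.inputformat = inputformat;
            this.indexIntermediateFile = indexIntermediateFile;
        }

        public String getInputformat() {
            return inputformat;
        }

        public String getIndexIntermediateFile() {
            return indexIntermediateFile;
        }
    }

    // Assumption: the property name that HiveConf.ConfVars.HIVEINPUTFORMAT maps to.
    public static final String HIVE_INPUT_FORMAT_KEY = "hive.input.format";
    // Property name taken from the snippet quoted in the comment.
    public static final String INDEX_COMPACT_FILE_KEY = "hive.index.compact.file";

    /**
     * Copies the index-related attributes from the work object into the job
     * conf. Null attributes are skipped so existing settings stay untouched.
     */
    public static void setInputAttributes(Work work, Map<String, String> job) {
        if (work.getInputformat() != null) {
            job.put(HIVE_INPUT_FORMAT_KEY, work.getInputformat());
        }
        if (work.getIndexIntermediateFile() != null) {
            job.put(INDEX_COMPACT_FILE_KEY, work.getIndexIntermediateFile());
        }
    }

    public static void main(String[] args) {
        // Stand-in for the JobConf built before the child job is spawned.
        Map<String, String> job = new HashMap<>();
        // Placeholder values, not real Hive class names or paths.
        Work work = new Work("example.IndexInputFormat", "/tmp/example_index_result");

        // In the proposal, this call would sit in MapRedTask.execute.
        setInputAttributes(work, job);

        System.out.println(job.get(HIVE_INPUT_FORMAT_KEY));
        System.out.println(job.get(INDEX_COMPACT_FILE_KEY));
    }
}
```

With the logic in one helper, ExecDriver and MapRedTask would presumably share it rather than each carrying a copy of the two null checks.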

When I do this, then I can see the correct input format and intermediate file 
being set on the spawned job.  (Speaking of the intermediate file, can we get 
rid of /tmp/index_banana?  :)

The test passes with or without this change, indicating there could still be 
some other problem (since the point of the test is to demonstrate different 
behavior when the index is being used).  However, I'm not sure about the test 
itself since it is now using a range condition where before it was using an 
equality condition, and block-level indexing means a block could contain the 
extra values as long as a single value (47 in this case) is hit by the index.  
But you're using text files for some reason, and I still don't know exactly how 
the "blocks" work there.


> use filter pushdown for automatically accessing indexes
> -------------------------------------------------------
>
>                 Key: HIVE-1644
>                 URL: https://issues.apache.org/jira/browse/HIVE-1644
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: John Sichi
>            Assignee: Russell Melick
>         Attachments: HIVE-1644.1.patch, HIVE-1644.10.patch, 
> HIVE-1644.11.patch, HIVE-1644.12.patch, HIVE-1644.13.patch, 
> HIVE-1644.14.patch, HIVE-1644.15.patch, HIVE-1644.16.patch, 
> HIVE-1644.17.patch, HIVE-1644.2.patch, HIVE-1644.3.patch, HIVE-1644.4.patch, 
> HIVE-1644.5.patch, HIVE-1644.6.patch, HIVE-1644.7.patch, HIVE-1644.8.patch, 
> HIVE-1644.9.patch
>
>
> HIVE-1226 provides utilities for analyzing filters which have been pushed 
> down to a table scan.  The next step is to use these for selecting available 
> indexes and generating access plans for those indexes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
