Hi,

I'd like to know the current status of indexing in Hive. What I've
found so far is that the user has to manually specify the index table for
each query, something like this:

******************************************************
insert overwrite directory "/tmp/index_result" select `_bucketname` ,
`_offsets` from src_rc_index where key=0;

set hive.exec.index_file=/tmp/index_result;

-- use the index input format to prune input splits based on the offset
-- list stored in "hive.exec.index_file", which the previous command populated
set hive.input.format=org.apache.hadoop.hive.ql.index.io.HiveIndexInputFormat;

-- this query will not scan the whole base data
select key, value from src_rc where key=0;
*******************************************************

Is there any automatic plan generation in the 0.7.1 release that can make
use of existing indexes, or is there a patch available that does this?

Thanks,
Avrilia

