Hello,

When I use an index in Hive 1.2.1, the index does not seem to work. The details 
are as follows:

1. With the index enabled, query speed does not improve. If I use the index 
manually, the query is clearly faster, but when I switch to automatic use of 
the index, the speed is no different from not using an index at all.
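
For context, manual use of a compact index generally follows the pattern 
below. This is only a sketch under assumptions: the output directory 
/tmp/index_result is a placeholder, and the index table name assumes Hive's 
default naming for an index called table01_index on table01 in the default 
database (it can be checked with "show formatted index on table01;").

```sql
-- Step 1: query the index table for the files/offsets that contain id = 500000.
-- The index table name below is assumed from Hive's default naming scheme.
insert overwrite directory '/tmp/index_result'
select `_bucketname`, `_offsets`
from default__table01_table01_index__
where id = 500000;

-- Step 2: point the compact index input format at that result (placeholder path).
set hive.index.compact.file=/tmp/index_result;
set hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;

-- Step 3: run the original query; only the listed blocks should be scanned.
select * from table01 where id = 500000;
```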

2. After rebuilding the index, I add a new text file to the table directory 
that contains one record matching my query filter. The query results then 
include the record from the new text file, which suggests the index is not 
being consulted. (The same happens when a new record is appended to the same 
file but lands in a different block.)

3. When debugging the Hive source code, I found that the function 
generateIndexQuery of class CompactIndexHandler is never called. Eventually I 
found that the function compile in class TaskCompiler returns early at the 
following statements:
if (pCtx.getFetchTask() != null) {
  return;
}

This prevents the index from being used for the query. But I don't know why 
the FetchTask is set, because I know little about Hive.

--------------------------------------------------------------------------------------------------------
So, my questions are:

1. Does Hive 1.2.1 support indexes properly? If it supports indexes 
completely, what is my issue?

2. I want to know how indexes are used to optimize queries. Where can I find 
some references?

--------------------------------------------------------------------------------------------------------
Appendix:
How I use the index in Hive 1.2.1:

1. Create table and load data:

create table table01( id int, name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';
load data local inpath '/home/hadoop/data/dual.txt' overwrite into table table01;

2. Create and rebuild index:

create index table01_index on table table01(id)
as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
with deferred rebuild;
alter index table01_index on table01 rebuild;

3. Set properties:

set hive.optimize.index.filter.compact.minsize=0;
set hive.optimize.index.filter.compact.maxsize=-1;
set hive.index.compact.query.max.size=-1;
set hive.index.compact.query.max.entries=-1;
set hive.optimize.index.groupby=false;
set hive.optimize.index.filter=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

4. Execute query statement:

select * from table01 where id = 500000;

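
One way to check whether the early return on the FetchTask is what bypasses 
the index might be to compare query plans with fetch-task conversion turned 
off. The sketch below assumes that hive.fetch.task.conversion (a standard 
Hive property with values none, minimal and more) is what causes a simple 
"select * ... where" to be served by a FetchTask; whether disabling it 
actually restores the automatic index rewrite in 1.2.1 is not confirmed here.

```sql
-- Plan with the current settings. A plan that contains only a Fetch Operator
-- stage means no MapReduce job is launched for this query.
explain select * from table01 where id = 500000;

-- Assumption: with conversion disabled, the query is compiled into a
-- MapReduce task instead of a FetchTask, so compile() no longer returns early.
set hive.fetch.task.conversion=none;

-- Compare this plan with the first one to see whether it changes.
explain select * from table01 where id = 500000;
```
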
Thanks!


Jason
