Hi, I want to test the use of indexes in hive. For this I created anindex, I launched a first query above, I changed the settings on Hive and ran my query on my database table:
CREATE INDEX index ON TABLE table(key) as 'COMPACT' WITH DEFERRED REBUILD; ALTER INDEX index ON table REBUILD; INSERT OVERWRITE DIRECTORY "/tmp/index-result2" SELECT `_bucketname` , `_offsets` FROM default__table_index__ x WHERE x.key=100; SET hive.index.compact.file=/tmp/index_result2; SET hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat; SELECT key, col2, sum(col3) FROM tableWHERE key=100 group BY col2; No error but I am not sure that the index is really used. Indeed I get the same performance with and without the index. When I look at the logs (tasktracker, datanode, job_config), I see no call to the directory / tmp / index_result2. How do I know if my index has been really used? Thanks for your help, -- Guillaume WEILL
