Thanks Peter. It helped. That property combined with setting the property 'hive.optimize.index.filter' to 'true' got the index working.
thanks, Thilina On Mon, Feb 3, 2014 at 6:12 PM, Peter Marron < peter.mar...@trilliumsoftware.com> wrote: > Hi, > > > > Not sure if it is relevant to your problem but I'm just checking > > that you know about > > hive.optimize.index.filter.compact.minsize > > it's set to 5Gbytes by default and if the estimated query size is > > less than this then the index won't be used. > > HTH. > > > > Regards > > > > *Peter Marron* > > Senior Developer, Research & Development > > > > Office: +44 *(0) 118-940-7609* peter.mar...@trilliumsoftware.com > > Theale Court First Floor, 11-13 High Street, Theale, RG7 5AH, UK > > <https://www.facebook.com/pages/Trillium-Software/109184815778307> > > <https://twitter.com/TrilliumSW> > > <http://www.linkedin.com/company/17710> > > > > *www.trilliumsoftware.com <http://www.trilliumsoftware.com/>* > > Be Certain About Your Data. Be Trillium Certain. > > > > *From:* Thilina Gunarathne [mailto:cset...@gmail.com] > *Sent:* 03 February 2014 16:08 > *To:* user > *Subject:* Index not getting used for the queries > > > > Dear all, > > I created a compact index for a table with several hundred million records > as follows. The table is partitioned by the month. The index on A and B was > created successfully, but I can't see it getting used in the queries. It > would be great if one of you experts can shed some light on what am I > missing. I'm using hive 0.9. > > set hive.exec.parallel=false; > CREATE INDEX idx_yyyy > ON TABLE yyyy(a,b) > AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' > WITH DEFERRED REBUILD > COMMENT 'Index for yyyy table. Indexing on A and B'; > ALTER INDEX idx_yyyy on yyyy REBUILD; > > > hive> describe yyyy; > OK > a bigint > ... > b bigint > .... > > month int > > > > hive> show index on yyyy; > OK > idx_yyyy yyyy a, b > default__yyyy_p_idx_yyyy__ compact Index for tm top50 > table. Indexing on A and B > > > hive> explain select a,b from tm_top50_p where a=113231 and > month=201308; > OK > ABSTRACT SYNTAX TREE: > (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME yyyy))) (TOK_INSERT > (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR > (TOK_TABLE_OR_COL a)) (TOK_SELEXPR (TOK_TABLE_OR_COL b))) (TOK_WHERE (and > (= (TOK_TABLE_OR_COL a) 113231) (= (TOK_TABLE_OR_COL month) 201308))))) > > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 is a root stage > > STAGE PLANS: > Stage: Stage-1 > Map Reduce > Alias -> Map Operator Tree: > yyyy > TableScan > alias: yyyy > Filter Operator > predicate: > expr: (a = 113231) > type: boolean > Select Operator > expressions: > expr: a > type: bigint > expr: b > type: bigint > outputColumnNames: _col0, _col1 > File Output Operator > compressed: false > GlobalTableId: 0 > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > > Stage: Stage-0 > Fetch Operator > limit: -1 > > thanks a lot, > Thilina > > > -- > https://www.cs.indiana.edu/~tgunarat/ > http://www.linkedin.com/in/thilina > > http://thilina.gunarathne.org > -- https://www.cs.indiana.edu/~tgunarat/ http://www.linkedin.com/in/thilina http://thilina.gunarathne.org
<<image001.png>>
<<image002.png>>
<<image004.png>>
<<image003.png>>