"SELECT StartIp, EndIp, Country FROM ipv4geotable” should have been rewritten as a scan against index table.
BitMap Indexes seems to support inequalities (<=, <, >=). Post the explain plan. On 6/26/15, 8:56 PM, "Gopal Vijayaraghavan" <gop...@apache.org> wrote: >Hi, > >Hive indexes won¹t really help you speed up that query right now, because >of the plan it generates due to the <= clauses. > >> CREATETABLE ipv4table >> AS >> SELECT logon.IP, ipv4.Country >> FROM >> (SELECT * FROM logontable WHERE isIpv4(IP)) logon >> LEFT OUTER JOIN >> (SELECT StartIp, EndIp, Country FROM ipv4geotable) ipv4 ON >> isIpv4(logon.IP) >> WHERE ipv4.StartIp <=logon.IP AND logon.IP <= ipv4.EndIp; > >That¹s a cross-product join, which can¹t be distributed at all & will take >forever, even if you use Tez/LLAP. > >Range join queries have partial distribution rewrites, but AFAIK none of >them use filter indexes in hive. > >But before I suggest a rewrite, can you post the ³explain <query>;² for >the above query, so that I can check if this is indeed producing a >cross-product + filter? > >Cheers, >Gopal > > > > > > > >