why cant you reduce your query to select msbo1.PARENTID from msbo_phoenix_comp_rowkey where msbo1.PARENTTYPE = 'SHIPMENT' and msbo1.OWNERORGID = 100 and msbo1.MILESTONETYPEID != 19661 and msbo1.PARENTREFERENCETIME between 1479964000 and 1480464000 group by msbo1.PARENTID order by msbo1.PARENTID
?? looks like that's what you get by this left outer. *Cheers !!* Arvind On 22 February 2017 at 22:58, Pradheep Shanmugam < [email protected]> wrote: > Hi, > > We have a hbase cluster with 8 region servers with 20G memory > We have a table with 1 column family along with a secondary index. > Following query took only few milliseconds when we had less data(< 1 > million) > After adding more data(~30M rows) the performance declined and took about > a minute or more(not stable) > > select msbo1.PARENTID > from msbo_phoenix_comp_rowkey msbo1 > left outer join ( > select PARENTID,MILESTONETYPEID > from msbo_phoenix_comp_rowkey > where PARENTREFERENCETIME between 1479964000 and 1480464000 > <01480%20464%20000> > and OWNERORGID = 100 > and PARENTTYPE = 'SHIPMENT' > and MILESTONETYPEID = 19661 > group by PARENTID,MILESTONETYPEID > ) msbo2 > on msbo1.PARENTID = msbo2.PARENTID > where msbo1.PARENTTYPE = 'SHIPMENT' > and msbo1.OWNERORGID = 100 > and msbo2.MILESTONETYPEID is null > and msbo1.PARENTREFERENCETIME between 1479964000 and 1480464000 > <01480%20464%20000> > group by msbo1.PARENTID > order by msbo1.PARENTID > > The RHS return about a 500K rows ..LHS about 18M rows…final result about > 500K rows > > MSBO_PHOENIX_COMP_ROWKEY_INDEX_PARENTTYPE_OWNERORGID_MILESTONETYPEID_PARENTREFERENCETIME > is the index > > *Query plan:* > CLIENT 8-CHUNK PARALLEL 8-WAY RANGE SCAN OVER MSBO_PHOENIX_COMP_ROWKEY_ > INDEX_PARENTTYPE_OWNERORGID_MILESTONETYPEID_PARENTREFERENCETIME > [0,'SHIPMENT',100] > SERVER FILTER BY FIRST KEY ONLY AND > (TO_UNSIGNED_LONG("PARENTREFERENCETIME") > >= 1477958400 AND TO_UNSIGNED_LONG("PARENTREFERENCETIME") <= 1480464000 > <01480%20464%20000>) > SERVER AGGREGATE INTO DISTINCT ROWS BY ["MSBO1.:PARENTID"] > CLIENT MERGE SORT > PARALLEL LEFT-JOIN TABLE 0 > CLIENT 8-CHUNK PARALLEL 8-WAY RANGE SCAN OVER > MSBO_PHOENIX_COMP_ROWKEY_INDEX_PARENTTYPE_OWNERORGID_MILESTONETYPEID_PARENTREFERENCETIME > [0,'SHIPMENT',100,19661,1,477,958,400] - [0,'SHIPMENT',100,19661,1,480, > 464,000] > SERVER FILTER BY FIRST KEY ONLY > SERVER AGGREGATE INTO DISTINCT ROWS BY ["PARENTID", > "MILESTONETYPEID"] > CLIENT MERGE SORT > AFTER-JOIN SERVER FILTER BY MSBO2.MILESTONETYPEID IS NULL > > Attached the phoenix log. > I see the caching to set as 100..and "maxResultSize”:2097152..is that > something that can be tuned will help? > Is that the client merge sort consuming more time can be improved? Is > there any other tuning possible? > > Thanks, > Pradheep >
