As Maryann said, secondary index would help a lot in your case. Beside that if your HBase cluster consists of at-least couple of region servers and the data is distributed, then stats <http://phoenix.apache.org/update_statistics.html> available in latest Phoenix release should get better performance.
On Tue, Nov 25, 2014 at 12:30 PM, Maryann Xue <maryann....@gmail.com> wrote: > Your query time should be the sum of running the following two individual > queries: > 1. SELECT MAX("Records"."timestamp") FROM "History" where "rowId" like ? > 2. select "Records"."Operation", "Records"."status", "Records"."timestamp" > from "History" where "Records"."timestamp"=? > > Without secondary indexing on column "timestamp", the first query will be > full-scan with aggregation and the second will be a full-scan with a filter > on the "timestamp" column. > > Generally the higher parallel level (which depends on how many nodes you > have in your cluster and how many regions you have for the target table) > the better performance you get. > > In this case, you could also have an secondary index on "timestamp" column > to avoid a full-table scan for the second query. > > > Thanks, > Maryann > > On Tue, Nov 25, 2014 at 1:26 AM, Ahmed Hussien <aahussi...@gmail.com> > wrote: > > > Good Day to you guys, > > > > I have a db that contains about 33 million records per table (4 tables). > > when i execute the following queries it takes to much time to retrieve > > about 5 miniutes per query. > > > > 1- select "Records"."Operation", "Records"."status", > "Records"."timestamp" > > from "History" where "Records"."timestamp"=(SELECT > > MAX("Records"."timestamp") FROM "History" where "rowId" like ?) > > > > can you help me adjust the performance? > > --- > > Ahmed > > > > > > > -- > Thanks, > Maryann >