Does anyone have suggestions for the performance issue discussed in this thread? Any input would help me resolve it.
Infrastructure details: Azure HDInsight HBase

Type       Node Size  Cores  Nodes
Head       D3 V2      8      2
Region     D3 V2      16     4
ZooKeeper  D3 V2      12     3

Thanks,
Sasikumar Natarajan.

On Fri, Sep 23, 2016 at 7:57 AM, Sasikumar Natarajan <[email protected]> wrote:

> Also, it is not only the first call to ResultSet.next() that takes time.
>
> When we iterate over the ResultSet, it takes a long time initially and then
> iterates faster. After a number of iterations it stalls again for some time,
> and this pattern repeats.
>
> Sample observation:
>
> Total rows available in the ResultSet: 5130
> Statement.executeQuery() took: 702 ms
> ResultSet indices at which next() took a long time: *0* (7965 ms),
> *2041* (7155 ms), *4126* (1630 ms)
>
> On Fri, Sep 23, 2016 at 7:52 AM, Sasikumar Natarajan <[email protected]>
> wrote:
>
>> Hi Ankit,
>> Where does the server processing happen: on the HBase cluster or on the
>> server where Phoenix core runs?
>>
>> Please find below the details you asked for.
>>
>> Query:
>>
>> SELECT col1, col2, col5, col7, col11, col12
>> FROM SPL_FINAL
>> WHERE col1 = 'MK00100' AND col2 = 'YOU' AND col3 = 4
>>   AND col5 IN (?,?,?,?,?)
>>   AND ((col7 BETWEEN to_date('2016-08-01 00:00:00.000')
>>                  AND to_date('2016-08-05 23:59:59.000'))
>>     OR (col8 BETWEEN to_date('2016-08-01 00:00:00.000')
>>                  AND to_date('2016-08-05 23:59:59.000')))
>>
>> Explain plan:
>>
>> CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SPL_FINAL ['MK00100','YOU',4]
>>     SERVER FILTER BY (COL5 IN ('100','101','105','234','653') AND
>>     ((COL7 >= TIMESTAMP '2016-08-01 00:00:00.000' AND
>>       COL7 <= TIMESTAMP '2016-08-05 23:59:59.000') OR
>>      (COL8 >= TIMESTAMP '2016-08-01 00:00:00.000' AND
>>       COL8 <= TIMESTAMP '2016-08-05 23:59:59.000')))
>>
>> DDL:
>>
>> CREATE TABLE IF NOT EXISTS SPL_FINAL
>> (col1 VARCHAR NOT NULL,
>> col2 VARCHAR NOT NULL,
>> col3 INTEGER NOT NULL,
>> col4 INTEGER NOT NULL,
>> col5 VARCHAR NOT NULL,
>> col6 VARCHAR NOT NULL,
>> col7 TIMESTAMP NOT NULL,
>> col8 TIMESTAMP NOT NULL,
>> ext.col9 VARCHAR,
>> ext.col10 VARCHAR,
>> pri.col11 VARCHAR[], -- this column contains 3600 items in every row
>> pri.col12 VARCHAR,
>> ext.col13 BOOLEAN,
>> CONSTRAINT SPL_FINAL_PK PRIMARY KEY (col1, col2, col3, col4, col5, col6,
>> col7, col8)) COMPRESSION='SNAPPY';
>>
>> Thanks,
>> Sasikumar Natarajan.
>>
>> On Thu, Sep 22, 2016 at 12:36 PM, Ankit Singhal <[email protected]>
>> wrote:
>>
>>> Share some more details about the query, DDL and explain plan. In
>>> Phoenix, there are cases where we do some server processing at the time
>>> rs.next() is called for the first time, but subsequent next() calls
>>> should be faster.
>>>
>>> On Thu, Sep 22, 2016 at 9:52 AM, Sasikumar Natarajan <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>> I'm using the Apache Phoenix core 4.4.0-HBase-1.1 library to query the
>>>> data available on the Phoenix server.
>>>>
>>>> preparedStatement.executeQuery() seems to take little time, but
>>>> entering *while (rs.next()) {}* takes a long time. I would like to
>>>> know what is causing the delay in making the ResultSet ready. Please
>>>> share your thoughts on this.
>>>>
>>>> --
>>>> Regards,
>>>> Sasikumar Natarajan
>>>
>>
>> --
>> Regards,
>> Sasikumar Natarajan
>
> --
> Regards,
> Sasikumar Natarajan

--
Regards,
Sasikumar Natarajan
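
For anyone who wants to reproduce the per-index timings quoted in this thread, the following is a minimal sketch of one way they could be collected. It is not the code used for the numbers above. The names RowTimer, Advance, SlowStep and findSlowSteps are hypothetical and are not part of Phoenix or JDBC. The sketch only assumes that each call to next() can be timed on its own and compared with a threshold.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: records which calls to next() exceed a time threshold. */
final class RowTimer {

    /** One iteration step. java.sql.ResultSet::next fits this shape. */
    public interface Advance {
        boolean next() throws Exception;
    }

    /** Index and duration (in ms) of a single slow call to next(). */
    public static final class SlowStep {
        public final int index;
        public final long millis;

        SlowStep(int index, long millis) {
            this.index = index;
            this.millis = millis;
        }

        @Override
        public String toString() {
            return index + " (" + millis + " ms)";
        }
    }

    /**
     * Calls cursor.next() until it returns false and returns every call
     * that took at least thresholdMillis. The final call (the one that
     * returns false) is timed too, so its index equals the row count.
     */
    public static List<SlowStep> findSlowSteps(Advance cursor, long thresholdMillis)
            throws Exception {
        List<SlowStep> slow = new ArrayList<>();
        int index = 0;
        while (true) {
            long start = System.nanoTime();
            boolean hasRow = cursor.next();
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= thresholdMillis) {
                slow.add(new SlowStep(index, elapsedMillis));
            }
            if (!hasRow) {
                break;
            }
            index++;
        }
        return slow;
    }
}
```

With an open java.sql.ResultSet named rs, a call such as RowTimer.findSlowSteps(rs::next, 1000) would list the indices where a single next() took one second or longer. Reading the column values inside the loop is deliberately left out, so only the cost of advancing the cursor is measured.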
