Does anyone have suggestions for the performance issue discussed in this
thread? Any suggestions would help me resolve it.

Infrastructure details:

Azure HDInsight HBase

Type       Node Size  Cores  Nodes
Head       D3 V2      8      2
Region     D3 V2      16     4
ZooKeeper  D3 V2      12     3

Thanks,
Sasikumar Natarajan.
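
A minimal sketch of how the stalls reported in the sample observation quoted below could be located, assuming plain JDBC iteration. The names SlowNextProbe, Cursor, Stall and findStalls are hypothetical helpers for illustration, not part of Phoenix or JDBC; the only real API assumed is ResultSet.next().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Phoenix or JDBC): finds slow next() calls. */
class SlowNextProbe {

    /** Anything that advances a cursor, e.g. rs::next for a java.sql.ResultSet. */
    interface Cursor {
        boolean next() throws Exception;
    }

    /** One slow call: the zero-based row index and how long next() blocked. */
    static final class Stall {
        final int rowIndex;
        final long millis;

        Stall(int rowIndex, long millis) {
            this.rowIndex = rowIndex;
            this.millis = millis;
        }

        @Override
        public String toString() {
            return rowIndex + " (" + millis + " ms)";
        }
    }

    /** Drains the cursor and returns every row whose next() call took at least thresholdMillis. */
    static List<Stall> findStalls(Cursor cursor, long thresholdMillis) throws Exception {
        List<Stall> stalls = new ArrayList<>();
        int rowIndex = 0;
        while (true) {
            long start = System.nanoTime();
            boolean hasRow = cursor.next();
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (!hasRow) {
                break; // the final next() that returns false is not recorded
            }
            if (elapsedMillis >= thresholdMillis) {
                stalls.add(new Stall(rowIndex, elapsedMillis));
            }
            rowIndex++;
        }
        return stalls;
    }
}
```

With a real ResultSet this might be called as SlowNextProbe.findStalls(rs::next, 1000). If the stalls recur at a roughly constant row interval, that would suggest they coincide with the client fetching the next batch of rows from the server, though that is only a guess until the batch size is checked.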


On Fri, Sep 23, 2016 at 7:57 AM, Sasikumar Natarajan <sasi...@gmail.com>
wrote:

> Also, it is not only the first call to ResultSet.next() that takes a long
> time.
>
> When we iterate over the ResultSet, the first call takes a long time and
> then iteration is fast. After some iterations, a call stalls again, and
> this pattern repeats.
>
>
>
> Sample observation:
>
>
>
> Total Rows available on ResultSet : 5130
>
> Statement.executeQuery() has taken : 702 ms
>
> ResultSet indices at which next() took a long time : *0* (7965 ms),
> *2041* (7155 ms), *4126* (1630 ms)
>
> On Fri, Sep 23, 2016 at 7:52 AM, Sasikumar Natarajan <sasi...@gmail.com>
> wrote:
>
>> Hi Ankit,
>>            Where does the server processing happen: on the HBase cluster
>> or on the server where Phoenix core runs?
>>
>> Please find below the details you asked for:
>>
>> Query:
>>
>> SELECT col1, col2, col5, col7, col11, col12 FROM SPL_FINAL where
>> col1='MK00100' and col2='YOU' and col3=4 and col5 in (?,?,?,?,?) and ((col7
>> between to_date('2016-08-01 00:00:00.000') and to_date('2016-08-05
>> 23:59:59.000')) or (col8 between to_date('2016-08-01 00:00:00.000') and
>> to_date('2016-08-05 23:59:59.000')))
>>
>>
>> Explain plan:
>>
>> CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SPL_FINAL
>> ['MK00100','YOU',4]
>>     SERVER FILTER BY (COL5 IN ('100','101','105','234','653') AND ((COL7
>> >= TIMESTAMP '2016-08-01 00:00:00.000' AND COL7 <= TIMESTAMP '2016-08-05
>> 23:59:59.000') OR (COL8 >= TIMESTAMP '2016-08-01 00:00:00.000' AND COL8 <=
>> TIMESTAMP '2016-08-05 23:59:59.000')))
>>
>> DDL:
>>
>> CREATE TABLE IF NOT EXISTS SPL_FINAL
>> (col1 VARCHAR NOT NULL,
>> col2 VARCHAR NOT NULL,
>> col3 INTEGER NOT NULL,
>> col4 INTEGER NOT NULL,
>> col5 VARCHAR NOT NULL,
>> col6 VARCHAR NOT NULL,
>> col7 TIMESTAMP NOT NULL,
>> col8 TIMESTAMP NOT NULL,
>> ext.col9 VARCHAR,
>> ext.col10 VARCHAR,
>> pri.col11 VARCHAR[], //this column contains 3600 items in every row
>> pri.col12 VARCHAR,
>> ext.col13 BOOLEAN
>> CONSTRAINT SPL_FINAL_PK PRIMARY KEY (col1, col2, col3, col4, col5, col6,
>> col7, col8)) COMPRESSION='SNAPPY';
>>
>> Thanks,
>> Sasikumar Natarajan.
>>
>> On Thu, Sep 22, 2016 at 12:36 PM, Ankit Singhal <ankitsingha...@gmail.com
>> > wrote:
>>
>>> Share some more details about the query, DDL and explain plan. In
>>> Phoenix, there are cases where we do some server processing when
>>> rs.next() is called the first time, but subsequent next() calls should
>>> be faster.
>>>
>>> On Thu, Sep 22, 2016 at 9:52 AM, Sasikumar Natarajan <sasi...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>     I'm using Apache Phoenix core 4.4.0-HBase-1.1 library to query the
>>>> data available on Phoenix server.
>>>>
>>>> preparedStatement.executeQuery() returns quickly, but entering
>>>> *while (rs.next()) {}* takes a long time. I would like to
>>>> know what causes the delay in making the ResultSet ready. Please share
>>>> your thoughts on this.
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Sasikumar Natarajan
>>>>
>>>
>>>
>>
>>
>> --
>> Regards,
>> Sasikumar Natarajan
>>
>
>
>
> --
> Regards,
> Sasikumar Natarajan
>



-- 
Regards,
Sasikumar Natarajan
