Hi Sergey,

Thanks for your response. I understand the issue now.
Could you please shed some light on the Phoenix threads and HConnections? Does
each Phoenix thread create 30 HConnections?
Assuming we run 30 concurrent read queries, that would need 30 * 30 = 900
HConnections, but I believe we only have 256 (is there a parameter to tweak
this?). Will the parallel scans then be converted to serial scans, reducing
query performance?
Does increasing phoenix.query.threadPoolSize (from 128 to 256) also increase
the HConnection thread pool size, and if so, by how much?
Should we consider reducing the salting (from 30 buckets to 10) given this
large multiplication factor on the number of threads? We originally chose 30
salt buckets because this is a write-heavy table and the total number of rows
could grow to several billion.
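For reference, here is a minimal sketch of how I would try bumping that
property from the client side (assumption on my part: we normally put it in
the client-side hbase-site.xml; passing it as a JDBC connection property is
shown here only to make the question concrete, and the ZooKeeper host is a
placeholder):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class PhoenixPoolSizeExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Phoenix client thread pool used for parallel scans (default 128).
            // Assumption: passing it here has the same effect as setting it in
            // the client-side hbase-site.xml.
            props.setProperty("phoenix.query.threadPoolSize", "256");
            // "zk-host" is a placeholder for our ZooKeeper quorum.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:phoenix:zk-host", props)) {
                System.out.println("Connected: " + !conn.isClosed());
            }
        }
    }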

Thanks,
Pradheep

From: <sergey.solda...@gmail.com> on behalf of Sergey Soldatov 
<sergeysolda...@gmail.com>
Reply-To: "user@phoenix.apache.org" <user@phoenix.apache.org>
Date: Tuesday, May 22, 2018 at 1:45 PM
To: "user@phoenix.apache.org" <user@phoenix.apache.org>
Subject: Re: Phoenix Client threads

The salting byte is calculated with a hash function over the whole row key
(all PK columns). So if you use only one of the PK columns in the WHERE
clause, Phoenix cannot determine which salting byte (bucket number) applies,
and it runs scans against all salt buckets. All of those threads are
lightweight and spend most of their time waiting for responses from the HBase
servers, so one option is to raise the nproc limit. Alternatively, you can
decrease the number of Phoenix threads via the phoenix.query.threadPoolSize
property, or reduce the number of salt buckets.
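To make that concrete, here is a toy sketch of the idea (this is NOT Phoenix's
actual SaltingUtil code; the hash and modulo are simplified), just to show why
a partial key cannot be mapped to a single bucket:

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    public class SaltSketch {
        // Toy illustration: the bucket is a hash of the FULL row key, so with
        // only one PK column known the client cannot compute it and must scan
        // every bucket.
        static int bucketFor(byte[] fullRowKey, int saltBuckets) {
            // Simplified hash; Phoenix uses its own hash internally.
            return Math.floorMod(Arrays.hashCode(fullRowKey), saltBuckets);
        }

        public static void main(String[] args) {
            byte[] fullKey = "4|1F64F5DY0J0A03692|someId2"
                    .getBytes(StandardCharsets.UTF_8);
            byte[] partialKey = "4|1F64F5DY0J0A03692"
                    .getBytes(StandardCharsets.UTF_8);
            // Bucket the row actually lives in:
            System.out.println(bucketFor(fullKey, 30));
            // Hash of the partial key is generally different -> useless for pruning:
            System.out.println(bucketFor(partialKey, 30));
        }
    }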

Thanks,
Sergey

On Tue, May 22, 2018 at 8:52 AM, Pradheep Shanmugam 
<pradheep.shanmu...@infor.com> wrote:
Hi,

We have a table with the primary key (type, id1, id2) (type is the same for
all rows, whereas id1 and id2 are unique for each row), salted with 30 salt
buckets.
The load on this table is about 30 queries/sec, with each query taking ~6 ms.
We are using the Phoenix 4.7.0 non-thin (thick) client.
We have a query like the one below:
SELECT tab.a, tab.b
FROM tab
WHERE tab.id1 = '1F64F5DY0J0A03692'
AND tab.type = 4
AND tab.isActive = 1;

CLIENT 30-CHUNK 0 ROWS 0 BYTES PARALLEL 30-WAY ROUND ROBIN RANGE SCAN OVER TAB 
[0,4, '1F64F5DY0J0A03692']
    SERVER FILTER BY TS.ISACTIVE = 1

Here I can see that about 30 threads are used for this query. Since 'type' is
the same for all rows, I assumed that was why Phoenix looks into all the
chunks to find the key, hence the 30 threads.

Then I ran the same query on a similar table with the key columns rearranged
as (id1, id2, type), also salted with 30 buckets.

But I still see the same 30 threads being used, even though I thought Phoenix
could uniquely identify the row from the given id1, which should live in just
one of the chunks. Is it because of salting that it does not know where the
key is?

CLIENT 30-CHUNK PARALLEL 30-WAY ROUND ROBIN RANGE SCAN OVER TAB [0, 
'1F64F5DY0J0A03692']
    SERVER FILTER BY (TYPE = 4 AND TS.ISACTIVE = 1)

Currently I am exceeding the nproc limit set on my app server (128 Phoenix
threads plus HConnection threads reaching 256, i.e. 384 threads). Could you
please shed some light on Phoenix connections and HConnections, how to reduce
them to a reasonable level, and on the query plans above? Should we consider
reducing the salt bucket count to 10 (we have 10 region servers)?
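For reference, here is a rough sketch of how I count these threads from inside
the app server (assumption: the pool threads have "phoenix" and "hconnection"
in their names, which matches what I see in the thread dumps):

    public class ThreadCounter {
        public static void main(String[] args) {
            long phoenix = 0, hconn = 0;
            for (Thread t : Thread.getAllStackTraces().keySet()) {
                // Assumed naming convention, based on our jstack output.
                String name = t.getName().toLowerCase();
                if (name.contains("phoenix")) phoenix++;
                else if (name.contains("hconnection")) hconn++;
            }
            System.out.println("phoenix threads: " + phoenix
                    + ", hconnection threads: " + hconn);
        }
    }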

Thanks,
Pradheep
