[
https://issues.apache.org/jira/browse/PHOENIX-971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326480#comment-14326480
]
Nick Dimiduk commented on PHOENIX-971:
--------------------------------------
My experiments show the shared Phoenix connection is the bottleneck for a query
server. It seems a single process will share the underlying ExecutorService
across all PhoenixDriver instances, even if multiple JDBC connections are used.
This is consistent with the advised usage pattern for HBase clients. The single
pool is used by both the HTable instance and Phoenix, so the query parallelism
quickly saturates even an extremely large pool. For example, a table with 255
salt partitions will result in a PARALLEL 255-WAY SCAN for many simple queries,
which means 255 Phoenix work jobs submitted to the pool. The same pool is used
by the HBase connection for managing open sockets with RegionServers,
AsyncProcess submissions, etc. For a query server supporting hundreds of
concurrent clients against a moderately sized HBase cluster (say, 50 nodes,
10k regions), the underlying pool will be servicing hundreds of thousands to
millions of work requests, most of which are doing or waiting on IO. I fear
context switching and thread management will become a burden before the real
work gets done.
Those of you who know the HBase client threading model better than I do: does
this sound right?
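To make the arithmetic concrete, here is a hypothetical back-of-the-envelope simulation (not Phoenix code; the client count, salt-bucket count, and pool size are illustrative assumptions) showing how IO-bound scan tasks from many clients pile up on one shared fixed-size pool:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolSaturation {
    public static void main(String[] args) throws Exception {
        final int clients = 100;     // assumed concurrent query-server clients
        final int saltBuckets = 255; // one PARALLEL 255-WAY SCAN per query
        final int poolSize = 256;    // an "extremely large" shared pool

        // One pool shared by everything, mirroring the single ExecutorService
        // shared across PhoenixDriver instances in one process.
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        int submitted = clients * saltBuckets; // 25,500 tasks queued at once

        for (int i = 0; i < submitted; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(10); // stand-in for a scan RPC waiting on IO
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
        System.out.println("tasks=" + completed.get() + " poolSize=" + poolSize);
    }
}
```

Even with only 100 clients, a single query round puts 25,500 mostly-idle tasks behind 256 threads; scale the client count or cluster size up and the pool spends its time queueing and context switching rather than doing work.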
> Query server
> ------------
>
> Key: PHOENIX-971
> URL: https://issues.apache.org/jira/browse/PHOENIX-971
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Andrew Purtell
> Assignee: Nick Dimiduk
> Fix For: 5.0.0
>
>
> Host the JDBC driver in a query server process that can be deployed as a
> middle tier between lighter weight clients and Phoenix+HBase. This would
> serve a similar optional role in Phoenix deployments as the
> [HiveServer2|https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2]
> does in Hive deploys.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)