[ https://issues.apache.org/jira/browse/PHOENIX-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Jasani updated PHOENIX-7370:
----------------------------------
    Priority: Critical  (was: Major)

> Server to server system table RPC calls should use separate RPC handler pool
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-7370
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7370
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.2.0
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Critical
>
> HBase uses an RPC (Remote Procedure Call) framework for all wire
> communication among its components: client to server (client to master
> daemon or client to regionservers) as well as server to server (master to
> regionserver, regionserver to regionserver). HBase RPC uses Google's
> Protocol Buffers (protobuf) to define the structure of the messages sent
> between clients and servers. Protocol Buffers allow efficient
> serialization and deserialization of data, which is crucial for performance.
> HBase defines its service interfaces using Protocol Buffers. These
> interfaces outline the operations that clients can request from HBase
> servers, such as get, put and scan.
>
> HBase also provides Coprocessors, which extend the functionality of the
> regionservers. They allow custom code to execute within the context of the
> regionserver during specific phases of a workflow, such as data reads
> (preScan, postScan etc), writes (preBatchMutate, postBatchMutate etc),
> region splits, or the start or end of regionserver operations. In addition
> to being a SQL query engine, Phoenix is also a Coprocessor component. The
> protobuf-based RPC framework also defines how clients communicate with the
> coprocessor endpoints running on the regionservers.
>
> A Phoenix client creates a CQSI connection ({{ConnectionQueryServices}}),
> which maintains a long-lived TCP connection with the HBase server, usually
> known as {{HConnection}} or HBase Connection. Once the connection is
> created, it is cached by the Phoenix client.
>
> While PHOENIX-6066 is considered the correct fix to improve query
> performance, releasing it has surfaced other issues related to the RPC
> framework. One of them caused a deadlock on the regionserver serving
> SYSTEM.CATALOG: it could not make any further progress because all handler
> threads serving RPC calls for Phoenix system tables (thread pool:
> {{RpcServer.Metadata.Fifo.handler}}) were exhausted while creating a
> server side connection from that regionserver.
>
> Several workflows in the MetaDataEndpointImpl coproc require a Phoenix
> connection, which is usually a CQSI connection. Phoenix differentiates CQSI
> connections initiated by clients from those initiated by servers using a
> property: {{IS_SERVER_CONNECTION}}. For CQSI connections created by
> servers, IS_SERVER_CONNECTION is set to true.
>
> Under heavy load, when several clients execute getTable() calls for the
> same base table simultaneously, the MetaDataEndpointImpl coproc first
> attempts to create a server side CQSI connection. As CQSI initialization
> also depends on the Phoenix system tables existence check and on the client
> to server version compatibility check, it performs a
> MetaDataEndpointImpl#getVersion() RPC call, which is meant to be served by
> the RpcServer.Metadata.Fifo.handler thread-pool. However, under heavy load,
> the thread-pool can be completely occupied if all getTable() calls try to
> initiate the CQSI connection, whereas only a single thread can take the
> global CQSI lock to initiate the HBase Connection before caching the CQSI
> connection for other threads to use. The thread holding the lock then waits
> on a getVersion() call that no free handler can serve, while the remaining
> handlers wait for the lock. This has the potential to create a deadlock.
>
> h3. Solutions:
>  * Phoenix server to server system table RPC calls are supposed to use
> separate handler thread-pools (PHOENIX-6687). However, this does not work
> correctly because, regardless of whether the HBase Connection is initiated
> by a client or a server, Phoenix only provides ClientRpcControllerFactory
> by default. We need to provide a separate RpcControllerFactory during the
> HBase Connection initialization done by Coprocessors that operate on
> regionservers.
>  * When a Phoenix server creates a CQSI connection, we do not need to check
> for the existence of system tables or for client-server version
> compatibility. This redundant RPC call can be avoided.
>
> Doc on HBase/Phoenix RPC Scheduler Framework:
> https://docs.google.com/document/d/12SzcAY3mJVsN0naMnq45qsHcUIk1CzHsAI0EOi6IIgg/edit?usp=sharing

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
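
Editor's illustration (not part of the Jira issue): the handler exhaustion described in the issue, and the effect of the first proposed solution, can be sketched with plain java.util.concurrent thread pools. This is only a simplified model under stated assumptions. It uses no HBase or Phoenix classes, and the class name HandlerPoolSketch, the method simulate, the pool size of 4 and the timeouts are all hypothetical. Each simulated getTable() call occupies one handler, takes a global lock, and issues a nested getVersion() call that is served either by the same pool or by a separate one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

final class HandlerPoolSketch {
    // Hypothetical number of metadata handler threads.
    static final int HANDLERS = 4;

    /**
     * Simulates HANDLERS concurrent getTable() calls. Returns true if all of
     * them complete within timeoutMs, false if they stall.
     */
    static boolean simulate(boolean separateServerPool, long timeoutMs) {
        ExecutorService metadataPool = Executors.newFixedThreadPool(HANDLERS);
        // Nested server to server calls go to their own pool, or share the metadata pool.
        ExecutorService versionPool = separateServerPool
                ? Executors.newFixedThreadPool(HANDLERS) : metadataPool;
        ReentrantLock connectionLock = new ReentrantLock();        // stands in for the global CQSI lock
        AtomicBoolean connectionCached = new AtomicBoolean(false); // stands in for the cached connection
        CountDownLatch allHandlersBusy = new CountDownLatch(HANDLERS);

        List<Future<?>> calls = new ArrayList<>();
        for (int i = 0; i < HANDLERS; i++) {
            Future<?> call = metadataPool.submit(() -> {
                // Wait until every handler thread is occupied by a getTable() call.
                allHandlersBusy.countDown();
                allHandlersBusy.await();
                connectionLock.lockInterruptibly();
                try {
                    if (!connectionCached.get()) {
                        // Nested "getVersion()" call made while holding the lock.
                        versionPool.submit(() -> "version").get();
                        connectionCached.set(true);
                    }
                } finally {
                    connectionLock.unlock();
                }
                return null;
            });
            calls.add(call);
        }

        try {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            for (Future<?> call : calls) {
                long remaining = Math.max(1L, deadline - System.nanoTime());
                call.get(remaining, TimeUnit.NANOSECONDS);
            }
            return true;
        } catch (TimeoutException stalled) {
            return false; // handlers are stuck waiting on each other
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Interrupt any stuck handlers so the JVM can exit.
            metadataPool.shutdownNow();
            versionPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool completes:   " + simulate(false, 1000));
        System.out.println("separate pool completes: " + simulate(true, 1000));
    }
}
```

With the shared pool, the nested call is queued behind handlers that are all blocked, so the run should time out and report false. With a separate pool the nested call is served and every getTable() call completes. The second proposed solution would correspond to skipping the nested call altogether for server side connections.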