Chance Li created HBASE-20303:
---------------------------------

             Summary: RS RPC server should not allow the response queue size to be too large
                 Key: HBASE-20303
                 URL: https://issues.apache.org/jira/browse/HBASE-20303
             Project: HBase
          Issue Type: Improvement
          Components: rpc
         Environment: 2000+ Region Servers
            Reporter: Chance Li
            Assignee: Chance Li
             Fix For: 3.0.0

With async clients, in some scenarios the RS RPC server can run into Full GC because of a large response queue. Netty provides a WriteBufferHighWaterMark on each channel, but this doesn't meet the RS's needs (consider 5k ~ 10k sockets on one RS: at 10M per channel it would need 50G ~ 100G of heap).

The RS RPC server will add a global response buffer water mark (2G by default). When this threshold is reached, the RS will not serve any request. The RS RPC server will also add a per-channel water mark (100M by default), because a channel that reaches it most likely belongs to an abnormal client.

We created a unit test to simulate the abnormal case: a client that has only 1 socket can cause the RS to occupy up to 100M of heap.

Notes:
1. For client compatibility, we still use the existing exception (CALL_QUEUE_TOO_BIG_EXCEPTION), but the error message is different.
2. This does not apply to SimpleRpcServer, because it's rarely used.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
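
To make the two water marks described in this issue concrete, here is a minimal sketch of how a global limit and a per-channel limit on pending response bytes could be tracked together. It is not the HBase patch: the class name ResponseBufferThrottle, its method names, and the use of a String channel id are all hypothetical, chosen only for illustration. The limits are constructor parameters, so the 2G and 100M defaults mentioned above would simply be passed in.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch of a two-level response buffer water mark.
 * Not the actual HBase implementation; all names are assumptions.
 */
final class ResponseBufferThrottle {
    private final long globalLimitBytes;   // e.g. 2G in the issue description
    private final long channelLimitBytes;  // e.g. 100M in the issue description
    private final AtomicLong globalPending = new AtomicLong();
    private final ConcurrentHashMap<String, AtomicLong> pendingByChannel =
        new ConcurrentHashMap<>();

    ResponseBufferThrottle(long globalLimitBytes, long channelLimitBytes) {
        this.globalLimitBytes = globalLimitBytes;
        this.channelLimitBytes = channelLimitBytes;
    }

    /** Returns true if a new request arriving on this channel may be served. */
    boolean shouldAccept(String channelId) {
        // Global water mark: once reached, no request is served at all.
        if (globalPending.get() >= globalLimitBytes) {
            return false;
        }
        // Per-channel water mark: only the offending channel is rejected.
        AtomicLong pending = pendingByChannel.get(channelId);
        return pending == null || pending.get() < channelLimitBytes;
    }

    /** Call when a response of the given size is queued for writing. */
    void onResponseQueued(String channelId, long bytes) {
        globalPending.addAndGet(bytes);
        pendingByChannel
            .computeIfAbsent(channelId, k -> new AtomicLong())
            .addAndGet(bytes);
    }

    /** Call when a queued response has been flushed to the socket. */
    void onResponseWritten(String channelId, long bytes) {
        globalPending.addAndGet(-bytes);
        AtomicLong pending = pendingByChannel.get(channelId);
        if (pending != null) {
            pending.addAndGet(-bytes);
        }
    }

    /** Call when a channel closes; releases whatever it still had pending. */
    void onChannelClosed(String channelId) {
        AtomicLong pending = pendingByChannel.remove(channelId);
        if (pending != null) {
            globalPending.addAndGet(-pending.get());
        }
    }

    long globalPendingBytes() {
        return globalPending.get();
    }
}
```

In this sketch a caller would check shouldAccept before dispatching a request and, on false, fail the call with the existing exception type named in note 1. The checks are deliberately approximate under concurrency (a few requests can slip past the limit between the check and the queueing), which is usually acceptable for a memory guard of this kind.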