[ 
https://issues.apache.org/jira/browse/HBASE-29141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Umesh Kumar Kumawat updated HBASE-29141:
----------------------------------------
    Description: 
*TLDR* 

In HBase's SimpleRpcScheduler we have call queues and handlers, and each queue is assigned a portion of the handlers. The config {{hbase.ipc.server.max.callqueue.length}} controls the maximum length of a queue. If this config is not set, HBase computes a default max length for each queue using {{{}DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER{}}}. The issue is that although each queue is served by only a portion of the handlers, the default length is calculated from the count of all the handlers.

*Description using configs and code links*

_____________________________________________________________________________________________________________________

Regarding handlers and queues, HBase has these configs.
 * {{hbase.regionserver.handler.count}} => Number of handlers. Default is 30.
 * {{hbase.ipc.server.callqueue.handler.factor}} => Queue-to-handler ratio. Default is 0.1, which works out to 10 handlers per queue.
 ** {{number of call queues}} = {{handlerCount}} * 
{{callQueuesHandlersFactor(0.1)}}  
([code|https://github.com/apache/hbase/blob/777010361abb203b8b17673d84acf4f7f1d0283a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcExecutor.java#L195])
 * {{hbase.ipc.server.max.callqueue.length}} => Max call queue length. Default is {{DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER(10) * {color:#ff0000}*total handlerCount*{color}}}
([code|https://github.com/apache/hbase/blob/777010361abb203b8b17673d84acf4f7f1d0283a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcScheduler.java#L71])
 ** {*}Issue{*}: We currently treat this value as the length of each individual queue, but it is calculated from the {_}total handler count{_}, not from the number of handlers assigned to that queue (see the sketch after this list).
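
To make the mismatch concrete, here is a minimal sketch of the arithmetic with the defaults above (30 handlers, factor 0.1, 10 calls per handler). The variable names are illustrative only, not the actual fields in {{RpcExecutor}} or {{SimpleRpcScheduler}}:

{code:java}
// Illustrative arithmetic only; variable names do not match the HBase source.
public class CallQueueLengthSketch {
  public static void main(String[] args) {
    int handlerCount = 30;       // hbase.regionserver.handler.count (default)
    float handlerFactor = 0.1f;  // hbase.ipc.server.callqueue.handler.factor (default)
    int perHandlerLength = 10;   // DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER

    int numCallQueues = Math.max(1, Math.round(handlerCount * handlerFactor)); // 3 queues
    int handlersPerQueue = handlerCount / numCallQueues;                       // 10 handlers per queue

    // Current default: derived from the TOTAL handler count, but applied to EACH queue.
    int maxLengthToday = perHandlerLength * handlerCount;                      // 10 * 30 = 300

    // A derivation based on the handlers that actually serve one queue would give:
    int maxLengthPerQueueHandlers = perHandlerLength * handlersPerQueue;       // 10 * 10 = 100

    System.out.println(numCallQueues + " queues, " + handlersPerQueue + " handlers each");
    System.out.println("default max length applied to each queue today: " + maxLengthToday);
    System.out.println("length derived from per-queue handlers would be: " + maxLengthPerQueueHandlers);
  }
}
{code}

With 100 handlers the gap grows with the handler count: 10 queues of 10 handlers each, but a default length of 1000 per queue instead of 100.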

_____________________________________________________________________________________________________________________

*Other concerns*

For the codel queue type, we have the config {{hbase.ipc.server.callqueue.codel.lifo.threshold}} (default 0.8). With 100 handlers, by default all 10 queues will have a size of 1000, and queue processing will not switch to LIFO until there are 800 calls waiting in a single queue. Don't you think that is too high?
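
The 800 figure follows directly from the defaults, assuming the LIFO switch compares queue occupancy against {{threshold * maxQueueLength}} as described above (again, the names below are illustrative):

{code:java}
// Back-of-the-envelope numbers for the CoDel LIFO switch with 100 handlers.
public class CodelLifoThresholdSketch {
  public static void main(String[] args) {
    int handlerCount = 100;      // hbase.regionserver.handler.count
    float handlerFactor = 0.1f;  // hbase.ipc.server.callqueue.handler.factor (default)
    int perHandlerLength = 10;   // DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER
    double lifoThreshold = 0.8;  // hbase.ipc.server.callqueue.codel.lifo.threshold (default)

    int numCallQueues = Math.max(1, Math.round(handlerCount * handlerFactor)); // 10 queues
    int maxQueueLength = perHandlerLength * handlerCount;                      // 10 * 100 = 1000

    // The queue only flips to LIFO once it is lifoThreshold full.
    long callsBeforeLifo = Math.round(maxQueueLength * lifoThreshold);         // 800 waiting calls

    System.out.println(numCallQueues + " queues of length " + maxQueueLength
        + "; LIFO behavior starts at " + callsBeforeLifo + " waiting calls in a queue");
  }
}
{code}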

 
 # I believe we may have unintentionally created a dependency between the 
*total number of handlers* and the call queue length. Should we reconsider this 
logic?
 # Regarding the hard limit ({{DEFAULT_CALL_QUEUE_SIZE_HARD_LIMIT}}): should we be enforcing a _minimum_ value for the queue size or a _maximum_ value for the queue size?

  was:
*TLDR* 

In HBase's SimpleRpcScheduler we have call queues and handlers, and each queue is assigned a portion of the handlers. The config {{hbase.ipc.server.max.callqueue.length}} controls the maximum length of a queue. If this config is not set, HBase computes a default max length for each queue using {{{}DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER{}}}. The issue is that although each queue is served by only a portion of the handlers, the default length is calculated from the count of all the handlers.

*Description using configs and code links*

_____________________________________________________________________________________________________________________

Regarding handlers and queues, HBase has these configs.
 * {{hbase.regionserver.handler.count}} => Number of handlers. Default is 30.
 * {{hbase.ipc.server.callqueue.handler.factor}} => Queue-to-handler ratio. Default is 0.1, which works out to 10 handlers per queue.
 ** {{number of call queues}} = {{handlerCount}} * 
{{callQueuesHandlersFactor(0.1)}}  
([code|https://github.com/apache/hbase/blob/777010361abb203b8b17673d84acf4f7f1d0283a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcExecutor.java#L195])
 * {{hbase.ipc.server.max.callqueue.length}} => Max call queue length. Default is {{DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER(10) * {color:#ff0000}*total handlerCount*{color}}}
([code|https://github.com/apache/hbase/blob/777010361abb203b8b17673d84acf4f7f1d0283a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcScheduler.java#L71])
 ** {*}Issue{*}: We currently treat this value as the length of each individual queue, but it is calculated from the {_}total handler count{_}, not from the number of handlers assigned to that queue.
 * In addition, we have a hard limit, {{{}DEFAULT_CALL_QUEUE_SIZE_HARD_LIMIT{}}} (250) ([code|https://github.com/apache/hbase/blob/777010361abb203b8b17673d84acf4f7f1d0283a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcExecutor.java#L228]), which is a hard limit on the *minimum* queue size: the queue size cannot be less than 250.

_____________________________________________________________________________________________________________________

*Other concerns*

For the codel queue type, we have the config {{hbase.ipc.server.callqueue.codel.lifo.threshold}} (default 0.8). With 100 handlers, by default all 10 queues will have a size of 1000, and queue processing will not switch to LIFO until there are 800 calls waiting in a single queue. Don't you think that is too high?

 
 # I believe we may have unintentionally created a dependency between the 
*total number of handlers* and the call queue length. Should we reconsider this 
logic?
 # Regarding the hard limit: should we be enforcing a _minimum_ value for the queue size or a _maximum_ value for the queue size?


> Default maxQueueLength used for call queues is too high
> -------------------------------------------------------
>
>                 Key: HBASE-29141
>                 URL: https://issues.apache.org/jira/browse/HBASE-29141
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 3.0.0-beta-1, 2.5.11, 2.6.2
>            Reporter: Umesh Kumar Kumawat
>            Assignee: Umesh Kumar Kumawat
>            Priority: Major
>              Labels: pull-request-available
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
