[ https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369782#comment-14369782 ]
ramkrishna.s.vasudevan commented on HBASE-12790:
------------------------------------------------

Based on [~giacomotaylor]'s AbstractRoundRobinQueue implementation, I tried out a patch that helps to establish fairness. Thanks to James for helping on this.

Currently the RPC scheduler is FIFO-based when there is no deadline. As the description of the JIRA suggests, we would like round-robin behaviour so that scans submitted in parallel need not wait for one another to complete before they get scheduled. To establish this fairness across scan queries, we would allow a scan to set a grouping ID (a string).

A new scheduling policy has to be used if we want the scheduler to work based on grouping (this is controlled by a configuration). Here again we have two cases: CALL_QUEUE_TYPE_FIFO_CONF_VALUE and CALL_QUEUE_TYPE_DEADLINE_CONF_VALUE.

For CALL_QUEUE_TYPE_FIFO_CONF_VALUE we will use the AbstractRoundRobinQueue, which groups the calls by the groupId set in the scan query. The groups are served in a round-robin manner, and the calls within each group are scheduled in FIFO order. The AbstractRoundRobinQueue is the one that Phoenix currently uses to schedule its parallel scans.

For CALL_QUEUE_TYPE_DEADLINE_CONF_VALUE we have AbstractRoundRobinPriorityQueue, which sorts the calls by priority (here, the deadline) and then groups them by the groupId set in the scan query. So the round robin happens across the groups within the same priority.

The configuration that enables the scheduler to work based on this grouping is 'hbase.ipc.server.callqueue.grouping'. Set it to true to turn the feature on; it is false by default.

Just attaching a patch for reference. Ideas and suggestions are welcome.

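As a rough illustration of the FIFO case described above, grouping calls by groupId and rotating across the groups could look like the sketch below. This is not the attached patch or the Phoenix class: the class name RoundRobinSketch and its methods are hypothetical, and the sketch is single-threaded, whereas a real RPC call queue would need to be a thread-safe BlockingQueue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, single-threaded illustration of round robin across groups. */
public class RoundRobinSketch<T> {
    // Calls waiting in each group, kept in FIFO order within the group.
    private final Map<String, ArrayDeque<T>> callsByGroup = new HashMap<>();
    // Groups that currently have pending calls, in the order they will be served.
    private final ArrayDeque<String> rotation = new ArrayDeque<>();

    /** Queues a call under the given group id (assumed to be set by the client). */
    public void offer(String groupId, T call) {
        ArrayDeque<T> calls = callsByGroup.get(groupId);
        if (calls == null) {
            calls = new ArrayDeque<>();
            callsByGroup.put(groupId, calls);
            rotation.addLast(groupId); // a new group joins the end of the rotation
        }
        calls.addLast(call);
    }

    /** Returns the next call, taking one call per group in turn; null if empty. */
    public T poll() {
        String groupId = rotation.pollFirst();
        if (groupId == null) {
            return null;
        }
        ArrayDeque<T> calls = callsByGroup.get(groupId);
        T call = calls.pollFirst();
        if (calls.isEmpty()) {
            callsByGroup.remove(groupId); // drained group leaves the rotation
        } else {
            rotation.addLast(groupId);    // otherwise it goes to the back
        }
        return call;
    }

    /** Client A queues three scans, then client B queues three scans. */
    public static List<String> demo() {
        RoundRobinSketch<String> queue = new RoundRobinSketch<>();
        for (int i = 1; i <= 3; i++) {
            queue.offer("A", "A" + i);
        }
        for (int i = 1; i <= 3; i++) {
            queue.offer("B", "B" + i);
        }
        List<String> order = new ArrayList<>();
        String call;
        while ((call = queue.poll()) != null) {
            order.add(call);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A1, B1, A2, B2, A3, B3]
    }
}
```

With a plain FIFO queue the same six calls would come out as A1, A2, A3, B1, B2, B3, which is the starvation of client B that the issue describes.
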
> Support fairness across parallelized scans
> ------------------------------------------
>
>                 Key: HBASE-12790
>                 URL: https://issues.apache.org/jira/browse/HBASE-12790
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: James Taylor
>              Labels: Phoenix
>         Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in
> getting back results. This can lead to starvation with a loaded cluster and
> interleaved scans, since the RPC queue will be ordered and processed on a
> FIFO basis. For example, if there are two clients, A & B that submit largish
> scans at the same time. Say each scan is broken down into 100 scans by the
> client (broken down into equal depth chunks along the row key), and the 100
> scans of client A are queued first, followed immediately by the 100 scans of
> client B. In this case, client B will be starved out of getting any results
> back until the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead
> of the standard FIFO queue. The queue to be used could be (maybe it already
> is) configurable based on a new config parameter. Using this queue would
> require the client to have the same identifier for all of the 100 parallel
> scans that represent a single logical scan from the clients point of view.
> With this information, the round robin queue would pick off a task from the
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent
> starvation over interleaved parallelized scans.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)