[ https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381446#comment-14381446 ]
ramkrishna.s.vasudevan commented on HBASE-12790:
------------------------------------------------

bq. So is 'grouping' a pure phoenix construct or is 'grouping' just a client identifier?
I would say it is a client identifier. It may not be Phoenix specific.

bq. Or does a single phoenix client run multiple groups?
Single Phoenix client - I am not getting your point here. When you say client, do you mean every Phoenix client will generate a fixed group Id?
Every query can be split into parallel scans, and each query can set a unique Scan Id. I think Phoenix does it with a random ID. (Correct me if I am wrong here [~giacomotaylor].)

bq. I ask because this scheduling of rpc in the server is a hot spot when I profile
We could run a profile to see how much it costs. I think the Phoenix team would already have some profiling information on this area.


> Support fairness across parallelized scans
> ------------------------------------------
>
>                 Key: HBASE-12790
>                 URL: https://issues.apache.org/jira/browse/HBASE-12790
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: ramkrishna.s.vasudevan
>              Labels: Phoenix
>         Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch,
> HBASE-12790_1.patch
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in
> getting back results. This can lead to starvation with a loaded cluster and
> interleaved scans, since the RPC queue will be ordered and processed on a
> FIFO basis. For example, suppose there are two clients, A and B, that submit
> largish scans at the same time. Say each scan is broken down into 100 scans
> by the client (broken down into equal depth chunks along the row key), and
> the 100 scans of client A are queued first, followed immediately by the 100
> scans of client B. In this case, client B will be starved out of getting any
> results back until the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead
> of the standard FIFO queue. The queue to be used could be (maybe it already
> is) configurable based on a new config parameter. Using this queue would
> require the client to have the same identifier for all of the 100 parallel
> scans that represent a single logical scan from the client's point of view.
> With this information, the round robin queue would pick off a task from the
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent
> starvation over interleaved parallelized scans.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
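
For illustration, the round robin idea in the description can be sketched roughly as below. This is not the attached AbstractRoundRobinQueue and not HBase code: the class name RoundRobinSketch and its offer/poll/demo methods are hypothetical, the group id simply stands in for the shared scan identifier, and the sketch ignores thread safety and blocking, which a real RPC call queue would need.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch of a round-robin queue keyed by a client-supplied group id. */
class RoundRobinSketch<G, T> {
    // One FIFO sub-queue per group; map iteration order is the rotation order.
    private final LinkedHashMap<G, Queue<T>> groups = new LinkedHashMap<>();

    /** Appends a task to the sub-queue of its group (new groups join at the tail). */
    void offer(G groupId, T task) {
        groups.computeIfAbsent(groupId, g -> new ArrayDeque<>()).add(task);
    }

    /** Takes one task from the group at the head of the rotation, or null if empty. */
    T poll() {
        Iterator<Map.Entry<G, Queue<T>>> it = groups.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<G, Queue<T>> head = it.next();
        G groupId = head.getKey();
        Queue<T> tasks = head.getValue();
        it.remove();                      // take the group out of the rotation
        T task = tasks.poll();
        if (!tasks.isEmpty()) {
            groups.put(groupId, tasks);   // re-insert at the tail if work remains
        }
        return task;
    }

    /** Queues three tasks for group A, then three for group B, and drains the queue. */
    static List<String> demo() {
        RoundRobinSketch<String, String> q = new RoundRobinSketch<>();
        for (int i = 0; i < 3; i++) q.offer("A", "A" + i);
        for (int i = 0; i < 3; i++) q.offer("B", "B" + i);
        List<String> order = new ArrayList<>();
        for (String t = q.poll(); t != null; t = q.poll()) {
            order.add(t);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a plain FIFO queue the drain order in demo() would be all of A's tasks followed by all of B's; here the two groups alternate even though A's tasks were queued first, which is the fairness property the issue asks for.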