[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei Yan updated HADOOP-15016:
-----------------------------
    Summary: Cost-Based RPC FairCallQueue with Reservation support  (was: Add reservation support to RPC FairCallQueue)

> Cost-Based RPC FairCallQueue with Reservation support
> -----------------------------------------------------
>
>                 Key: HADOOP-15016
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15016
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>         Attachments: Adding reservation support to NameNode RPC resource.pdf,
> Adding reservation support to NameNode RPC resource_v2.pdf,
> HADOOP-15016_poc.patch
>
>
> FairCallQueue was introduced to provide RPC resource fairness among different
> users. In the current implementation, each user is weighted equally, and the
> processing priority of an RPC call is based on how many requests that user
> has sent before. This works well when the cluster is shared among several
> end-users.
> However, this has some limitations when a cluster is shared among both
> end-users and service jobs, such as ETL jobs that run under a service account
> and need to issue lots of RPC calls. When the NameNode becomes busy, these
> jobs can easily be backed off and deprioritized. We cannot simply treat this
> type of job as a "bad" user who randomly issues too many calls, as their
> calls are normal calls. Also, it is unfair to weight an end-user and a heavy
> service user equally when allocating RPC resources.
> One idea here is to introduce reservation support for RPC resources. That is,
> for some services, we reserve some RPC resources for their calls. This idea
> is very similar to how YARN manages CPU/memory resources among different
> resource queues. A few more details: along with the existing FairCallQueue
> setup (like using 4 queues with different priorities), we would add some
> additional special queues, one for each special service user.
> For each special service user, we provide a guaranteed RPC share (like 10%,
> which can be aligned with its YARN resource share), and this percentage can
> be converted to a weight used in WeightedRoundRobinMultiplexer. A quick
> example: we have 4 default queues with default weights (8, 4, 2, 1), and two
> special service users (user1 with a 10% share, and user2 with a 15% share).
> The default weights sum to 15 and the default queues keep the remaining 75%
> share, so we end up with 6 queues: 4 default queues (with weights 8, 4, 2, 1)
> and 2 special queues (user1Queue weighted 15*10%/75%=2, and user2Queue
> weighted 15*15%/75%=3).
> Incoming RPC calls from special service users are put directly into the
> corresponding reserved queue; all other calls follow the current
> implementation.
> By default, there is no special user and all RPC requests follow the existing
> FairCallQueue implementation.
> I would like to hear more comments on this approach, and to know whether
> there are better solutions. I will post a detailed design once I get some
> early comments.
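
The share-to-weight conversion and the queue routing described in the issue can be sketched as follows. This is only an illustration of the arithmetic, not code from HADOOP-15016_poc.patch: the function names reserved_queue_weights and choose_queue, the "reserved:"/"default:" labels, and the use of integer percentages are all assumptions made for the example. It also returns exact fractions, since the proposal does not say how a non-integer weight would be rounded.

```python
from fractions import Fraction


def reserved_queue_weights(default_weights, reserved_shares_percent):
    """Turn each reserved share (integer percent) into a multiplexer weight.

    The default queues keep their weights and jointly own the unreserved
    share, so: weight = sum(default_weights) * share / (100 - total_reserved).
    """
    total_reserved = sum(reserved_shares_percent.values())
    if not 0 <= total_reserved < 100:
        raise ValueError("reserved shares must total less than 100%")
    default_total = sum(default_weights)
    unreserved = 100 - total_reserved
    return {
        user: Fraction(default_total * share, unreserved)
        for user, share in reserved_shares_percent.items()
    }


def choose_queue(user, reserved_users, default_priority):
    """Route reserved users to their own queue (hypothetical labels).

    Any other user keeps the priority level that the existing scheduler
    assigned, which is simply passed in here.
    """
    if user in reserved_users:
        return f"reserved:{user}"
    return f"default:{default_priority}"


default_weights = [8, 4, 2, 1]
reserved = {"user1": 10, "user2": 15}  # guaranteed shares in percent
weights = reserved_queue_weights(default_weights, reserved)
total = sum(default_weights) + sum(weights.values())

for user, weight in weights.items():
    # Print the weight and the fraction of the total weight it represents.
    print(user, weight, f"{float(weight / total):.0%}")
```

With the numbers from the issue, this reproduces weights 2 and 3 on a total weight of 20, which gives user1 and user2 their 10% and 15% shares.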