[ https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230612#comment-17230612 ]
Yiqun Lin edited comment on HDFS-14090 at 11/12/20, 2:38 PM:
-------------------------------------------------------------

Hi [~fengnanli], three nits for the latest patch:

1. It would be good to rename dfs.federation.router.fairness.handler.count.NS to dfs.federation.router.fairness.handler.count.EXAMPLENAMESERVICE.

2.
{noformat}
smaller or equal to the total number of router handlers; if the special *concurrent* is not specified, the sum of all configured values must be strictly smaller than the router handlers thus the left will be allocated to the concurrent calls.
{noformat}
Can we mention the related setting here: ''strictly smaller than the router handlers (dfs.federation.router.handler.count)...

3. Can you fix the related failing unit test?
|hadoop.hdfs.server.federation.router.TestRBFConfigFields|

Others look good to me.


was (Author: linyiqun):
Hi [~fengnanli], two nits for the latest patch:

{noformat}
smaller or equal to the total number of router handlers; if the special *concurrent* is not specified, the sum of all configured values must be strictly smaller than the router handlers thus the left will be allocated to the concurrent calls.
{noformat}
Can we mention the related setting here: ''strictly smaller than the router handlers (dfs.federation.router.handler.count)...

Can you fix the related failing unit test?
|hadoop.hdfs.server.federation.router.TestRBFConfigFields|

Others look good to me.

> RBF: Improved isolation for downstream name nodes. {Static}
> -----------------------------------------------------------
>
>                 Key: HDFS-14090
>                 URL: https://issues.apache.org/jira/browse/HDFS-14090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: CR Hota
>            Assignee: Fengnan Li
>            Priority: Major
>         Attachments: HDFS-14090-HDFS-13891.001.patch,
> HDFS-14090-HDFS-13891.002.patch, HDFS-14090-HDFS-13891.003.patch,
> HDFS-14090-HDFS-13891.004.patch, HDFS-14090-HDFS-13891.005.patch,
> HDFS-14090.006.patch, HDFS-14090.007.patch, HDFS-14090.008.patch,
> HDFS-14090.009.patch, HDFS-14090.010.patch, HDFS-14090.011.patch,
> HDFS-14090.012.patch, HDFS-14090.013.patch, HDFS-14090.014.patch,
> HDFS-14090.015.patch, HDFS-14090.016.patch, HDFS-14090.017.patch,
> HDFS-14090.018.patch, HDFS-14090.019.patch, HDFS-14090.020.patch,
> HDFS-14090.021.patch, HDFS-14090.022.patch, HDFS-14090.023.patch,
> HDFS-14090.024.patch, RBF_ Isolation design.pdf
>
>
> Router is a gateway to underlying name nodes. Gateway architectures should
> help minimize the impact of unhealthy clusters on clients connecting to
> healthy clusters.
> For example - if there are 2 name nodes downstream, and one of them is
> heavily loaded with calls spiking RPC queue times, due to back pressure the
> same will start reflecting on the router. As a result of this, clients
> connecting to healthy/faster name nodes will also slow down, as the same RPC
> queue is maintained for all calls at the router layer. Essentially the same
> IPC thread pool is used by the router to connect to all name nodes.
> Currently the router uses one single RPC queue for all calls. Let's discuss
> how we can change the architecture and add some throttling logic for
> unhealthy/slow/overloaded name nodes.
> One way could be to read from the current call queue, immediately identify
> the downstream name node and maintain a separate queue for each underlying
> name node.
> Another simpler way is to maintain some sort of rate limiter configured
> for each name node and let routers drop/reject/send error requests after
> a certain threshold.
> This won’t be a simple change as the router’s ‘Server’ layer would need
> redesign and implementation. Currently this layer is the same as the name
> node's.
> Opening this ticket to discuss, design and implement this feature.
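
For readers following the review, here is a minimal Java sketch of the handler-count rule quoted in the {noformat} text above: each nameservice gets a fixed number of permits, the configured sum may not exceed the total number of router handlers, and when no *concurrent* value is configured the sum must be strictly smaller so that the remainder goes to concurrent calls. This is not the code from the HDFS-14090 patches. The class name StaticHandlerAllocation, its methods, and the use of java.util.concurrent.Semaphore are assumptions made for illustration. Only the rule itself comes from the quoted text, and because that text begins mid-sentence, the "sum <= total" reading of its first clause is also an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch of static per-nameservice handler permits. */
final class StaticHandlerAllocation {
  /** Key assumed here for calls that fan out to several nameservices. */
  static final String CONCURRENT = "concurrent";

  private final Map<String, Semaphore> permits = new LinkedHashMap<>();

  /**
   * @param totalHandlers value of dfs.federation.router.handler.count
   * @param configured per-nameservice counts, optionally including "concurrent"
   */
  StaticHandlerAllocation(int totalHandlers, Map<String, Integer> configured) {
    int sum = 0;
    for (Map.Entry<String, Integer> e : configured.entrySet()) {
      if (e.getValue() <= 0) {
        throw new IllegalArgumentException("Non-positive count for " + e.getKey());
      }
      sum += e.getValue();
    }
    boolean hasConcurrent = configured.containsKey(CONCURRENT);
    // With an explicit "concurrent" entry the sum may equal the total;
    // without one it must be strictly smaller so something is left over.
    boolean invalid = hasConcurrent ? sum > totalHandlers : sum >= totalHandlers;
    if (invalid) {
      throw new IllegalArgumentException(
          "Configured handlers (" + sum + ") do not fit in total " + totalHandlers);
    }
    for (Map.Entry<String, Integer> e : configured.entrySet()) {
      permits.put(e.getKey(), new Semaphore(e.getValue()));
    }
    if (!hasConcurrent) {
      // The remainder is allocated to concurrent calls.
      permits.put(CONCURRENT, new Semaphore(totalHandlers - sum));
    }
  }

  /** Returns false instead of blocking when the nameservice has no free permit. */
  boolean tryAcquire(String nameservice) {
    Semaphore s = permits.get(nameservice);
    return s != null && s.tryAcquire();
  }

  /** Returns a permit after the downstream call completes. */
  void release(String nameservice) {
    Semaphore s = permits.get(nameservice);
    if (s != null) {
      s.release();
    }
  }

  /** Free permits for a nameservice, or 0 if it is unknown. */
  int available(String nameservice) {
    Semaphore s = permits.get(nameservice);
    return s == null ? 0 : s.availablePermits();
  }
}
```

In this sketch a caller would invoke tryAcquire before forwarding an RPC and reject the call (or send an error) when it returns false, so a slow nameservice can exhaust only its own permits and not the whole handler pool.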