[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909682#comment-16909682
 ] 

He Xiaoqiao commented on HDFS-14090:
------------------------------------

Thanks [~crh] for your contributions and pings. This feature has been running 
in our test environment for a while, and it works well in most scenarios. I 
would like to suggest some features we need for 
{{StaticFairnessPolicyController}}. As [~xkrogen] said above, once we 
configure a constant permit allocation count, it cannot be tuned after Router 
startup without reconfiguring and restarting the Router process. However, the 
load on different namespaces changes all the time, so do we need an admin 
interface to change the permit allocation dynamically? Furthermore, I believe 
we should add a controller that allocates permits automatically based on each 
namespace's current load, perhaps named {{DynamicalFairnessPolicyController}} 
as a counterpart to {{StaticFairnessPolicyController}}.
+1 (non-binding) for [^HDFS-14090.010.patch] from my side. This is a very 
useful feature and, as far as I know, many people are waiting for this patch 
to be ready. In my opinion we should push this patch forward, then continue to 
extend {{FairnessPolicyController}} and offer more choices in the next phase. 
Pending feedback from others. Thanks again, [~crh].
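
To make the load-based allocation idea concrete, here is a minimal sketch of 
how permits might be split across nameservices in proportion to observed 
load. Everything in it is an assumption for illustration: the class name 
{{PermitAllocationSketch}}, the {{allocate}} method, the use of a plain 
per-nameservice load number, and the one-permit floor are hypothetical and do 
not come from the patch or from the RBF code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch only; none of these names come from the patch. */
final class PermitAllocationSketch {

  private PermitAllocationSketch() {
  }

  /**
   * Splits totalPermits across nameservices in proportion to their load.
   * Each nameservice keeps at least one permit so it is never starved.
   *
   * @param load assumed load metric per nameservice (e.g. recent call count)
   * @param totalPermits total permits to distribute
   */
  static Map<String, Integer> allocate(Map<String, Long> load,
      int totalPermits) {
    int n = load.size();
    if (n == 0 || totalPermits < n) {
      throw new IllegalArgumentException(
          "Need at least one permit per nameservice");
    }
    long totalLoad = 0;
    for (long l : load.values()) {
      totalLoad += Math.max(0L, l); // negative samples count as zero
    }
    int spare = totalPermits - n; // what is left after the one-permit floor
    Map<String, Integer> result = new LinkedHashMap<>();
    String busiest = null;
    long busiestLoad = -1L;
    int assigned = 0;
    for (Map.Entry<String, Long> e : load.entrySet()) {
      long l = Math.max(0L, e.getValue());
      // With no load anywhere, fall back to an even split.
      int share = totalLoad == 0
          ? spare / n
          : (int) ((long) spare * l / totalLoad);
      result.put(e.getKey(), 1 + share);
      assigned += 1 + share;
      if (l > busiestLoad) {
        busiest = e.getKey();
        busiestLoad = l;
      }
    }
    // Integer division can leave a remainder; hand it to the busiest one.
    result.merge(busiest, totalPermits - assigned, Integer::sum);
    return result;
  }
}
```

A controller of this kind would presumably recompute the split periodically 
from fresh load samples; how the load is measured and how often the split is 
refreshed are left open here.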

> RBF: Improved isolation for downstream name nodes.
> --------------------------------------------------
>
>                 Key: HDFS-14090
>                 URL: https://issues.apache.org/jira/browse/HDFS-14090
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: CR Hota
>            Assignee: CR Hota
>            Priority: Major
>         Attachments: HDFS-14090-HDFS-13891.001.patch, 
> HDFS-14090-HDFS-13891.002.patch, HDFS-14090-HDFS-13891.003.patch, 
> HDFS-14090-HDFS-13891.004.patch, HDFS-14090-HDFS-13891.005.patch, 
> HDFS-14090.006.patch, HDFS-14090.007.patch, HDFS-14090.008.patch, 
> HDFS-14090.009.patch, HDFS-14090.010.patch, RBF_ Isolation design.pdf
>
>
> Router is a gateway to underlying name nodes. Gateway architectures should 
> help minimize the impact of clients connecting to healthy clusters vs 
> unhealthy clusters.
> For example, if there are 2 name nodes downstream and one of them is 
> heavily loaded, with calls spiking rpc queue times, then due to back 
> pressure the same will start reflecting on the router. As a result, clients 
> connecting to healthy/faster name nodes will also slow down, as the same rpc 
> queue is maintained for all calls at the router layer. Essentially the same 
> IPC thread pool is used by the router to connect to all name nodes.
> Currently the router uses one single rpc queue for all calls. Let's discuss 
> how we can change the architecture and add some throttling logic for 
> unhealthy/slow/overloaded name nodes.
> One way could be to read from the current call queue, immediately identify 
> the downstream name node and maintain a separate queue for each underlying 
> name node. Another, simpler way is to maintain some sort of rate limiter 
> configured for each name node and let routers drop/reject/send errors for 
> requests after a certain threshold. 
> This won’t be a simple change, as the router’s ‘Server’ layer would need 
> redesign and implementation. Currently this layer is the same as name node.
> Opening this ticket to discuss, design and implement this feature.
>  
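
The per-name-node limiter described above can be sketched with one counting 
semaphore per nameservice: a call that cannot get a permit is rejected instead 
of queueing behind a slow name node. This is only an illustration under 
stated assumptions. The class {{NameservicePermitsSketch}} and its methods 
are hypothetical names, not the patch's API, and the {{resize}} method is an 
assumed hook for the kind of runtime tuning asked for in the comment. Only 
{{java.util.concurrent.Semaphore}} and its {{reducePermits}} method are real 
JDK API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch only; not the API of the patch. */
final class NameservicePermitsSketch {

  /** Exposes Semaphore's protected reducePermits so capacity can shrink. */
  private static final class ResizableSemaphore extends Semaphore {
    ResizableSemaphore(int permits) {
      super(permits);
    }

    void shrink(int reduction) {
      reducePermits(reduction);
    }
  }

  private final Map<String, ResizableSemaphore> permits =
      new ConcurrentHashMap<>();
  private final Map<String, Integer> capacity = new ConcurrentHashMap<>();

  NameservicePermitsSketch(Map<String, Integer> initial) {
    initial.forEach((ns, count) -> {
      permits.put(ns, new ResizableSemaphore(count));
      capacity.put(ns, count);
    });
  }

  /** Returns false (caller should reject the call) if no permit is free. */
  boolean tryAcquire(String ns) {
    Semaphore s = permits.get(ns);
    return s != null && s.tryAcquire();
  }

  /** Must be called once for every successful tryAcquire. */
  void release(String ns) {
    Semaphore s = permits.get(ns);
    if (s != null) {
      s.release();
    }
  }

  /** Changes a nameservice's capacity without restarting anything. */
  synchronized void resize(String ns, int newCapacity) {
    ResizableSemaphore s = permits.get(ns);
    if (s == null || newCapacity < 0) {
      throw new IllegalArgumentException("Bad resize for " + ns);
    }
    int delta = newCapacity - capacity.get(ns);
    if (delta > 0) {
      s.release(delta); // growing: add free permits
    } else if (delta < 0) {
      s.shrink(-delta); // shrinking: in-flight calls keep their permits
    }
    capacity.put(ns, newCapacity);
  }
}
```

Because each nameservice has its own semaphore, exhausting the permits of an 
overloaded name node does not affect calls to the others. When capacity is 
reduced below the number of calls in flight, the available count goes 
negative and new calls are rejected until enough permits are released.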



