CR Hota created HDFS-14090:
------------------------------

             Summary: RBF: Improved isolation for downstream name nodes.
                 Key: HDFS-14090
                 URL: https://issues.apache.org/jira/browse/HDFS-14090
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: CR Hota
            Assignee: CR Hota


Router is a gateway to underlying name nodes. Gateway architectures, should 
help minimize impact of clients connecting to healthy clusters vs unhealthy 
clusters.

For example - If there are 2 name nodes downstream, and one of them is heavily 
loaded with calls spiking rpc queue times, due to back pressure the same with 
start reflecting on the router. As a result of this, clients connecting to 
healthy/faster name nodes will also slow down as same rpc queue is maintained 
for all calls at the router layer. Essentially the same IPC thread pool is used 
by router to connect to all name nodes.

Currently router uses one single rpc queue for all calls. Lets discuss how we 
can change the architecture and add some throttling logic for 
unhealthy/slow/overloaded name nodes.

One way could be to read from current call queue, immediately identify 
downstream name node and maintain a separate queue for each underlying name 
node. Another simpler way is to maintain some sort of rate limiter configured 
for each name node and let routers drop/reject/send error requests after 
certain threshold. 

This won’t be a simple change as router’s ‘Server’ layer would need redesign 
and implementation. Currently this layer is the same as name node.

Opening this ticket to discuss, design and implement this feature.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to