[ https://issues.apache.org/jira/browse/HDFS-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099434#comment-17099434 ]
Wei-Chiu Chuang commented on HDFS-13183:
----------------------------------------

I am really sorry, I meant to review this but got distracted.

I would like to push this feature to the finish line, because CRFS is a big feature and will take time to stabilize. Plus, it requires an additional Observer NameNode, and the logistics of adding an extra master NameNode add complexity.

A few comments on the patch:
* Does it work in a federated cluster? IIRC you have a large federated cluster, so I am assuming the answer is yes, but does it work out of the box or does it require extra configuration? (Sorry, I don't have much experience with HDFS federation.)
* It looks like the balancer determines which NN is the SbNN at start, and then uses it until the end. There are two issues:
** Failover. If a failover happens, the balancer can't adapt and will then send the requests to the ANN. That is fine, as it shouldn't fail the balancer, but it adds load on the new ANN.
** Multiple standby NameNode support. The balancer always chooses the first available standby NameNode. This is fine, since in any case there can be only one balancer running at a time.

Also, just want to say that you don't actually need to mark FSNamesystem#getBlocks() as UNCHECKED. If dfs.ha.allow.stale.reads is true, the Standby NN accepts the request as well. That is an extra configuration, though, so probably not ideal.
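To make the failover point concrete, here is a rough sketch of one way the target could be re-resolved on every request instead of being fixed at start. This is not the patch's code and not the Balancer's actual API: {{GetBlocksTargetSelector}}, {{NameNode}}, {{HaState}} and {{pick()}} are all hypothetical names, and the HA state is modeled as a plain field rather than queried over RPC.

```java
import java.util.List;

/** Hypothetical sketch only; these are not Hadoop classes. */
final class GetBlocksTargetSelector {

    /** Simplified HA state; a real client would query it from the NameNode. */
    enum HaState { ACTIVE, STANDBY }

    /** Hypothetical view of one NameNode in the nameservice. */
    static final class NameNode {
        final String id;
        HaState state; // mutable so a failover can be simulated

        NameNode(String id, HaState state) {
            this.id = id;
            this.state = state;
        }
    }

    private final List<NameNode> nameNodes;

    GetBlocksTargetSelector(List<NameNode> nameNodes) {
        this.nameNodes = nameNodes;
    }

    /**
     * Re-evaluated on every call, so the choice follows a failover:
     * the first NameNode currently in STANDBY, otherwise the ACTIVE one.
     */
    NameNode pick() {
        for (NameNode nn : nameNodes) {
            if (nn.state == HaState.STANDBY) {
                return nn;
            }
        }
        for (NameNode nn : nameNodes) {
            if (nn.state == HaState.ACTIVE) {
                return nn;
            }
        }
        throw new IllegalStateException("no NameNode available");
    }
}
```

With nn1 active and nn2, nn3 standby, {{pick()}} returns nn2. If nn1 and nn2 then swap roles, the next call returns nn1, so the new ANN is not hit. The cost is a state check per request, which a real implementation would probably want to cache or refresh only on error.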
> Standby NameNode process getBlocks request to reduce Active load
> ----------------------------------------------------------------
>
>                 Key: HDFS-13183
>                 URL: https://issues.apache.org/jira/browse/HDFS-13183
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover, namenode
>            Reporter: Xiaoqiao He
>            Assignee: Xiaoqiao He
>            Priority: Major
>         Attachments: HDFS-13183-trunk.001.patch, HDFS-13183-trunk.002.patch,
> HDFS-13183-trunk.003.patch, HDFS-13183.004.patch, HDFS-13183.005.patch
>
>
> The performance of Active NameNode could be impact when {{Balancer}} requests
> #getBlocks, since query blocks of overly full DNs performance is extremely
> inefficient currently. The main reason is {{NameNodeRpcServer#getBlocks}}
> hold read lock for long time. In extreme case, all handlers of Active
> NameNode RPC server are occupied by one reader
> {{NameNodeRpcServer#getBlocks}} and other write operation calls, thus Active
> NameNode enter a state of false death for number of seconds even for minutes.
> The similar performance concerns of Balancer have reported by HDFS-9412,
> HDFS-7967, etc.
> If Standby NameNode can shoulder #getBlocks heavy burden, it could speed up
> the progress of balancing and reduce performance impact to Active NameNode.