[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952001#comment-16952001 ]
David Mollitor edited comment on HDFS-14854 at 10/15/19 3:04 PM:
-----------------------------------------------------------------

{code:java}
private void processPendingNodes() {
  while (!pendingNodes.isEmpty() &&
      (maxConcurrentTrackedNodes == 0 ||
          outOfServiceNodeBlocks.size() < maxConcurrentTrackedNodes)) {
    outOfServiceNodeBlocks.put(pendingNodes.poll(), null);
  }
}
{code}

This method is accessed by the local running Thread. However, {{pendingNodes}} does not appear to be a thread-safe Collection. Perhaps the collection cannot be modified because of the external locking of the {{writeLock}} but there is no requirement to have the lock stated in the {{startTrackingNode}} method javadoc.


was (Author: belugabehr):

{code:java}
private void processPendingNodes() {
  while (!pendingNodes.isEmpty() &&
      (maxConcurrentTrackedNodes == 0 ||
          outOfServiceNodeBlocks.size() < maxConcurrentTrackedNodes)) {
    outOfServiceNodeBlocks.put(pendingNodes.poll(), null);
  }
}
{code}

This method is accessed by the local running Thread. However, {{pendingNodes}} does not appear to be a thread-safe class. Perhaps the collection cannot be modified because of the external locking of the {{writeLock}} but there is no requirement to have the lock stated in the {{startTrackingNode}} method.

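
The concern in the comment above can be made concrete with a small sketch. It is not the code from the patch: the class name {{PendingNodesSketch}}, the {{String}} node type, the {{ArrayDeque}}/{{HashMap}} choices and the helper {{checkWriteLockHeld}} are all assumptions made for illustration. The idea is that if {{pendingNodes}} relies on an external write lock, the javadoc can say so and the method can verify it, so a caller that forgets the lock fails fast instead of corrupting the queue.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Hypothetical stand-in for the monitor. Only pendingNodes,
 * outOfServiceNodeBlocks, maxConcurrentTrackedNodes, startTrackingNode and
 * processPendingNodes come from the quoted snippet; everything else is assumed.
 */
class PendingNodesSketch {
  // Assumed stand-in for the externally held namenode write lock.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Neither collection is thread-safe; both rely on the write lock.
  private final Queue<String> pendingNodes = new ArrayDeque<>();
  private final Map<String, Object> outOfServiceNodeBlocks = new HashMap<>();
  private final int maxConcurrentTrackedNodes;

  PendingNodesSketch(int maxConcurrentTrackedNodes) {
    this.maxConcurrentTrackedNodes = maxConcurrentTrackedNodes;
  }

  /**
   * Queues a node for tracking.
   * The caller must hold the write lock, because pendingNodes is not
   * a thread-safe collection.
   */
  void startTrackingNode(String node) {
    checkWriteLockHeld();
    pendingNodes.add(node);
  }

  /** Moves pending nodes into the tracked map, up to the configured limit. */
  void processPendingNodes() {
    checkWriteLockHeld();
    while (!pendingNodes.isEmpty()
        && (maxConcurrentTrackedNodes == 0
            || outOfServiceNodeBlocks.size() < maxConcurrentTrackedNodes)) {
      outOfServiceNodeBlocks.put(pendingNodes.poll(), null);
    }
  }

  // Fails fast when the documented locking requirement is violated.
  private void checkWriteLockHeld() {
    if (!lock.isWriteLockedByCurrentThread()) {
      throw new IllegalStateException("write lock must be held");
    }
  }

  void writeLock() { lock.writeLock().lock(); }

  void writeUnlock() { lock.writeLock().unlock(); }

  int trackedCount() { return outOfServiceNodeBlocks.size(); }

  int pendingCount() { return pendingNodes.size(); }
}
```

An alternative under the same assumptions would be to make {{pendingNodes}} a {{java.util.concurrent.ConcurrentLinkedQueue}}, which removes the dependency on the external lock for the queue itself, though not for {{outOfServiceNodeBlocks}}.
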
> Create improved decommission monitor implementation
> ---------------------------------------------------
>
>                 Key: HDFS-14854
>                 URL: https://issues.apache.org/jira/browse/HDFS-14854
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch,
> HDFS-14854.002.patch, HDFS-14854.003.patch, HDFS-14854.004.patch,
> HDFS-14854.005.patch, HDFS-14854.006.patch, HDFS-14854.007.patch,
> HDFS-14854.008.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current
> decommission monitor implementation, such as:
> * Blocks are replicated sequentially disk by disk and node by node, and
> hence the load is not spread well across the cluster.
> * Adding a node for decommission can cause the namenode write lock to be
> held for a long time.
> * Decommissioning nodes floods the replication queue, and under-replicated
> blocks from a future node or disk failure may wait for a long time before
> they are replicated.
> * Blocks pending replication are checked many times under a write lock
> before they are sufficiently replicated, wasting resources.
> In this Jira I propose to create a new implementation of the decommission
> monitor that resolves these issues. As it will be difficult to prove one
> implementation is better than another, the new implementation can be enabled
> or disabled, giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1
> patch shortly.