[ https://issues.apache.org/jira/browse/HDFS-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell reassigned HDFS-14626:
----------------------------------------

    Assignee:     (was: Stephen O'Donnell)

Decommission all nodes hosting last block of open file succeeds unexpectedly
-----------------------------------------------------------------------------

                Key: HDFS-14626
                URL: https://issues.apache.org/jira/browse/HDFS-14626
            Project: Hadoop HDFS
         Issue Type: Bug
   Affects Versions: 3.3.0
           Reporter: Stephen O'Donnell
           Priority: Major
        Attachments: test-to-reproduce.patch

I have been investigating scenarios that cause decommission to hang, especially around one long-standing issue: an open block on a host which is being decommissioned can cause the process to never complete. Checking the history, there seems to have been at least one change in HDFS-5579 which greatly improved the situation, but from reading comments and support cases, there still seem to be some scenarios where open blocks on a DN host cause the decommission to get stuck.

No matter what I try, I have not been able to reproduce this, but I think I have uncovered another issue that may partly explain why.

If I do the following, the nodes will decommission without any issues:

1. Create a file and write to it so it crosses a block boundary. Then there is one complete block and one under-construction block. Keep the file open, and write a few bytes periodically.
2. Now note the nodes on which the UC block is currently being written, and decommission them all.
3. The decommission should succeed.
4. Now attempt to close the open file, and it will fail to close with an error like below, probably because decommissioned nodes are not allowed to send IBRs:

{code:java}
java.io.IOException: Unable to close file because the last block BP-646926902-192.168.0.20-1562099323291:blk_1073741827_1003 does not have enough number of replicas.
	at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:968)
	at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:911)
	at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:894)
	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:849)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
{code}

Interestingly, if you recommission the nodes without restarting them before closing the file, it will close OK, and writes to it can continue even once decommission has completed.

I don't think this is expected; that is, decommission should not complete on all nodes hosting the last UC block of a file.

From what I have figured out, I don't think UC blocks are considered in the DatanodeAdminManager at all. This is because the original list of blocks it cares about is taken from the datanode block iterator, which gets them from the DatanodeStorageInfo objects attached to the datanode instance. I believe UC blocks don't make it into the DatanodeStorageInfo until after they have been completed and an IBR sent, so the decommission logic never considers them.

What troubles me about this explanation is how open files previously caused decommission to get stuck if the logic never checks for them, so I suspect I am missing something.

I will attach a patch with a test case that demonstrates this issue.
This reproduces on trunk and I also tested on CDH 5.8.1, which is based on the 2.6 branch, but with a lot of backports.
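
For illustration only (this is not the attached test-to-reproduce.patch), here is a rough sketch of the steps above as a MiniDFSCluster-based program. The class name, the small block size, the min-block-size override, the 30-second wait and the decommissionNode() helper are all assumptions; the helper is a hypothetical stand-in for adding the node to dfs.hosts.exclude and running refreshNodes.

{code:java}
// Rough sketch of the reproduction steps above; not the attached patch.
// decommissionNode() is a hypothetical placeholder for the usual
// dfs.hosts.exclude + "dfsadmin -refreshNodes" mechanism.
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class OpenFileDecommissionRepro {

  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    conf.setLong("dfs.blocksize", 64 * 1024);                    // tiny blocks so the file crosses a boundary quickly
    conf.setLong("dfs.namenode.fs-limits.min-block-size", 0);    // allow the tiny block size

    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
    try {
      cluster.waitActive();
      DistributedFileSystem fs = cluster.getFileSystem();

      // 1. Write past one block boundary and keep the stream open, so the file
      //    has one COMPLETE block and one UNDER_CONSTRUCTION block.
      Path file = new Path("/open-file");
      FSDataOutputStream out = fs.create(file, (short) 1);       // single replica => one node hosts the UC block
      byte[] chunk = new byte[16 * 1024];
      for (int i = 0; i < 5; i++) {                              // 5 * 16k = 80k, past the 64k boundary
        out.write(chunk);
        out.hflush();
      }

      // 2. Note the node(s) the last (UC) block is being written to and decommission them.
      BlockLocation[] locs = fs.getFileBlockLocations(file, 0, 80 * 1024);
      for (String name : locs[locs.length - 1].getNames()) {
        decommissionNode(cluster, name);                         // hypothetical helper, see lead-in
      }

      // 3. Observed behaviour: decommission completes even though the last block
      //    of the open file lives only on the node(s) just decommissioned.
      TimeUnit.SECONDS.sleep(30);                                // crude wait for DECOMMISSIONED state

      // 4. Closing the file now fails with "Unable to close file because the
      //    last block ... does not have enough number of replicas".
      out.close();
    } finally {
      cluster.shutdown();
    }
  }

  // Placeholder: a real test would add the node to dfs.hosts.exclude, run
  // "dfsadmin -refreshNodes" and wait for the node to reach DECOMMISSIONED.
  private static void decommissionNode(MiniDFSCluster cluster, String datanodeName) {
    throw new UnsupportedOperationException("decommission helper not shown");
  }
}
{code}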