[ 
https://issues.apache.org/jira/browse/HDFS-11486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11486:
-----------------------------------
    Attachment: HDFS-11486.001.patch

Good idea [~iwasakims].
Reopened this jira and attached a new patch that includes both my test and 
[~linyiqun]'s test.

I verified that both tests pass after HDFS-11499 and both fail if HDFS-11499 
is reverted. I updated Yiqun's test slightly to reuse existing test APIs and 
objects.

> Client close() should not fail fast if the last block is being decommissioned
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-11486
>                 URL: https://issues.apache.org/jira/browse/HDFS-11486
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>         Attachments: HDF-11486.test.patch, HDFS-11486.001.patch, 
> HDFS-11486.test-inmaintenance.patch
>
>
> If a DFS client closes a file while the last block is being decommissioned, 
> the close() may fail if the decommissioning does not complete within a few 
> seconds.
> When a DataNode is being decommissioned, the NameNode marks the DN's state as 
> DECOMMISSION_INPROGRESS, and blocks with replicas on that DataNode immediately 
> become under-replicated. A close() call which attempts to complete the last 
> open block will fail if the number of live replicas is below the minimum 
> replication factor, because too many of the replicas reside on decommissioning 
> DataNodes.
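> A minimal client-side sketch of the scenario (the path is hypothetical, and 
> the decommission itself is triggered out of band by the administrator):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class DecommCloseRepro {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     FileSystem fs = FileSystem.get(conf);
>     Path file = new Path("/tmp/decomm-close-repro");  // hypothetical path
>
>     // Write some data and keep the stream open so the last block stays
>     // under construction.
>     FSDataOutputStream out = fs.create(file, (short) 1);
>     out.write(new byte[1024]);
>     out.hflush();
>
>     // ... the DataNode holding the last block's replica is put into
>     // decommission here (exclude file + dfsadmin -refreshNodes) ...
>
>     // close() tries to complete the last block; if the live replica count
>     // drops below the minimum replication factor because the only replica
>     // sits on a decommissioning DN, the client's retries are exhausted and
>     // an IOException ("does not have enough number of replicas") is thrown.
>     out.close();
>   }
> }
> {code}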
> The client internally retries completing the last open block up to 5 times 
> by default, which takes roughly 12 seconds. After that, close() throws an 
> exception like the following, which is typically not handled properly.
> {noformat}
> java.io.IOException: Unable to close file because the last block 
> BP-33575088-10.0.0.200-1488410554081:blk_1073741827_1003 does not have 
> enough number of replicas.
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:864)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:827)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:793)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>       at 
> org.apache.hadoop.hdfs.TestDecommission.testCloseWhileDecommission(TestDecommission.java:708)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
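> (For reference, the roughly 12 seconds comes from the client's exponential 
> backoff while retrying the complete call: 400 + 800 + 1600 + 3200 + 6400 ms, 
> about 12.4 s, with the default 5 retries and 400 ms initial delay. The 
> property name in the sketch below is my reading of hdfs-default.xml, not 
> something stated in this jira.)
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class CloseRetryTuning {
>   public static void main(String[] args) {
>     Configuration conf = new Configuration();
>     // Raising the retry count buys the client more time before close()
>     // gives up, but it only papers over the problem described here.
>     conf.setInt("dfs.client.block.write.locateFollowingBlock.retries", 10);
>   }
> }
> {code}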
> Once the exception is thrown, the client usually does not attempt to close 
> again, so the file remains open and the last block remains under-replicated.
> Subsequently, the administrator runs the recoverLease tool to salvage the 
> file, but the attempt fails because the block remains under-replicated. It is 
> not clear why the block is never re-replicated. Administrators then conclude 
> the file is corrupt, because fsck -openforwrite shows it still open while its 
> modification time is hours old.
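> (The recovery attempt mentioned above is typically the debug command below, 
> available in newer releases; the path is hypothetical.)
> {noformat}
> hdfs debug recoverLease -path /tmp/decomm-close-repro -retries 5
> {noformat}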
> In summary, I do not think close() should fail because the last block is 
> being decommissioned. The block has a sufficient number of replicas; it is 
> just that some of them are on decommissioning DataNodes. Decommissioning 
> should be transparent to clients.
> This issue seems to be more prominent on very large clusters with the minimum 
> replication factor set to 2.
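> (For context, a cluster with the minimum replication factor set to 2 would, 
> to my understanding, carry something like the following in hdfs-site.xml; 
> the default is 1.)
> {noformat}
> <property>
>   <name>dfs.namenode.replication.min</name>
>   <value>2</value>
> </property>
> {noformat}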



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
