[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602035#comment-13602035 ]

Uma Maheswara Rao G commented on HDFS-4600:
-------------------------------------------

Yeah, I think the behaviour here differs between the normal write flow and the 
append flow with respect to this policy.
This particular policy check happens only in setupPipelineForAppendOrRecovery.

Even if the pipeline is initially established with fewer nodes than requested, 
the normal write flow is allowed to continue. This policy only comes into the 
picture when there is a pipeline failure or an append call.

I think the intention behind this behaviour might be: in the normal write flow, 
if the client could only get fewer nodes than requested, it will not get more 
even if it retries adding new nodes under this policy. An append call, however, 
may come much later, so by then there may be a chance of adding a new node to 
the pipeline. The strict condition here is to satisfy the replication factor 
100%, to ensure strong fault tolerance.
@Nicholas, can you add more on this?
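
For anyone hitting this on a small cluster, a minimal sketch of how a client 
might relax the policy in its own Configuration (class name and the write path 
are illustrative only; the relevant keys are the 
dfs.client.block.write.replace-datanode-on-failure.* settings mentioned in the 
exception, and their accepted values should be checked against your Hadoop 
version):
{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class AppendPolicyExample {          // hypothetical example class
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Skip the replacement attempt entirely (trades durability for availability):
    conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
    // The check can also be switched off altogether:
    // conf.setBoolean("dfs.client.block.write.replace-datanode-on-failure.enable", false);
    FileSystem fs = FileSystem.get(conf);
    // ... open the file with fs.append(path) and write as usual ...
  }
}
{noformat}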
                
> HDFS file append failing in multinode cluster
> ---------------------------------------------
>
>                 Key: HDFS-4600
>                 URL: https://issues.apache.org/jira/browse/HDFS-4600
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.3-alpha
>            Reporter: Roman Shaposhnik
>            Priority: Blocker
>             Fix For: 2.0.4-alpha
>
>         Attachments: core-site.xml, hdfs-site.xml, X.java
>
>
> NOTE: the following only happens in a fully distributed setup (core-site.xml 
> and hdfs-site.xml are attached)
> Steps to reproduce:
> {noformat}
> $ javac -cp /usr/lib/hadoop/client/\* X.java
> $ echo aaaaa > a.txt
> $ hadoop fs -ls /tmp/a.txt
> ls: `/tmp/a.txt': No such file or directory
> $ HADOOP_CLASSPATH=`pwd` hadoop X /tmp/a.txt
> 13/03/13 16:05:14 WARN hdfs.DFSClient: DataStreamer Exception
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[10.10.37.16:50010, 10.80.134.126:50010], 
> original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed 
> datanode replacement policy is DEFAULT, and a client may configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> Exception in thread "main" java.io.IOException: Failed to replace a bad 
> datanode on the existing pipeline due to no more good datanodes being 
> available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], 
> original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed 
> datanode replacement policy is DEFAULT, and a client may configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> 13/03/13 16:05:14 ERROR hdfs.DFSClient: Failed to close file /tmp/a.txt
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[10.10.37.16:50010, 10.80.134.126:50010], 
> original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed 
> datanode replacement policy is DEFAULT, and a client may configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
>       at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> {noformat}
> Given that the file actually does get created:
> {noformat}
> $ hadoop fs -ls /tmp/a.txt
> Found 1 items
> -rw-r--r--   3 root hadoop          6 2013-03-13 16:05 /tmp/a.txt
> {noformat}
> this feels like a regression in APPEND's functionality.
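
(The attached X.java is not inlined above; a rough sketch of an append client 
along the same lines, with the class name, path handling, and written payload 
all being assumptions rather than the attached code, could look like:
{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class X {  // hypothetical reconstruction, not the attached X.java
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path path = new Path(args[0]);           // e.g. /tmp/a.txt
    // Create the file if it does not exist yet, then append to it.
    if (!fs.exists(path)) {
      fs.create(path).close();
    }
    FSDataOutputStream out = fs.append(path);
    out.writeBytes("aaaaa\n");
    out.close();
  }
}
{noformat}
With the DEFAULT replacement policy and only two live datanodes, the append 
path above is where the pipeline-recovery check in the stack trace fires.)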

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
