[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873691#comment-13873691 ] Suresh Srinivas commented on HDFS-4600: [~cos], it looks like you changed the priority. Is this still an issue? If not, I plan on closing this as "not a problem" in a day or so.

HDFS file append failing in multinode cluster
---------------------------------------------

                 Key: HDFS-4600
                 URL: https://issues.apache.org/jira/browse/HDFS-4600
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.0.3-alpha
            Reporter: Roman Shaposhnik
         Attachments: X.java, core-site.xml, hdfs-site.xml

NOTE: the following only happens in a fully distributed setup (core-site.xml and hdfs-site.xml are attached)

Steps to reproduce:
{noformat}
$ javac -cp /usr/lib/hadoop/client/\* X.java
$ echo a > a.txt
$ hadoop fs -ls /tmp/a.txt
ls: `/tmp/a.txt': No such file or directory
$ HADOOP_CLASSPATH=`pwd` hadoop X /tmp/a.txt
13/03/13 16:05:14 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
Exception in thread "main" java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
13/03/13 16:05:14 ERROR hdfs.DFSClient: Failed to close file /tmp/a.txt
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
{noformat}

Given that the file actually does get created:
{noformat}
$ hadoop fs -ls /tmp/a.txt
Found 1 items
-rw-r--r--   3 root hadoop          6 2013-03-13 16:05 /tmp/a.txt
{noformat}
this feels like a regression in APPEND's functionality.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873695#comment-13873695 ] Konstantin Boudnik commented on HDFS-4600: Suresh, I didn't see it fixed, so yes, this still seems to be an issue.
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873712#comment-13873712 ] Suresh Srinivas commented on HDFS-4600: I am actually surprised. Many people have expressed that this is not a bug, and you expressed the same opinion in the comments above. What changed?
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873728#comment-13873728 ] Konstantin Boudnik commented on HDFS-4600: Actually you're right. I've re-read the history of the ticket and will close it right away. Please disregard my last comment ;)
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831029#comment-13831029 ] Uma Maheswara Rao G commented on HDFS-4600: Hi Roman, do you still think something needs to be addressed here, or can we close this bug?
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604329#comment-13604329 ] Uma Maheswara Rao G commented on HDFS-4600: [~tucu00]

{quote}why does it work OK in a pseudo cluster setup then?{quote}

In pseudo-distributed mode, replication is set to 1 and there is a single DN, so the append call never tries to add a new node to the pipeline: replication is already met. In fully distributed mode, replication defaults to 3, but here the cluster has only 2 nodes.

{quote}and yet in the very same scenario the plain write would be successful. All I am saying is that there's a surprising inconsistency here.{quote}

This feature normally checks only for pipeline failures; see the property name dfs.client.block.write.replace-datanode-on-failure.enable. We additionally apply the check on append because that is an opportunity to add nodes to the pipeline when it has fewer than expected. If the NN cannot provide enough nodes, it means either 1) the cluster has no good nodes left, or 2) the cluster simply does not have as many nodes as the replication factor expects. In both cases there is no new node to substitute in. For case #2, we recommend not enabling this feature. Case #1 should not happen in a normal cluster with more nodes: pipeline setup succeeds whenever nodes are available, and if the pipeline fails later because of network issues or crashes, recovery is triggered and this feature kicks in to keep the pipeline from shrinking.
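For reference, the client-side knob discussed above can be sketched as an hdfs-site.xml fragment on the client. This is an illustrative sketch, not a recommended production setting: the property names come from the exception text and the comment above, and NEVER is the policy value that disables datanode replacement outright, which is only sensible on very small clusters where no replacement node can exist.

```xml
<!-- Client-side hdfs-site.xml fragment (sketch): relax the
     replace-datanode-on-failure policy so appends succeed on a
     cluster with fewer datanodes than the replication factor. -->
<configuration>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
  </property>
  <property>
    <!-- DEFAULT tries to add a replacement node; NEVER skips it. -->
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>NEVER</value>
  </property>
</configuration>
```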
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603472#comment-13603472 ] Alejandro Abdelnur commented on HDFS-4600: Uma, Nicholas, thanks for the details. I was looking at this issue with Roman; why does it work OK in a pseudo cluster setup then? I think we can lower the priority since there is a workaround.
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603486#comment-13603486 ] Konstantin Boudnik commented on HDFS-4600: dudes, it isn't a bug at all, actually. It shouldn't be closed as a dup: it should be closed as invalid.
(Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration. at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470) 13/03/13 16:05:14 ERROR hdfs.DFSClient: Failed to close file /tmp/a.txt java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration. at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470) {noformat} Given that the file actually does get created: {noformat} $ hadoop fs -ls /tmp/a.txt Found 1 items -rw-r--r-- 3 root hadoop 6 2013-03-13 16:05 /tmp/a.txt {noformat} this feels like a regression in APPEND's functionality. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
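For reference, the attached X.java is not reproduced in this thread. A minimal sketch of an append client consistent with the reported steps might look as follows (this is hypothetical — the class name matches the attachment but the body and payload are assumptions, not the actual code):

```java
// Hypothetical reconstruction of the attached X.java (not the real attachment):
// create the file, then re-open it for append, mirroring the reported steps.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class X {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path p = new Path(args[0]);

    // The initial write succeeds even on a 2-node cluster...
    try (FSDataOutputStream out = fs.create(p)) {
      out.writeBytes("hello\n");
    }

    // ...but re-opening for append triggers the
    // replace-datanode-on-failure pipeline check that throws here.
    try (FSDataOutputStream out = fs.append(p)) {
      out.writeBytes("world\n");
    }
  }
}
```

Run against a live cluster exactly as in the steps above, the `fs.append()` call is where the "Failed to replace a bad datanode" exception would surface when fewer datanodes are available than the pipeline policy expects.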
[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603592#comment-13603592 ]

Roman Shaposhnik commented on HDFS-4600:
----------------------------------------

As Colin pointed out -- there are different ways of looking at it. I still consider the difference between the READ and APPEND code paths to be artificial and surprising to the user. Imagine a POSIX open(2) giving you different behavior based on whether you specified O_APPEND or not. What I'm worried about here is not the pathological use case of a 2-node cluster, but cases where files are over-replicated (for the purposes of availability) and can NOT be appended to by default. Of course, given that this can be controlled on the client side -- this makes it less of a problem.
[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603755#comment-13603755 ]

Konstantin Boudnik commented on HDFS-4600:
------------------------------------------

bq. where files are over-replicated (for the purposes of availability) and can NOT be appended to by default

I won't worry about it: if a block is over-replicated -- e.g. because a couple of datanodes went offline and the NN overcompensated -- the over-replication will eventually go away, as the NN will rebalance the number of replicas once (and if) the original DNs are back.
[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603760#comment-13603760 ]

Roman Shaposhnik commented on HDFS-4600:
----------------------------------------

bq. the over-replication will eventually go away as NN will balance the number of replicas once and if the original DNs are back.

Right. But in the meantime APPEND would be unavailable to *unsuspecting* applications.
[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603771#comment-13603771 ]

Konstantin Boudnik commented on HDFS-4600:
------------------------------------------

Well, yeah... but if the cluster doesn't have enough resources for the append, then the append shouldn't be happening. This is essentially the safe bet behind this design decision (see [~szetszwo]'s comment above).
[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603781#comment-13603781 ]

Roman Shaposhnik commented on HDFS-4600:
----------------------------------------

bq. well, yeah... but if the cluster doesn't have enough resources

And yet in the very same scenario a plain write would be successful. All I am saying is that there's a surprising inconsistency here. Thanks to Nicholas I now understand the design ramifications, and yet I find it slightly unfortunate that such an inconsistency is there.
[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603036#comment-13603036 ]

Tsz Wo (Nicholas), SZE commented on HDFS-4600:
----------------------------------------------

Hi Roman, append and create are different in the sense that there is existing data before the append. The existing data was already persisted in HDFS earlier. The replace-datanode-on-failure feature is there to prevent data loss: for append, it makes sure there are enough datanodes to protect the existing data. It is very useful for slow appenders. The feature can be disabled by setting dfs.client.block.write.replace-datanode-on-failure.enable if it is undesirable. This JIRA is similar to HDFS-3091.
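Nicholas's suggestion can be sketched as a client-side configuration fragment. The property names come from his comment and from the exception message in the report; the values shown are illustrative, not a recommendation — disabling the check trades away the extra protection for existing data:

```xml
<!-- Illustrative client-side hdfs-site.xml fragment.
     Disabling the feature removes the datanode-replacement check
     that protects already-persisted data during append. -->
<configuration>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>false</value>
  </property>
  <!-- Alternatively, keep the feature enabled but relax the policy
     mentioned in the exception message (DEFAULT -> NEVER): -->
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>NEVER</value>
  </property>
</configuration>
```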
[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601968#comment-13601968 ]

Colin Patrick McCabe commented on HDFS-4600:
--------------------------------------------

Just a guess, but do you have a 2-node cluster with replication set to 3? (I checked the XML conf files you attached and didn't see any evidence of replication being set to something other than the default of 3.) The decision to throw an exception when we can't get up to the full replication factor was a design decision, as I understand it -- though it has led to some (in my opinion) counter-intuitive behavior.
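If Colin's guess is right, another way to sidestep the failure on a small cluster is to keep the replication factor at or below the number of datanodes. A sketch of the relevant hdfs-site.xml setting, assuming the 2-node cluster from the report:

```xml
<!-- Illustrative hdfs-site.xml fragment: match replication to a
     2-node cluster so the append pipeline never needs a third datanode. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```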
[jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13601986#comment-13601986 ] Roman Shaposhnik commented on HDFS-4600: It is indeed a 2 nodes cluster. That said, the first write to the file actually succeeds and it is only the append call that fails. The cluster is also fully functional wrt. the hadoop fs use. I can copy stuff in/out of it. HDFS file append failing in multinode cluster - Key: HDFS-4600 URL: https://issues.apache.org/jira/browse/HDFS-4600 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.3-alpha Reporter: Roman Shaposhnik Priority: Blocker Fix For: 2.0.4-alpha Attachments: core-site.xml, hdfs-site.xml, X.java NOTE: the following only happens in a fully distributed setup (core-site.xml and hdfs-site.xml are attached) Steps to reproduce: {noformat} $ javac -cp /usr/lib/hadoop/client/\* X.java $ echo a a.txt $ hadoop fs -ls /tmp/a.txt ls: `/tmp/a.txt': No such file or directory $ HADOOP_CLASSPATH=`pwd` hadoop X /tmp/a.txt 13/03/13 16:05:14 WARN hdfs.DFSClient: DataStreamer Exception java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration. 
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602020#comment-13602020 ]

Uma Maheswara Rao G commented on HDFS-4600:
----------------------------------------

If the cluster size is 2 nodes, we recommend disabling this feature, since there is no way to replace a failed node with a new one in any case. See the comment for this configuration in hdfs-default.xml:
{quote}
If there is a datanode/network failure in the write pipeline, DFSClient will try to remove the failed datanode from the pipeline and then continue writing with the remaining datanodes. As a result, the number of datanodes in the pipeline is decreased. The feature is to add new datanodes to the pipeline. This is a site-wide property to enable/disable the feature. When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures since it is impossible to find new datanodes for replacement. See also dfs.client.block.write.replace-datanode-on-failure.policy
{quote}
To disable this, set dfs.client.block.write.replace-datanode-on-failure.enable to false.

Here the existing/selected nodes will be only 2, while replication is presumably still set to 3; if so, the policy is satisfied and the client tries to add another node to the pipeline. Alternatively, you can set replication to 2, since your cluster's maximum size is only 2; if replication is less than 3, this policy does not apply. With smaller clusters it is always recommended to disable this feature.
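The two remedies described above can be sketched as client-side hdfs-site.xml entries. This is an illustrative fragment, not taken from the attached configs; the property names are as quoted in this comment, and either option alone should suffice:

```xml
<!-- Client-side hdfs-site.xml fragment (illustrative). -->

<!-- Option 1: disable datanode replacement on pipeline failure entirely,
     as recommended for very small (e.g. 2-node) clusters. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>false</value>
</property>

<!-- Option 2: keep the feature enabled, but never attempt a replacement. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
```

The alternative of matching replication to the cluster size can be applied per file with `hadoop fs -setrep 2 /tmp/a.txt`, or cluster-wide by setting dfs.replication to 2.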
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602023#comment-13602023 ]

Roman Shaposhnik commented on HDFS-4600:
----------------------------------------

I would appreciate it if somebody could explain the difference in behavior between a simple write and an append. Whatever the policy is, it should affect both equally, yet one succeeds and the other fails. At this point I really suspect that the policy handling differs between the two; what else could it be? Do I have to explicitly test append on my clusters from now on, because the code paths and applicable policies differ between the two? I would certainly hope not. Please comment.
[ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602035#comment-13602035 ]

Uma Maheswara Rao G commented on HDFS-4600:
----------------------------------------

Yeah, the behaviour here is different between the normal write flow and the append flow with respect to this policy, I think. This particular policy check happens only in setupPipelineForAppendOrRecovery. In the normal write flow, even if the pipeline is initially established with fewer nodes, the client is allowed to continue; the policy comes into play only on a pipeline failure or on an append call. I think the intention behind this feature is: in the normal flow, if the client could obtain only fewer nodes than requested, it would not get more even if it tried to add new nodes under this policy. But an append call can come much later, so by then there may be a chance of adding a new node to the pipeline. The strict condition here is to satisfy the replication factor 100% to ensure strong fault tolerance. @Nicholas, can you add more on this?