RE: Intermittent DataStreamer Exception while appending to file inside HDFS

2013-10-17 Thread Uma Maheswara Rao G
Hi Arinto,

You can check the 3rd DN's logs to see whether there was any issue, such as a
shortage of disk space, that prevented the node from being selected for the write.

 Does it mean that one of the datanodes was unreachable when we tried to append
 to the files?
It was not selected for the write in the first place. If it had failed after
being selected for the write, you would have seen this error during recovery
itself.
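If it helps, here is a minimal client-side sketch for listing the live DNs and
their remaining space (the same information as "hdfs dfsadmin -report"); the
NameNode URI is a placeholder for your cluster's address:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

    public class ListDataNodes {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        // getDataNodeStats() fetches the NameNode's current datanode reports.
        DistributedFileSystem dfs = (DistributedFileSystem) fs;
        for (DatanodeInfo dn : dfs.getDataNodeStats()) {
          // A node that is low on space may not be selected for new writes.
          System.out.printf("%s remaining=%d MB%n",
              dn.getHostName(), dn.getRemaining() / (1024 * 1024));
        }
        fs.close();
      }
    }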

Regards,
Uma

From: Arinto Murdopo [mailto:ari...@gmail.com]
Sent: 11 October 2013 08:48
To: user@hadoop.apache.org
Subject: Re: Intermittent DataStreamer Exception while appending to file inside 
HDFS

Thank you for the comprehensive answer.
When I inspect our NameNode UI, I see that 3 datanodes are up.
However, as you mentioned, the log only showed 2 datanodes up. Does it mean
that one of the datanodes was unreachable when we tried to append to the files?
Best regards,


Arinto
www.otnira.com


RE: Intermittent DataStreamer Exception while appending to file inside HDFS

2013-10-10 Thread Uma Maheswara Rao G
Hi Arinto,

Please disable this feature on smaller clusters:
dfs.client.block.write.replace-datanode-on-failure.policy
The reason for this exception is that you have replication set to 3, but from
the logs it looks like you have only 2 nodes in the cluster. When the pipeline
is first created, we do not verify whether the pipeline DNs meet the
replication factor. The property above only governs replacing a DN on failure,
but we additionally take advantage of it to verify this condition when the
pipeline is reopened for append. So here, unfortunately, the existing DNs do
not meet the replication factor and the client tries to add another node.
Since the cluster has no nodes other than the ones already selected, this
fails. With the current configuration, you cannot append.
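For reference, the condition the client checks is roughly the following (a
paraphrase of the DEFAULT policy in ReplaceDatanodeOnFailure, not the exact
Hadoop source):

    // r = configured replication factor, n = datanodes still in the pipeline.
    static boolean shouldAddReplacement(int r, int n,
                                        boolean isAppend, boolean isHflushed) {
      if (r < 3) {
        return false;                  // small replication: never replace
      }
      return n <= r / 2                // pipeline shrank to half or less
          || isAppend || isHflushed;   // reopened pipeline: always replace
    }

With replication 3, two live DNs, and an append (r = 3, n = 2, isAppend =
true), this returns true, so the client asks for a third node that your
cluster cannot provide.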


Also, please take a look at the default configuration description:

<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
  <description>
    If there is a datanode/network failure in the write pipeline,
    DFSClient will try to remove the failed datanode from the pipeline
    and then continue writing with the remaining datanodes. As a result,
    the number of datanodes in the pipeline is decreased.  The feature is
    to add new datanodes to the pipeline.

    This is a site-wide property to enable/disable the feature.

    When the cluster size is extremely small, e.g. 3 nodes or less, cluster
    administrators may want to set the policy to NEVER in the default
    configuration file or disable this feature.  Otherwise, users may
    experience an unusually high rate of pipeline failures since it is
    impossible to find new datanodes for replacement.

    See also dfs.client.block.write.replace-datanode-on-failure.policy
  </description>
</property>


Set this configuration to false on your client side.
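For example, a minimal client-side sketch (the NameNode URI and file path are
placeholders, not taken from your setup):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class AppendWithoutReplacement {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Turn the replace-datanode-on-failure feature off entirely...
        conf.setBoolean(
            "dfs.client.block.write.replace-datanode-on-failure.enable", false);
        // ...or keep it enabled but never ask for a replacement node:
        // conf.set(
        //     "dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");

        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        FSDataOutputStream out = fs.append(new Path("/tmp/example.log"));
        out.write("appended line\n".getBytes("UTF-8"));
        out.close();
        fs.close();
      }
    }

The same two properties can also be put in the hdfs-site.xml used by your
client.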

Regards,
Uma


From: Arinto Murdopo [mailto:ari...@gmail.com]
Sent: 10 October 2013 13:02
To: user@hadoop.apache.org
Subject: Intermittent DataStreamer Exception while appending to file inside HDFS

Hi there,
I get the following exception while appending to an existing file in my HDFS.
The error appears intermittently: when it does not show up, I can append to
the file successfully; when it does appear, the append fails.
Here is the error: https://gist.github.com/arinto/d37a56f449c61c9d1d9c
For your convenience, here it is:

13/10/10 14:17:30 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to add a datanode.  User may turn off this feature
by setting dfs.client.block.write.replace-datanode-on-failure.policy in
configuration, where the current policy is DEFAULT.  (Nodes:
current=[10.0.106.82:50010, 10.0.106.81:50010],
original=[10.0.106.82:50010, 10.0.106.81:50010])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:934)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)

Some configuration files:

1. hdfs-site.xml: https://gist.github.com/arinto/f5f1522a6f6994ddfc17#file-hdfs-append-datastream-exception-hdfs-site-xml
2. core-site.xml: https://gist.github.com/arinto/0c6f40872181fe26f8b1#file-hdfs-append-datastream-exception-core-site-xml

So, any idea how to solve this issue?
Some links that I've found (but unfortunately they do not help):
1. StackOverflow
(http://stackoverflow.com/questions/15347799/java-io-ioexception-failed-to-add-a-datanode-hdfs-hadoop):
our replication factor is 3 and we've never changed the replication factor
since we set up the cluster.
2. Impala-User mailing list
(https://groups.google.com/a/cloudera.org/forum/#!searchin/impala-user/DataStreamer$20exception/impala-user/u2CN163Cyfc/_OcRqBYL2B4J):
the error there is due to the replication factor being set to 1; in our case,
we're using a replication factor of 3.

Best regards,

Arinto
www.otnira.com


Re: Intermittent DataStreamer Exception while appending to file inside HDFS

2013-10-10 Thread Arinto Murdopo
Thank you for the comprehensive answer.

When I inspect our NameNode UI, I see that 3 datanodes are up.
However, as you mentioned, the log only showed 2 datanodes up. Does it mean
that one of the datanodes was unreachable when we tried to append to the
files?

Best regards,


Arinto
www.otnira.com

