[ https://issues.apache.org/jira/browse/HADOOP-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530133 ]
Koji Noguchi commented on HADOOP-1938:
--------------------------------------
bq. Found 0 datanodes but MIN_REPLICATION for the cluster is configured to be 1.
I've seen this exception come up when the cluster was (semi-)full.
If that is the case here, maybe a better error message would help?
> NameNode.create failed
> -----------------------
>
> Key: HADOOP-1938
> URL: https://issues.apache.org/jira/browse/HADOOP-1938
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.13.1
> Reporter: Runping Qi
>
> Under heavy load, the DFS namenode fails to create a file:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: Failed to create
> file /xxx/xxx/_task_0001_r_000001_0/part-00001 on client xxx.xxx.xxx.xxx
> because there were not enough datanodes available. Found 0 datanodes but
> MIN_REPLICATION for the cluster is configured to be 1.
> at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:651)
> at org.apache.hadoop.dfs.NameNode.create(NameNode.java:294)
> at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:341)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:573)
> The above problem occurred when I ran a well-tuned map/reduce program on a
> hood node cluster.
> The program is well tuned in the sense that the map output data are evenly
> partitioned among 180 reducers.
> The shuffling and sorting was completed at about the same time on all the
> reducers.
> The reducers started reduce work at about the same time and were expected to
> produce about the same amount of output (2GB).
> This "synchronized" behavior caused the reducers to try to create output dfs
> files at about the same time.
> The namenode seemed to have difficulty handling that situation, causing the
> reducers to wait on file creation for a long period of time.
> Eventually, they failed with the above exception.
>