Hi Azuryy,

During the import, dfsadmin -report shows:

DFS Used%: 17.72%

Moreover, the import succeeds from time to time with the same data load. It
seems that a Datanode appears to be down from the Namenode's point of view,
but why?
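That is what we are trying to confirm. We have been watching the namenode's
view of the datanodes while the import runs, along these lines (the log path
is just where HADOOP_LOG_DIR points on our namenode, adjust to your layout):

    # live vs. dead datanode counts, as the namenode sees them
    hadoop dfsadmin -report | grep 'Datanodes available'

    # per-node "Last contact" timestamps: a stale one means missed heartbeats
    hadoop dfsadmin -report | grep -e '^Name:' -e 'Last contact'

    # watch the namenode log for "lost heartbeat" messages
    tail -F $HADOOP_LOG_DIR/hadoop-*-namenode-*.log | grep -i heartbeat

So far we have not caught a dead node that way, hence the question.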
On Thu, Jul 4, 2013 at 3:31 AM, Azuryy Yu <azury...@gmail.com> wrote:

> Hi Manuel,
>
> 2013-07-03 15:03:16,427 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
> enough replicas, still in need of 3
> 2013-07-03 15:03:16,427 ERROR
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:root cause:java.io.IOException: File /log/1372863795616 could only be
> replicated to 0 nodes, instead of 1
>
> This indicates you don't have enough space on HDFS. Can you check the
> cluster's used capacity?
>
> On Thu, Jul 4, 2013 at 12:14 AM, Manuel de Ferran <
> manuel.defer...@gmail.com> wrote:
>
>> Greetings all,
>>
>> We are trying to import data into an HDFS cluster, but we hit random
>> exceptions. We are trying to figure out the root cause (misconfiguration,
>> too much load, ...) and how to fix it.
>>
>> The client writes hundreds of files with a replication factor of 3. The
>> import sometimes crashes at the beginning, sometimes close to the end,
>> and in rare cases it succeeds.
>>
>> On failure, we see this on the client side:
>>
>> DataStreamer Exception: org.apache.hadoop.ipc.RemoteException:
>> java.io.IOException: File /log/1372863795616 could only be replicated to 0
>> nodes, instead of 1
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
>>         ....
>>
>> which seems to be a well-known error. We have followed the hints from the
>> Troubleshooting page, but we are still stuck: there is plenty of disk
>> space available on the datanodes, inodes are free, we are far below the
>> open-files limit, and all datanodes are up and running.
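>> (Concretely, the checks we ran on each datanode were along these lines;
>> the mount point below is illustrative, it is wherever dfs.data.dir lives
>> on your machines:
>>
>>   df -h /data/hdfs   # free disk space on the dfs.data.dir volume
>>   df -i /data/hdfs   # free inodes on the same volume
>>   ulimit -n          # open-file limit for the user running the DataNode
>>
>> plus "hadoop fsck / -openforwrite" on the namenode to list the files
>> currently open for write.)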
>> Note that other HDFS clients are still able to write files while the
>> import is running.
>>
>> Here is the corresponding extract of the namenode log file:
>>
>> 2013-07-03 15:03:15,951 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
>> transactions: 46009 Total time for transactions(ms): 153 Number of
>> transactions batched in Syncs: 5428 Number of syncs: 32889 SyncTimes(ms):
>> 139555
>> 2013-07-03 15:03:16,427 WARN
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
>> enough replicas, still in need of 3
>> 2013-07-03 15:03:16,427 ERROR
>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>> as:root cause:java.io.IOException: File /log/1372863795616 could only be
>> replicated to 0 nodes, instead of 1
>> 2013-07-03 15:03:16,427 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 9 on 9002, call addBlock(/log/1372863795616, DFSClient_1875494617,
>> null) from 192.168.1.141:41376: error: java.io.IOException: File
>> /log/1372863795616 could only be replicated to 0 nodes, instead of 1
>> java.io.IOException: File /log/1372863795616 could only be replicated to
>> 0 nodes, instead of 1
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
>>         at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>
>> During the import, fsck reports about 300 open files. The cluster runs
>> hadoop-1.0.3.
>>
>> Any advice about the configuration? We tried lowering
>> dfs.heartbeat.interval, and we raised dfs.datanode.max.xcievers to 4k;
>> maybe dfs.datanode.handler.count should be raised as well?
>>
>> Thanks for your help
>>
>
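For reference, here is roughly what the relevant part of our hdfs-site.xml
looks like after the changes described in my first mail. The max.xcievers
value is the "4k" mentioned there (4096); the heartbeat interval value is
only an example (we lowered it, the exact value does not matter here), and
dfs.datanode.handler.count is still at the hadoop-1.0.3 default:

    <!-- extract of hdfs-site.xml; see caveats above -->
    <property>
      <name>dfs.replication</name>
      <value>3</value>   <!-- replication factor used by the import -->
    </property>
    <property>
      <!-- note: the property name really is spelled "xcievers" -->
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>
    <property>
      <name>dfs.heartbeat.interval</name>
      <value>1</value>   <!-- example value; the default is 3 (seconds) -->
    </property>
    <property>
      <name>dfs.datanode.handler.count</name>
      <value>3</value>   <!-- still the default; candidate for raising -->
    </property>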