Hi,

There are four conditions under which a DataNode (DN) is excluded as a target. I think you have hit one of the following, most likely (ii) or (iii).
i) Check if the node is (being) decommissioned.
   ---> Check from the NameNode UI, or by executing hdfs dfsadmin -report.

ii) Check the remaining capacity of the target machine.
   ---> Check from the NameNode UI, or check the data dir usage (the remaining space should be at least 5 * block size), or check the NameNode debug logs.

iii) Check the communication traffic of the target machine.
   ---> Check the network usage of the DNs, or check the DN logs; you may see a too-many-files error or an xceiver-count-exceeds exception.

iv) Check if the target rack has chosen too many nodes.

FYR, here is the code that selects the target node:

  private boolean isGoodTarget(DatanodeDescriptor node,
                               long blockSize, int maxTargetPerLoc,
                               boolean considerLoad,
                               List<DatanodeDescriptor> results) {
    // check if the node is (being) decommissioned
    if (node.isDecommissionInProgress() || node.isDecommissioned()) {
      if (LOG.isDebugEnabled()) {
        threadLocalBuilder.get().append(node.toString()).append(": ")
          .append("Node ").append(NodeBase.getPath(node))
          .append(" is not chosen because the node is (being) decommissioned ");
      }
      return false;
    }

    long remaining = node.getRemaining() -
                     (node.getBlocksScheduled() * blockSize);
    // check the remaining capacity of the target machine
    if (blockSize * HdfsConstants.MIN_BLOCKS_FOR_WRITE > remaining) {
      if (LOG.isDebugEnabled()) {
        threadLocalBuilder.get().append(node.toString()).append(": ")
          .append("Node ").append(NodeBase.getPath(node))
          .append(" is not chosen because the node does not have enough space ");
      }
      return false;
    }

    // check the communication traffic of the target machine
    if (considerLoad) {
      double avgLoad = 0;
      int size = clusterMap.getNumOfLeaves();
      if (size != 0 && stats != null) {
        avgLoad = (double) stats.getTotalLoad() / size;
      }
      if (node.getXceiverCount() > (2.0 * avgLoad)) {
        if (LOG.isDebugEnabled()) {
          threadLocalBuilder.get().append(node.toString()).append(": ")
            .append("Node ").append(NodeBase.getPath(node))
            .append(" is not chosen because the node is too busy ");
        }
        return false;
      }
    }

    // check if the target rack has chosen too many nodes
    String rackname = node.getNetworkLocation();
    int counter = 1;
    for (Iterator<DatanodeDescriptor> iter = results.iterator();
         iter.hasNext();) {
      Node result = iter.next();
      if (rackname.equals(result.getNetworkLocation())) {
        counter++;
      }
    }
    if (counter > maxTargetPerLoc) {
      if (LOG.isDebugEnabled()) {
        threadLocalBuilder.get().append(node.toString()).append(": ")
          .append("Node ").append(NodeBase.getPath(node))
          .append(" is not chosen because the rack has too many chosen nodes ");
      }
      return false;
    }
    return true;
  }

Thanks & Regards
Brahma Reddy Battula
________________________________
From: Bogdan Raducanu [lrd...@gmail.com]
Sent: Tuesday, July 15, 2014 2:45 PM
To: user@hadoop.apache.org
Subject: Re: Not able to place enough replicas

The real cause is the IOException. The PriviledgedActionException is a generic exception. Other file writes succeed in the same directory with the same user.

On Tue, Jul 15, 2014 at 4:59 AM, Yanbo Liang <yanboha...@gmail.com> wrote:

Maybe the user 'test' does not have write permission. You can refer to the ERROR log line, like:
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:test (auth:SIMPLE)

2014-07-15 2:07 GMT+08:00 Bogdan Raducanu <lrd...@gmail.com>:

I'm getting this error while writing many files.
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to place enough replicas, still in need of 4 to reach 4

I've set logging to DEBUG but still no reason is printed. There should have been a reason after this line, but instead there is just an empty line.
Has anyone seen something like this before?
It is seen on a 4 node cluster running hadoop 2.2

org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.create: file /file_1002 for DFSClient_NONMAPREDUCE_839626346_1 at 192.168.180.1
org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: src=/file_1002, holder=DFSClient_NONMAPREDUCE_839626346_1, clientMachine=192.168.180.1, createParent=true, replication=4, createFlag=[CREATE, OVERWRITE]
org.apache.hadoop.hdfs.StateChange: DIR* addFile: /file_1002 is added
org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: add /file_1002 to namespace for DFSClient_NONMAPREDUCE_839

<< ... many other operations ... >>

8 seconds later:

org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /file_1002 fileId=189252 for DFSClient_NONMAPREDUCE_839626346_1
org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: file /file_1002 for DFSClient_NONMAPREDUCE_839626346_1
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to place enough replicas, still in need of 4 to reach 4
<< EMPTY LINE >>
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:test (auth:SIMPLE) cause:java.io.IOException: File /file_1002 could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 192.168.180.1:49592 Call#1321 Retry#0: error: java.io.IOException: File /file_1002 could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
java.io.IOException: File /file_1002 could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
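
For anyone who wants to try the numbers against their own cluster, the four checks listed at the top of this thread can be condensed into a small standalone sketch. This is not Hadoop code: NodeInfo, TargetCheck and rejectionReason are hypothetical names made up for illustration, the average load is passed in directly instead of being read from cluster stats, and MIN_BLOCKS_FOR_WRITE = 5 is an assumption taken from the "5 * block size" remark in the reply. It returns the rejection reason rather than writing it to a debug log.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for a DataNode; NOT Hadoop's DatanodeDescriptor. */
final class NodeInfo {
    final String rack;             // network location, e.g. "/rack1"
    final boolean decommissioning; // decommissioned or decommission in progress
    final long remainingBytes;     // free space reported by the node
    final int blocksScheduled;     // blocks already scheduled to be written there
    final int xceiverCount;        // active transfer threads on the node

    NodeInfo(String rack, boolean decommissioning, long remainingBytes,
             int blocksScheduled, int xceiverCount) {
        this.rack = rack;
        this.decommissioning = decommissioning;
        this.remainingBytes = remainingBytes;
        this.blocksScheduled = blocksScheduled;
        this.xceiverCount = xceiverCount;
    }
}

/** Illustrative re-statement of the four exclusion checks. */
final class TargetCheck {
    // Assumed value, based on the "5 * block size" remark in the reply.
    static final int MIN_BLOCKS_FOR_WRITE = 5;

    /** Returns the reason the node is rejected, or empty if it is acceptable. */
    static Optional<String> rejectionReason(NodeInfo node, long blockSize,
                                            int maxTargetPerLoc,
                                            boolean considerLoad, double avgLoad,
                                            List<NodeInfo> chosen) {
        // (i) node is (being) decommissioned
        if (node.decommissioning) {
            return Optional.of("the node is (being) decommissioned");
        }
        // (ii) remaining capacity, minus space reserved for scheduled blocks
        long remaining = node.remainingBytes - node.blocksScheduled * blockSize;
        if (blockSize * MIN_BLOCKS_FOR_WRITE > remaining) {
            return Optional.of("the node does not have enough space");
        }
        // (iii) node load compared with twice the cluster average
        if (considerLoad && node.xceiverCount > 2.0 * avgLoad) {
            return Optional.of("the node is too busy");
        }
        // (iv) count nodes already chosen on the same rack, plus this one
        int counter = 1;
        for (NodeInfo other : chosen) {
            if (node.rack.equals(other.rack)) {
                counter++;
            }
        }
        if (counter > maxTargetPerLoc) {
            return Optional.of("the rack has too many chosen nodes");
        }
        return Optional.empty();
    }
}
```

As an example of check (ii): with a 128 MB block size, a node reporting 1 GB free (8 blocks) and 4 blocks already scheduled has an effective 4 blocks left, which is below the assumed minimum of 5, so it would be rejected even though the UI still shows free space.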