RE: Not able to place enough replicas

Brahma Reddy Battula Tue, 15 Jul 2014 04:37:18 -0700

Hi,

There are four conditions under which a DataNode (DN) is excluded as a target. I 
suspect you have hit one of the following, most likely (ii) or (iii).



i) Check if the node is (being) decommissioned.

---> You can check this from the NameNode UI or by executing hdfs dfsadmin -report


ii) Check the remaining capacity of the target machine

---> You can check this from the NameNode UI, from the data dir usage (remaining 
space should be > 5 * block size), or from the NameNode debug logs


iii) Check the communication traffic of the target machine

---> Check the network usage of the DNs, or check the DN logs; you may see a 
"too many files" error or an "xceiver count exceeds" exception


iv) Check if the target rack has chosen too many nodes


For your reference, here is the code that selects the target node:


private boolean isGoodTarget(DatanodeDescriptor node,
                               long blockSize, int maxTargetPerLoc,
                               boolean considerLoad,
                               List<DatanodeDescriptor> results) {
    // check if the node is (being) decommissioned
    if (node.isDecommissionInProgress() || node.isDecommissioned()) {
      if(LOG.isDebugEnabled()) {
        threadLocalBuilder.get().append(node.toString()).append(": ")
          .append("Node ").append(NodeBase.getPath(node))
          .append(" is not chosen because the node is (being) decommissioned ");
      }
      return false;
    }

    long remaining = node.getRemaining() -
                     (node.getBlocksScheduled() * blockSize);
    // check the remaining capacity of the target machine
    if (blockSize* HdfsConstants.MIN_BLOCKS_FOR_WRITE>remaining) {
      if(LOG.isDebugEnabled()) {
        threadLocalBuilder.get().append(node.toString()).append(": ")
          .append("Node ").append(NodeBase.getPath(node))
          .append(" is not chosen because the node does not have enough space ");
      }
      return false;
    }

    // check the communication traffic of the target machine
    if (considerLoad) {
      double avgLoad = 0;
      int size = clusterMap.getNumOfLeaves();
      if (size != 0 && stats != null) {
        avgLoad = (double)stats.getTotalLoad()/size;
      }
      if (node.getXceiverCount() > (2.0 * avgLoad)) {
        if(LOG.isDebugEnabled()) {
          threadLocalBuilder.get().append(node.toString()).append(": ")
            .append("Node ").append(NodeBase.getPath(node))
            .append(" is not chosen because the node is too busy ");
        }
        return false;
      }
    }

    // check if the target rack has chosen too many nodes
    String rackname = node.getNetworkLocation();
    int counter=1;
    for(Iterator<DatanodeDescriptor> iter = results.iterator();
        iter.hasNext();) {
      Node result = iter.next();
      if (rackname.equals(result.getNetworkLocation())) {
        counter++;
      }
    }
    if (counter>maxTargetPerLoc) {
      if(LOG.isDebugEnabled()) {
        threadLocalBuilder.get().append(node.toString()).append(": ")
          .append("Node ").append(NodeBase.getPath(node))
          .append(" is not chosen because the rack has too many chosen nodes ");
      }
      return false;
    }
    return true;
  }
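
To make the four checks easier to try out, here is a minimal standalone sketch 
that mirrors the logic above using plain values instead of DatanodeDescriptor. 
It is not the Hadoop code itself: the class name, the method rejectionReason and 
its parameters are hypothetical, MIN_BLOCKS_FOR_WRITE = 5 is assumed from the 
"5 * block size" rule mentioned above (check HdfsConstants in your version), and 
the 128 MiB block size is only an example value. Note how blocks already 
scheduled to a node reduce its effective remaining space, which can matter when 
many files are being written at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TargetCheckSketch {
    // Assumption: taken from the "5 * block size" rule of thumb; the real
    // constant lives in HdfsConstants and may differ between versions.
    static final int MIN_BLOCKS_FOR_WRITE = 5;

    /** Returns null if the node would be accepted, otherwise the rejection reason. */
    static String rejectionReason(boolean decommissioning,
                                  long remainingBytes, int blocksScheduled, long blockSize,
                                  boolean considerLoad, int xceiverCount, double avgLoad,
                                  String rack, List<String> chosenRacks, int maxTargetPerLoc) {
        // (i) node is decommissioned or being decommissioned
        if (decommissioning) {
            return "node is (being) decommissioned";
        }
        // (ii) space left after subtracting blocks already scheduled to this node
        long remaining = remainingBytes - (blocksScheduled * blockSize);
        if (blockSize * MIN_BLOCKS_FOR_WRITE > remaining) {
            return "not enough space";
        }
        // (iii) node carries more than twice the average load
        if (considerLoad && xceiverCount > 2.0 * avgLoad) {
            return "node is too busy";
        }
        // (iv) count this node plus already chosen nodes on the same rack
        int counter = 1;
        for (String chosen : chosenRacks) {
            if (rack.equals(chosen)) {
                counter++;
            }
        }
        if (counter > maxTargetPerLoc) {
            return "rack has too many chosen nodes";
        }
        return null;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;      // example: 128 MiB
        long free = 10L * 1024 * 1024 * 1024;     // example: 10 GiB reported free
        List<String> none = new ArrayList<String>();

        // Idle node with plenty of space: accepted (reason is null).
        System.out.println("idle: "
            + rejectionReason(false, free, 0, blockSize, true, 1, 1.0, "/rack1", none, 2));
        // Same node with 80 blocks already scheduled: 80 * 128 MiB uses up the 10 GiB.
        System.out.println("80 scheduled: "
            + rejectionReason(false, free, 80, blockSize, true, 1, 1.0, "/rack1", none, 2));
        // Xceiver count 10 against an average load of 4 (10 > 2.0 * 4).
        System.out.println("busy: "
            + rejectionReason(false, free, 0, blockSize, true, 10, 4.0, "/rack1", none, 2));
        // Two nodes already chosen on the same rack, at most 2 allowed per rack.
        System.out.println("rack: "
            + rejectionReason(false, free, 0, blockSize, true, 1, 1.0, "/rack1",
                              Arrays.asList("/rack1", "/rack1"), 2));
    }
}
```

If every DataNode fails one of these checks for the same block, no target can 
be chosen, which would be consistent with "still in need of 4 to reach 4".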






Thanks & Regards



Brahma Reddy Battula




________________________________
From: Bogdan Raducanu [lrd...@gmail.com]
Sent: Tuesday, July 15, 2014 2:45 PM
To: user@hadoop.apache.org
Subject: Re: Not able to place enough replicas

The real cause is the IOException. The PriviledgedActionException is a generic 
exception. Other file writes succeed in the same directory with the same user.


On Tue, Jul 15, 2014 at 4:59 AM, Yanbo Liang <yanboha...@gmail.com> wrote:
Maybe the user 'test' does not have write permission.
See the ERROR log entry:

org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException 
as:test (auth:SIMPLE)


2014-07-15 2:07 GMT+08:00 Bogdan Raducanu <lrd...@gmail.com>:

I'm getting this error while writing many files.
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to 
place enough replicas, still in need of 4 to reach 4

I've set logging to DEBUG, but no reason is printed. There should have been a 
reason after this line, but instead there's just an empty line.
Has anyone seen something like this before? It happens on a 4-node cluster 
running Hadoop 2.2.


org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.create: file /file_1002 for 
DFSClient_NONMAPREDUCE_839626346_1 at 192.168.180.1
org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: src=/file_1002, 
holder=DFSClient_NONMAPREDUCE_839626346_1, clientMachine=192.168.180.1, 
createParent=true, replication=4, createFlag=[CREATE, OVERWRITE]
org.apache.hadoop.hdfs.StateChange: DIR* addFile: /file_1002 is added
org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: add /file_1002 
to namespace for DFSClient_NONMAPREDUCE_839
<< ... many other operations ... >>
8 seconds later:
org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /file_1002 
fileId=189252 for DFSClient_NONMAPREDUCE_839626346_1
org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: file 
/file_1002 for DFSClient_NONMAPREDUCE_839626346_1
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to 
place enough replicas, still in need of 4 to reach 4
<< EMPTY LINE >>
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException 
as:test (auth:SIMPLE) cause:java.io.IOException: File /file_1002 could only be 
replicated to 0 nodes instead of minReplication (=1).  There are 4 datanode(s) 
running and no node(s) are excluded in this operation.
org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call 
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
192.168.180.1:49592 Call#1321 Retry#0: error: 
java.io.IOException: File /file_1002 could only be replicated to 0 nodes 
instead of minReplication (=1).  There are 4 datanode(s) running and no node(s) 
are excluded in this operation.
java.io.IOException: File /file_1002 could only be replicated to 0 nodes 
instead of minReplication (=1).  There are 4 datanode(s) running and no node(s) 
are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042):0
