[ 
https://issues.apache.org/jira/browse/HDFS-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791092#comment-16791092
 ] 

Íñigo Goiri commented on HDFS-14366:
------------------------------------

I think this is significant enough to be worth a more readable approach like:
{code}
final int liveReplicas = countNodes(b).liveReplicas();
if (liveReplicas >= minReplication) {
  return true;
}
// getNumLiveDataNodes() is expensive, so only call it when the cheap check fails
return liveReplicas >= getDatanodeManager().getNumLiveDataNodes();
{code}
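
This should be equivalent to the current check, since {{liveReplicas >= Math.min(a, b)}} holds exactly when {{liveReplicas >= a}} or {{liveReplicas >= b}}. Below is a minimal stand-alone sketch of that argument. The class, the method names and the counting supplier are hypothetical stand-ins for illustration, not the actual {{BlockManager}} or {{DatanodeManager}} code:

```java
import java.util.function.IntSupplier;

// Hypothetical stand-alone sketch; not the actual BlockManager code.
final class SufficientReplicationSketch {

  /** Current form: always evaluates the expensive live-datanode count. */
  static boolean eager(int liveReplicas, int minReplication,
      IntSupplier numLiveDataNodes) {
    final int replication =
        Math.min(minReplication, numLiveDataNodes.getAsInt());
    return liveReplicas >= replication;
  }

  /** Proposed form: asks for the count only if the cheap check fails. */
  static boolean shortCircuit(int liveReplicas, int minReplication,
      IntSupplier numLiveDataNodes) {
    if (liveReplicas >= minReplication) {
      return true;
    }
    return liveReplicas >= numLiveDataNodes.getAsInt();
  }

  /** Stand-in for getNumLiveDataNodes() that counts its invocations. */
  static final class CountingSupplier implements IntSupplier {
    private final int value;
    int calls = 0;

    CountingSupplier(int value) {
      this.value = value;
    }

    @Override
    public int getAsInt() {
      calls++;
      return value;
    }
  }

  public static void main(String[] args) {
    final int minReplication = 2; // assumed value, for illustration only
    // Compare both forms over a small grid of inputs.
    for (int live = 0; live <= 4; live++) {
      for (int nodes = 0; nodes <= 4; nodes++) {
        CountingSupplier forEager = new CountingSupplier(nodes);
        CountingSupplier forLazy = new CountingSupplier(nodes);
        boolean e = eager(live, minReplication, forEager);
        boolean s = shortCircuit(live, minReplication, forLazy);
        System.out.printf(
            "live=%d nodes=%d eager=%b(calls=%d) lazy=%b(calls=%d)%n",
            live, nodes, e, forEager.calls, s, forLazy.calls);
      }
    }
  }
}
```

In the common case where the block already has at least {{minReplication}} live replicas, the short-circuit form never touches the supplier, which is where the saving on {{getNumLiveDataNodes()}} would come from.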

> Improve HDFS append performance
> -------------------------------
>
>                 Key: HDFS-14366
>                 URL: https://issues.apache.org/jira/browse/HDFS-14366
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 2.8.2
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>            Priority: Major
>         Attachments: HDFS-14366.000.patch, append-flamegraph.png
>
>
> In our HDFS cluster we observed that the {{append}} operation can take as much 
> as 10X the write lock time of other write operations. By collecting a flamegraph 
> on the NameNode (see attachment: append-flamegraph.png), we found that most of 
> the time in an append call is spent in {{getNumLiveDataNodes()}}:
> {code}
>   /** @return the number of live datanodes. */
>   public int getNumLiveDataNodes() {
>     int numLive = 0;
>     synchronized (this) {
>       for(DatanodeDescriptor dn : datanodeMap.values()) {
>         if (!isDatanodeDead(dn) ) {
>           numLive++;
>         }
>       }
>     }
>     return numLive;
>   }
> {code}
> This method synchronizes on the {{DatanodeManager}}, which is particularly 
> expensive in large clusters since {{datanodeMap}} is modified in many 
> places, such as when processing DN heartbeats.
> For the {{append}} operation, {{getNumLiveDataNodes()}} is invoked in 
> {{isSufficientlyReplicated}}:
> {code}
>   /**
>    * Check if a block is replicated to at least the minimum replication.
>    */
>   public boolean isSufficientlyReplicated(BlockInfo b) {
>     // Compare against the lesser of the minReplication and number of live DNs.
>     final int replication =
>         Math.min(minReplication, getDatanodeManager().getNumLiveDataNodes());
>     return countNodes(b).liveReplicas() >= replication;
>   }
> {code}
> The way {{replication}} is calculated is suboptimal: it calls 
> {{getNumLiveDataNodes()}} _every time_, even though {{minReplication}} is 
> usually much smaller than the number of live datanodes. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
