[ https://issues.apache.org/jira/browse/HDFS-14579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867098#comment-16867098 ]

Íñigo Goiri commented on HDFS-14579:
------------------------------------

Yes, refreshNodes is slightly different:
{code}
"IPC Server handler 124 on 8020" #506 daemon prio=5 os_prio=0 tid=0x000000006f23f000 nid=0xc1c runnable [0x0000001a8fcfd000]
   java.lang.Thread.State: RUNNABLE
        at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
        at java.net.InetAddress.getAllByName(InetAddress.java:1192)
        at java.net.InetAddress.getAllByName(InetAddress.java:1126)
        at java.net.InetAddress.getByName(InetAddress.java:1076)
        at java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
        at org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager.parseEntry(HostFileManager.java:94)
        at org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager.readFile(HostFileManager.java:80)
        at org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager.refresh(HostFileManager.java:157)
        at org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager.refresh(HostFileManager.java:70)
        at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.refreshHostsReader(DatanodeManager.java:1183)
        at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.refreshNodes(DatanodeManager.java:1165)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.refreshNodes(FSNamesystem.java:4554)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.refreshNodes(NameNodeRpcServer.java:1215)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.refreshNodes(ClientNamenodeProtocolServerSideTranslatorPB.java:823)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:514)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1011)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:889)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:835)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2639)
{code}

I think in my case it may make more sense to perform these DNS lookups in parallel.
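
To make the idea concrete, here is a rough sketch of resolving a list of host entries on a small thread pool instead of one at a time. ParallelHostResolver, resolveAll and the thread count are hypothetical names for illustration only; this is not code from HostFileManager or from the attached patch, and it assumes that entries which fail to resolve can simply be skipped.

```java
import java.net.InetAddress;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, NOT part of HostFileManager: resolves host entries
// concurrently so one slow DNS answer does not delay all the others.
class ParallelHostResolver {

  // Returns host -> address for every entry that resolved, in input order.
  static Map<String, InetAddress> resolveAll(List<String> hosts, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<InetAddress>> tasks = new ArrayList<>();
      for (String host : hosts) {
        tasks.add(() -> InetAddress.getByName(host));
      }
      // invokeAll blocks until every task is done and returns the futures
      // in the same order as the submitted tasks.
      List<Future<InetAddress>> futures = pool.invokeAll(tasks);
      Map<String, InetAddress> resolved = new LinkedHashMap<>();
      for (int i = 0; i < hosts.size(); i++) {
        try {
          resolved.put(hosts.get(i), futures.get(i).get());
        } catch (ExecutionException e) {
          // Assumption: a failed lookup (typically UnknownHostException)
          // just drops the entry; real code would at least log it.
        }
      }
      return resolved;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while resolving hosts", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // IP literals are parsed locally, so this demo does not hit DNS.
    List<String> hosts = new ArrayList<>();
    hosts.add("127.0.0.1");
    hosts.add("10.0.0.2");
    System.out.println(resolveAll(hosts, 4));
  }
}
```

With N entries and T threads the wall-clock time is roughly N/T lookups rather than N, at the cost of sending more concurrent queries to the DNS server.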

> In refreshNodes, avoid performing a DNS lookup while holding the write lock
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-14579
>                 URL: https://issues.apache.org/jira/browse/HDFS-14579
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-14579.001.patch
>
>
> When refreshNodes is called on a large cluster, or a cluster where DNS is not 
> performing well, it can cause the namenode to hang for a long time. This is 
> because the refreshNodes operation holds the global write lock while it is 
> running. Most of the refreshNodes code is simple and hence fast, but 
> unfortunately it performs a DNS lookup for each host in the cluster while the 
> lock is held. 
> Right now, it calls:
> {code}
>   public void refreshNodes(final Configuration conf) throws IOException {
>     refreshHostsReader(conf);
>     namesystem.writeLock();
>     try {
>       refreshDatanodes();
>       countSoftwareVersions();
>     } finally {
>       namesystem.writeUnlock();
>     }
>   }
> {code}
> The line refreshHostsReader(conf); reads the new config file and does a DNS 
> lookup on each entry - the write lock is not held here. Then the main work is 
> done here:
> {code}
>   private void refreshDatanodes() {
>     final Map<String, DatanodeDescriptor> copy;
>     synchronized (this) {
>       copy = new HashMap<>(datanodeMap);
>     }
>     for (DatanodeDescriptor node : copy.values()) {
>       // Check if not include.
>       if (!hostConfigManager.isIncluded(node)) {
>         node.setDisallowed(true);
>       } else {
>         long maintenanceExpireTimeInMS =
>             hostConfigManager.getMaintenanceExpirationTimeInMS(node);
>         if (node.maintenanceNotExpired(maintenanceExpireTimeInMS)) {
>           datanodeAdminManager.startMaintenance(
>               node, maintenanceExpireTimeInMS);
>         } else if (hostConfigManager.isExcluded(node)) {
>           datanodeAdminManager.startDecommission(node);
>         } else {
>           datanodeAdminManager.stopMaintenance(node);
>           datanodeAdminManager.stopDecommission(node);
>         }
>       }
>       node.setUpgradeDomain(hostConfigManager.getUpgradeDomain(node));
>     }
>   }
> {code}
> All the isIncluded(), isExcluded() methods call node.getResolvedAddress() 
> which does the DNS lookup. We could probably change things to perform all the 
> DNS lookups outside of the write lock, and then take the lock and process the 
> nodes. Also change or overload isIncluded() etc. to take the InetAddress 
> rather than the datanode descriptor.
> It would not shorten the time the operation takes to run overall, but it 
> would move the long duration out of the write lock and avoid blocking the 
> namenode for the entire time.
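
The approach proposed in the description could look roughly like the sketch below: do every lookup first with no lock held, then take the write lock only for the cheap in-memory pass. All names here (RefreshOutsideLock, the String host keys, the disallowed map) are hypothetical stand-ins, with a plain ReentrantReadWriteLock in place of the namesystem lock; it is not the attached patch and does not use the real DatanodeManager or HostConfigManager APIs.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the refreshNodes flow, NOT the real HDFS classes.
class RefreshOutsideLock {
  // Stand-in for the namesystem write lock.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Set<InetAddress> included = new HashSet<>();
  private final Map<String, Boolean> disallowed = new HashMap<>();

  // Resolves the include list once, up front, like reading the hosts file.
  RefreshOutsideLock(Collection<String> includeHosts) {
    for (String host : includeHosts) {
      InetAddress addr = resolve(host);
      if (addr != null) {
        included.add(addr);
      }
    }
  }

  private static InetAddress resolve(String host) {
    try {
      return InetAddress.getByName(host);
    } catch (UnknownHostException e) {
      return null; // assumption: unresolvable hosts are treated as unknown
    }
  }

  // Overload taking an already-resolved address, so no DNS happens here.
  boolean isIncluded(InetAddress addr) {
    return addr != null && included.contains(addr);
  }

  void refreshNodes(List<String> nodeHosts) {
    // Phase 1: slow DNS lookups, no lock held.
    Map<String, InetAddress> resolved = new HashMap<>();
    for (String host : nodeHosts) {
      resolved.put(host, resolve(host));
    }
    // Phase 2: fast in-memory pass under the write lock.
    lock.writeLock().lock();
    try {
      for (String host : nodeHosts) {
        disallowed.put(host, !isIncluded(resolved.get(host)));
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  boolean isDisallowed(String host) {
    lock.readLock().lock();
    try {
      return disallowed.getOrDefault(host, false);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

As the description notes, the total runtime is unchanged; only the portion spent under the write lock shrinks. One trade-off of this shape is that the resolved addresses can be slightly stale by the time the lock is taken.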


