[ https://issues.apache.org/jira/browse/HDFS-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
JiangHua Zhu reassigned HDFS-16565:
-----------------------------------

    Assignee: JiangHua Zhu

> Optimize DataNode#DataTransfer, when encountering NoRouteToHostException
> ------------------------------------------------------------------------
>
>                 Key: HDFS-16565
>                 URL: https://issues.apache.org/jira/browse/HDFS-16565
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.3.0
>            Reporter: JiangHua Zhu
>            Assignee: JiangHua Zhu
>            Priority: Major
>
> When DataTransfer runs, the local node needs to connect to another DataNode
> through a socket. If the connection cannot be established, a
> NoRouteToHostException can be thrown.
> Exception information:
> 2022-04-29 15:47:47,931 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(xxxx.xxxx.xxxx.xxxx:1004,
> datanodeUuid=xxxx.xxxx.xxxx.xxxx, infoPort=1006 , infoSecurePort=0,
> ipcPort=8025,
> storageInfo=lv=-57;cid=xxxx.xxxx.xxxx.xxxx;nsid=961284063;c=1589290804417):Failed
> to transfer
> BP-1375239094-xxxx.xxxx.xxxx.xxxx-1589290804417:blk_-9223372035798255743_66037710
> to xxxx.xxxx.xxx.xxxx:1004 got
> java.net.NoRouteToHostException: No route to host
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:497)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2562)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>
> Where the exception originates:
>     sock = newSocket();
>     NetUtils.connect(sock, curTarget, dnConf.socketTimeout);
>     sock.setTcpNoDelay(dnConf.getDataTransferServerTcpNoDelay());
>     sock.setSoTimeout(targets.length * dnConf.socketTimeout);
>
> When a NoRouteToHostException occurs, the block is added to the
> VolumeScanner, which then starts scanning it. This should not happen,
> because the exception signals a network connectivity failure rather than
> an I/O error on the block itself.
>     catch (IOException ie) {
>       handleBadBlock(b, ie, false);
>       LOG.warn("{}:Failed to transfer {} to {} got",
>           bpReg, b, targets[0], ie);
>     }
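
The behavior the issue asks for can be illustrated with a small standalone sketch. It is not the Hadoop patch and does not use any Hadoop classes: the class TransferFailureSketch, the Transfer interface, the transfer method, and the two-argument handleBadBlock stub are all hypothetical stand-ins introduced here. The only point it makes is that java.net.NoRouteToHostException is a subclass of IOException, so it can be caught in its own clause ahead of the general one and kept away from the bad-block path.

```java
import java.io.IOException;
import java.net.NoRouteToHostException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the transfer error handling; not Hadoop code. */
final class TransferFailureSketch {

    /** Blocks handed to the stubbed bad-block handler, kept so the effect is observable. */
    static final List<String> reportedBadBlocks = new ArrayList<>();

    /** Hypothetical unit of work that may fail while connecting or transferring. */
    interface Transfer {
        void run() throws IOException;
    }

    /** Stub standing in for the handleBadBlock(...) call quoted in the issue. */
    static void handleBadBlock(String block, IOException cause) {
        reportedBadBlocks.add(block);
    }

    /** Runs the transfer; returns true on success and false on any IOException. */
    static boolean transfer(String block, Transfer work) {
        try {
            work.run();
            return true;
        } catch (NoRouteToHostException e) {
            // The target is unreachable: log it, but do not treat the block as bad.
            System.err.println("Failed to transfer " + block
                + ", target unreachable: " + e.getMessage());
            return false;
        } catch (IOException e) {
            // Any other I/O failure still goes through the bad-block path.
            handleBadBlock(block, e);
            System.err.println("Failed to transfer " + block + ": " + e.getMessage());
            return false;
        }
    }
}
```

The more specific catch clause has to come first; with the order reversed, the compiler rejects the second clause as unreachable. Whether other connection errors should be treated the same way is a separate design question that this sketch leaves open.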