[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brahma Reddy Battula reopened HDFS-11280:
-----------------------------------------

> Allow WebHDFS to reuse HTTP connections to NN
> ---------------------------------------------
>
> Key: HDFS-11280
> URL: https://issues.apache.org/jira/browse/HDFS-11280
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1
> Reporter: Zheng Shao
> Assignee: Zheng Shao
> Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: HDFS-11280.for.2.7.and.below.patch,
> HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch,
> HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch
>
>
> WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode.
> When we use webhdfs as the source in distcp, this used up all ephemeral
> ports on the client side since all closed connections continue to occupy the
> port with TIME_WAIT status for some time.
> According to http://tinyurl.com/java7-http-keepalive, we should call
> conn.getInputStream().close() instead to make sure the connection is kept
> alive. This will get rid of the ephemeral port problem.
> Manual steps used to verify the bug fix:
> 1. Build original hadoop jar.
> 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT |
> grep -c 50070" on the local machine shows a big number (100s).
> 3. Build hadoop jar with this diff.
> 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT |
> grep -c 50070" on the local machine shows 0.
> 5. The explanation: distcp's client side does a lot of directory scanning,
> which would create and close a lot of connections to the namenode HTTP port.
> Reference:
> 2.7 and below:
> https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743
> 2.8 and above:
> https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
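
The change described in the issue (draining and closing the response stream instead of calling conn.disconnect()) can be illustrated with a small standalone sketch. This is not the HDFS-11280 patch and does not touch WebHdfsFileSystem; the class name KeepAliveSketch and the helpers drainAndClose and distinctClientPorts are hypothetical names chosen for illustration. The JDK's built-in com.sun.net.httpserver.HttpServer stands in for the NameNode HTTP port, and the sketch counts how many distinct client-side ports the server sees, which is a rough proxy for how many sockets would later sit in TIME_WAIT.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; not part of Hadoop.
class KeepAliveSketch {

    // Reads the response to EOF and closes the stream. Per the JDK keep-alive
    // notes, this lets HttpURLConnection return the socket to its internal pool.
    static void drainAndClose(HttpURLConnection conn) throws IOException {
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // Discard the body; only the connection handling matters here.
            }
        }
    }

    // Sends `requests` GETs to a throwaway local server and returns how many
    // distinct client ports the server observed. With reuse == false the client
    // also calls disconnect(), mimicking the behaviour reported in the issue.
    static int distinctClientPorts(boolean reuse, int requests) {
        Set<Integer> ports = ConcurrentHashMap.newKeySet();
        try {
            // Port 0 asks the OS for any free port on the loopback interface.
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                ports.add(exchange.getRemoteAddress().getPort());
                byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
                // A fixed content length allows the connection to stay open.
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            try {
                URL url = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/").toURL();
                for (int i = 0; i < requests; i++) {
                    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                    drainAndClose(conn);
                    if (!reuse) {
                        conn.disconnect(); // closes the underlying socket
                    }
                }
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ports.size();
    }

    public static void main(String[] args) {
        System.out.println("stream closed only: " + distinctClientPorts(true, 20) + " client port(s)");
        System.out.println("with disconnect():  " + distinctClientPorts(false, 20) + " client port(s)");
    }
}
```

On a typical JVM with default settings (http.keepAlive enabled), the first line should report a single client port, while the second should report roughly one port per request. Exact counts depend on the JVM and operating system, so treat the numbers as indicative rather than guaranteed.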