[ https://issues.apache.org/jira/browse/HDFS-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951493#comment-13951493 ]
Nathan Roberts commented on HDFS-6166:
--------------------------------------

Maybe our two comments crossed in the mail. Yes, I tested internally. It has been running on a 400-node cluster for 1 day. I ran with bandwidths of 500K, 6MB, and 20MB. With 500K there were timeouts, but no "threads quota exceeded" failures.

> revisit balancer so_timeout
> ----------------------------
>
>                 Key: HDFS-6166
>                 URL: https://issues.apache.org/jira/browse/HDFS-6166
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Nathan Roberts
>            Assignee: Nathan Roberts
>            Priority: Blocker
>         Attachments: HDFS-6166.patch
>
>
> HDFS-5806 changed the socket read timeout for the balancer connection to DN
> to 60 seconds. This works as long as balancer bandwidth is such that it's
> safe to assume that the DN will easily complete the operation within this
> time. Obviously this isn't a good assumption. When this assumption isn't
> valid, the balancer will time out the command BUT it will then be out of sync
> with the datanode (the balancer thinks the DN has room to do more work, while
> the DN is still working on the request and will fail any subsequent requests
> with "threads quota exceeded" errors). This causes expensive NN traffic via
> getBlocks() and also causes lots of WARNs in the balancer log.
> Unfortunately the protocol is such that it's impossible to tell if the DN is
> busy working on replacing the block, OR is in bad shape and will never finish.
> So, in the interest of a small change to deal with both situations, I propose
> the following two changes:
> * Crank up the socket read timeout to 20 minutes
> * Delay looking at a node for a bit if we did time out in this way (the DN
> could still have xceiver threads working on the replace)

--
This message was sent by Atlassian JIRA
(v6.2#6252)
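
For illustration only, the two changes proposed in the issue description could look roughly like the sketch below. This is not the attached HDFS-6166.patch. The class name, the method names, and the 10-second back-off are all hypothetical (the issue only says "for a bit"). The 20-minute value is the one taken from the proposal, and java.net.Socket.setSoTimeout is the standard JDK call for a socket read timeout.

```java
import java.net.Socket;
import java.net.SocketException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the proposal; not the actual balancer code. */
class BalancerTimeoutSketch {
    /** Proposed socket read timeout: 20 minutes instead of 60 seconds. */
    static final int BLOCK_MOVE_READ_TIMEOUT_MS = 20 * 60 * 1000;

    /** Assumed back-off after a timeout; the issue does not give a value. */
    static final long DELAY_AFTER_TIMEOUT_MS = 10_000L;

    /** Datanode id -> earliest time (ms) at which it may be scheduled again. */
    private final Map<String, Long> delayUntil = new HashMap<>();

    /** Apply the longer read timeout to a balancer-to-DN socket. */
    static void configure(Socket sock) throws SocketException {
        sock.setSoTimeout(BLOCK_MOVE_READ_TIMEOUT_MS);
    }

    /** Record that a block move to this datanode hit the read timeout. */
    void onReadTimeout(String datanodeId, long nowMs) {
        // The DN may still have xceiver threads busy with the replace,
        // so stop scheduling work on it until the delay has passed.
        delayUntil.put(datanodeId, nowMs + DELAY_AFTER_TIMEOUT_MS);
    }

    /** True if the datanode is not currently being delayed. */
    boolean isSchedulable(String datanodeId, long nowMs) {
        Long until = delayUntil.get(datanodeId);
        if (until == null) {
            return true;
        }
        if (nowMs >= until) {
            delayUntil.remove(datanodeId); // delay expired, forget it
            return true;
        }
        return false;
    }
}
```

The caller passes in the current time, so the delay logic can be exercised without real sockets or sleeping. A real balancer would presumably key the delay on its own datanode objects rather than on string ids.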