[ https://issues.apache.org/jira/browse/HDFS-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HDFS-16668:
----------------------------------
    Labels: pull-request-available  (was: )

> Clean up MoverExecutor after each iteration to avoid potential thread leak
> --------------------------------------------------------------------------
>
>                 Key: HDFS-16668
>                 URL: https://issues.apache.org/jira/browse/HDFS-16668
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>   Affects Versions: 3.3.3
>            Reporter: Tai Zhou
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: screenshot-1.png, screenshot-2.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hi,
> I have recently been working on an HDFS smart storage management project
> that is based on the Mover in the hadoop-hdfs project. I noticed that most
> of the code in Mover is similar to Balancer. However, Mover doesn't clean up
> MoverExecutor the way Balancer does.
> If we have multiple NameSystems for Namenode Connectors, or a large number
> of datanodes, Mover will leak threads, because it may take numerous
> iterations to process these namespaces. In our project, we modified some
> source code so that we can call mover.run() whenever we find blocks that do
> not match the expected storage policies. As a result, our application
> initializes the Namenode Connector and Mover continually, and we end up with
> thousands of threads or thread pools for MoverExecutor.
> Here is what it looks like. There are 9000+ threads like this in a waiting
> state.
> !screenshot-2.png|width=558,height=209!
> I know that users generally may not use Mover the way we do; they might run
> it from the CLI. But more and more users are planning to adopt RBF or
> multiple NameSystems, or to run large clusters of datanodes. In those cases
> the Mover CLI has to keep thousands of threads alive after the user presses
> the enter key.
> I have submitted a quick fix. If you are interested, please take a look at
> it.
> Thanks.
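The leak pattern described above can be illustrated with a small standalone Java sketch. This is not the Mover or Balancer code and not the submitted patch; the class name `ExecutorCleanupSketch`, the method `liveThreadsAfter`, the pool size of 2, and the empty task standing in for a block move are all hypothetical. It only assumes that each iteration creates its own fixed thread pool, and compares leaving that pool alone with shutting it down once the iteration finishes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

public class ExecutorCleanupSketch {

    /**
     * Runs the given number of iterations, each with its own 2-thread pool,
     * and returns how many worker threads are still alive afterwards.
     */
    static int liveThreadsAfter(int iterations, boolean cleanUp) throws Exception {
        List<Thread> created = new ArrayList<>();
        // Record every worker thread so it can be inspected later.
        // Daemon threads let the JVM exit even when pools are leaked.
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, "sketch-mover-" + created.size());
            t.setDaemon(true);
            created.add(t);
            return t;
        };

        for (int i = 0; i < iterations; i++) {
            // Hypothetical stand-in for the executor created per iteration.
            ExecutorService pool = Executors.newFixedThreadPool(2, factory);
            try {
                List<Future<?>> pending = new ArrayList<>();
                for (int j = 0; j < 2; j++) {
                    pending.add(pool.submit(() -> { /* stand-in for one block move */ }));
                }
                for (Future<?> f : pending) {
                    f.get(); // wait for this iteration's work to finish
                }
            } finally {
                if (cleanUp) {
                    // The cleanup step: without it, idle core threads
                    // stay parked in the pool indefinitely.
                    pool.shutdown();
                    pool.awaitTermination(10, TimeUnit.SECONDS);
                }
            }
        }

        if (cleanUp) {
            // Give terminated workers a moment to fully exit before counting.
            for (Thread t : created) {
                t.join(10_000);
            }
        }
        int alive = 0;
        for (Thread t : created) {
            if (t.isAlive()) {
                alive++;
            }
        }
        return alive;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without cleanup: " + liveThreadsAfter(3, false));
        System.out.println("with cleanup:    " + liveThreadsAfter(3, true));
    }
}
```

In this sketch the idle worker threads of every pool that is not shut down remain alive, so the count grows with the number of iterations, which matches the growth in waiting threads reported above.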
--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org