[ https://issues.apache.org/jira/browse/HDFS-15715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhengchenyu updated HDFS-15715:
-------------------------------
    Attachment:     (was: image-2021-02-25-14-41-49-394.png)

> ReplicatorMonitor performance degrades, when the storagePolicy of many file are not match with their real datanodestorage
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15715
>                 URL: https://issues.apache.org/jira/browse/HDFS-15715
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.7.3, 3.2.1
>            Reporter: zhengchenyu
>            Assignee: zhengchenyu
>            Priority: Major
>             Fix For: 3.3.1
>
>         Attachments: HDFS-15715.001.patch, HDFS-15715.002.patch, HDFS-15715.002.patch.addendum, image-2021-03-26-12-17-45-500.png
>
>
> One of our NameNodes holds 300M files and blocks. Under normal conditions this
> NameNode should not be heavily loaded, but we found that RPC processing time
> stayed high and decommissioning was very slow.
>
> I checked the metrics and found that the number of under-replicated blocks
> stayed high. I then ran jstack against the NameNode and found that
> 'InnerNode.getLoc' was the hot spot. I suspected that chooseTarget could not
> find a proper target for the block, which would cause the performance
> degradation. Considering HDFS-10453, I guessed that some logic was triggering
> a case where chooseTarget cannot find a proper target.
> I then enabled some debug logging. (I modified the code so that only
> isGoodTarget was logged at debug level, because enabling
> BlockPlacementPolicy's debug log is dangerous.) I found that "the rack has too
> many chosen nodes" was being reported.
> I then found log entries like this:
> {code}
> 2020-12-04 12:13:56,345 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 0 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2020-12-04 12:14:03,843 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 0 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], replicationFallbacks=[]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> {code}
> Through further debugging and simulation I found the cause and reproduced the
> problem.
> The cause is that some developers use the COLD storage policy together with
> the mover, but setting the storage policy and running the mover are
> asynchronous operations. As a result, the actual datanode storages of some
> files do not match their storage policy.
> Let me simulate the process. Suppose /tmp/a is created and has 2 replicas on
> DISK, and its storage policy is then set to COLD. When some logic (for
> example, decommission) triggers a copy of this block, chooseTarget uses
> chooseStorageTypes to work out which storage types are still needed. Here the
> variable requiredStorageTypes returned by chooseStorageTypes has size 3, while
> result has size 2: 3 means that 3 ARCHIVE storages are needed, and 2 means
> that the block already has 2 DISK storages. chooseTarget is therefore asked to
> choose 3 targets. Choosing the first target succeeds, but when choosing the
> second target, the variable 'counter' in isGoodTarget is 4, which is larger
> than maxTargetPerRack, which is 3. So every datanode storage is skipped.
> This results in poor performance.
> I think chooseStorageTypes needs to take result into account: when an
> existing replica does not meet the storage policy's requirements, it should
> be removed from result.
> I made this change and tested it in my unit test, and it solved the problem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
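
The replica arithmetic in the description can be illustrated with a small, self-contained Java sketch. This is not the code from the attached patches and it does not call Hadoop's real BlockStoragePolicy or BlockPlacementPolicy classes. The StorageType enum, requiredTypes and matchingExisting are hypothetical names introduced only for illustration, and the rack check in isGoodTarget is not modelled. The sketch assumes that the required storage types are the policy's wanted types minus the types of the existing replicas, and that the proposed fix amounts to counting only the existing replicas that satisfy the policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; not Hadoop's real classes or the attached patch.
class StoragePolicyMismatchSketch {
    enum StorageType { DISK, ARCHIVE }

    // Storage types still needed: the policy's wanted types minus the existing
    // replica types (a multiset difference). A DISK replica cancels nothing
    // when the policy wants only ARCHIVE.
    static List<StorageType> requiredTypes(List<StorageType> wanted,
                                           List<StorageType> existing) {
        List<StorageType> required = new ArrayList<>(wanted);
        for (StorageType t : existing) {
            required.remove(t); // removes one matching entry; no-op if absent
        }
        return required;
    }

    // Existing replicas that satisfy the policy. Each kept replica consumes
    // one wanted slot, so mismatched replicas are left out of the count.
    static List<StorageType> matchingExisting(List<StorageType> wanted,
                                              List<StorageType> existing) {
        List<StorageType> remaining = new ArrayList<>(wanted);
        List<StorageType> kept = new ArrayList<>();
        for (StorageType t : existing) {
            if (remaining.remove(t)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // COLD policy with replication 3, block currently on 2 DISK storages.
        List<StorageType> cold = Arrays.asList(
            StorageType.ARCHIVE, StorageType.ARCHIVE, StorageType.ARCHIVE);
        List<StorageType> existing = Arrays.asList(
            StorageType.DISK, StorageType.DISK);

        List<StorageType> required = requiredTypes(cold, existing);
        int unfiltered = existing.size() + required.size();
        int filtered = matchingExisting(cold, existing).size() + required.size();

        System.out.println("required types: " + required);
        System.out.println("replicas counted without filtering: " + unfiltered);
        System.out.println("replicas counted with filtering: " + filtered);
    }
}
```

With the values from the description, the unfiltered count is 2 + 3 = 5, which is more than the replication factor of 3, while the filtered count is 0 + 3 = 3. The inflated count is what the description says pushes 'counter' past maxTargetPerRack.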