[ https://issues.apache.org/jira/browse/HDFS-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403006#comment-17403006 ]
Max Xie edited comment on HDFS-16182 at 8/24/21, 2:55 AM:
-----------------------------------------------------------
In my cluster, we use BlockPlacementPolicyDefault to choose datanodes, and the number of SSD DNs is much smaller than the number of DISK DNs. As a result, blocks that should be placed on SSD DNs may fall back to DISK DNs when the SSD DNs are too busy or have insufficient space. The steps to reproduce are as follows.
# Create an empty file /foo_file.
# Set its storage policy to All_SSD.
# Put data to /foo_file.
# /foo_file gets 3 DISK DNs for its pipeline because the SSD DNs are too busy at the beginning.
# While data is being transferred through the pipeline, one of the 3 DISK DNs shuts down.
# The client asks the namenode for one new DN for the existing pipeline.
# If SSD DNs are available at that moment, the namenode chooses 3 SSD DNs and returns them to the client. However, the client needs only one new DN; the namenode returns 3 new SSD DNs, and the client fails.
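The failure in the last step comes from a client-side invariant: after a replacement, the pipeline must contain exactly the original nodes plus one. Below is a minimal, self-contained sketch of that check (simplified names, not the actual Hadoop DataStreamer code; the real check is quoted later in this issue):

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class PipelineCheckSketch {
    // Mimics the invariant in DataStreamer#findNewDatanode: exactly one node
    // may be added per replacement, so the returned list must be original + 1.
    static List<String> replaceBadNode(List<String> original, List<String> returnedByNamenode)
            throws IOException {
        if (returnedByNamenode.size() != original.size() + 1) {
            throw new IOException("Failed to replace a bad datanode: expected "
                    + (original.size() + 1) + " nodes, got " + returnedByNamenode.size());
        }
        return returnedByNamenode;
    }

    public static void main(String[] args) throws IOException {
        List<String> original = Arrays.asList("dn01:DISK", "dn02:DISK"); // surviving pipeline
        // Buggy case from the report: the namenode returns the 2 survivors
        // plus 3 fresh SSD nodes (5 total), so the check throws.
        List<String> buggy = Arrays.asList("dn01:DISK", "dn02:DISK",
                "dn03:SSD", "dn04:SSD", "dn05:SSD");
        try {
            replaceBadNode(original, buggy);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
        // Expected case: exactly one new node keeps the pipeline at original + 1.
        List<String> ok = Arrays.asList("dn01:DISK", "dn02:DISK", "dn03:SSD");
        System.out.println("ok=" + replaceBadNode(original, ok));
    }
}
```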
> numOfReplicas is given the wrong value in
> BlockPlacementPolicyDefault$chooseTarget can cause DataStreamer to fail with
> Heterogeneous Storage
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16182
>                 URL: https://issues.apache.org/jira/browse/HDFS-16182
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.4.0
>            Reporter: Max Xie
>            Assignee: Max Xie
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDFS-16182.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In our HDFS cluster, we use heterogeneous storage and store data on SSD for
> better performance. Sometimes, while the HDFS client transfers data in a
> pipeline, it throws an IOException and exits. The exception log is below:
> ```
> java.io.IOException: Failed to replace a bad datanode on the existing
> pipeline due to no more good datanodes being available to try. (Nodes:
> current=[DatanodeInfoWithStorage[dn01_ip:5004,DS-ef7882e0-427d-4c1e-b9ba-a929fac44fb4,DISK],
> DatanodeInfoWithStorage[dn02_ip:5004,DS-3871282a-ad45-4332-866a-f000f9361ecb,DISK],
> DatanodeInfoWithStorage[dn03_ip:5004,DS-a388c067-76a4-4014-a16c-ccc49c8da77b,SSD],
> DatanodeInfoWithStorage[dn04_ip:5004,DS-b81da262-0dd9-4567-a498-c516fab84fe0,SSD],
> DatanodeInfoWithStorage[dn05_ip:5004,DS-34e3af2e-da80-46ac-938c-6a3218a646b9,SSD]],
> original=[DatanodeInfoWithStorage[dn01_ip:5004,DS-ef7882e0-427d-4c1e-b9ba-a929fac44fb4,DISK],
> DatanodeInfoWithStorage[dn02_ip:5004,DS-3871282a-ad45-4332-866a-f000f9361ecb,DISK]]).
> The current failed datanode replacement policy is DEFAULT, and a client may
> configure this via
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its
> configuration.
> ```
> After investigating, I found that when an existing pipeline needs a new
> datanode to keep transferring data, the client gets one additional datanode
> from the namenode and checks that the number of datanodes equals the
> original number + 1.
> ```
> ## DataStreamer$findNewDatanode
> if (nodes.length != original.length + 1) {
>   throw new IOException(
>       "Failed to replace a bad datanode on the existing pipeline "
>           + "due to no more good datanodes being available to try. "
>           + "(Nodes: current=" + Arrays.asList(nodes)
>           + ", original=" + Arrays.asList(original) + "). "
>           + "The current failed datanode replacement policy is "
>           + dfsClient.dtpReplaceDatanodeOnFailure
>           + ", and a client may configure this via '"
>           + BlockWrite.ReplaceDatanodeOnFailure.POLICY_KEY
>           + "' in its configuration.");
> }
> ```
> The root cause is that Namenode$getAdditionalDatanode returns multiple
> datanodes, not one, in DataStreamer.addDatanode2ExistingPipeline.
>
> Maybe we can fix it in BlockPlacementPolicyDefault$chooseTarget. I think
> numOfReplicas should not be assigned from requiredStorageTypes.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
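The proposed direction can be illustrated with a toy model (simplified names, not the actual BlockPlacementPolicyDefault code). Under an ALL_SSD policy with all 3 replicas still on DISK, the required-storage-type count is 3 even though the caller asked for 1 additional node; overwriting numOfReplicas with that count is what makes the namenode hand back 3 nodes:

```java
import java.util.EnumMap;
import java.util.Map;

public class ChooseTargetSketch {
    enum StorageType { DISK, SSD }

    // Simplified stand-in for the chooseTarget sizing logic. `numOfReplicas` is
    // how many additional nodes the caller requested; `requiredStorageTypes`
    // counts replicas whose storage type still does not satisfy the policy.
    static int numberOfNodesToReturn(int numOfReplicas,
                                     Map<StorageType, Integer> requiredStorageTypes,
                                     boolean applyFix) {
        int totalRequired =
                requiredStorageTypes.values().stream().mapToInt(Integer::intValue).sum();
        if (!applyFix) {
            // Buggy behavior described in the report: the requested count is
            // overwritten by the number of still-required storage types.
            numOfReplicas = totalRequired;
        }
        return numOfReplicas;
    }

    public static void main(String[] args) {
        Map<StorageType, Integer> required = new EnumMap<>(StorageType.class);
        required.put(StorageType.SSD, 3); // ALL_SSD policy, all 3 replicas fell back to DISK
        System.out.println(numberOfNodesToReturn(1, required, false)); // buggy path: 3 nodes
        System.out.println(numberOfNodesToReturn(1, required, true));  // fixed path: 1 node
    }
}
```

With the fix, the namenode returns exactly the one datanode the client asked for, so the original + 1 check in DataStreamer passes.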