Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
hadoop-yetus commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-2076543325 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 02s | | No case conflicting files found. | | +0 :ok: | spotbugs | 0m 00s | | spotbugs executables are not available. | | +0 :ok: | codespell | 0m 01s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 01s | | detect-secrets was not available. | | +0 :ok: | xmllint | 0m 01s | | xmllint was not available. | | +1 :green_heart: | @author | 0m 00s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 00s | | The patch appears to include 5 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 97m 05s | | trunk passed | | +1 :green_heart: | compile | 6m 28s | | trunk passed | | +1 :green_heart: | checkstyle | 5m 07s | | trunk passed | | +1 :green_heart: | mvnsite | 7m 08s | | trunk passed | | +1 :green_heart: | javadoc | 6m 36s | | trunk passed | | +1 :green_heart: | shadedclient | 153m 07s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 4m 48s | | the patch passed | | +1 :green_heart: | compile | 3m 42s | | the patch passed | | +1 :green_heart: | javac | 3m 42s | | the patch passed | | +1 :green_heart: | blanks | 0m 00s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 2m 36s | | the patch passed | | +1 :green_heart: | mvnsite | 4m 29s | | the patch passed | | +1 :green_heart: | javadoc | 3m 50s | | the patch passed | | +1 :green_heart: | shadedclient | 167m 08s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | asflicense | 5m 41s | | The patch does not generate ASF License warnings. | | | | 445m 34s | | | | Subsystem | Report/Notes | |--:|:-| | GITHUB PR | https://github.com/apache/hadoop/pull/6559 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint | | uname | MINGW64_NT-10.0-17763 d777404a85ec 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys | | Build tool | maven | | Personality | /c/hadoop/dev-support/bin/hadoop.sh | | git revision | trunk / cada170bb5603c7921ae6eae957dff26deb76c5d | | Default Java | Azul Systems, Inc.-1.8.0_332-b09 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6559/1/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6559/1/console | | versions | git=2.44.0.windows.1 | | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org | This message was automatically generated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-2026904879 Hi @zhangshuyan0 please help me review again this PR when you are free, thanks ~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-1993179784 > Thanks involve me here. I think @zhangshuyan0 should be more professional about this improvement. Let's wait her/his feedback. ok, thanks for your comment. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
Hexiaoqiao commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-1993154964 Thanks involve me here. I think @zhangshuyan0 should be more professional about this improvement. Let's wait her/his feedback. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-1991339659 Hi @Hexiaoqiao Sir would you mind look it again, thanks~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on code in PR #6559: URL: https://github.com/apache/hadoop/pull/6559#discussion_r1515437309 ## hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml: ## @@ -3982,6 +3982,17 @@ + + dfs.datanode.delete.corrupt.replica.from.disk.enable + true Review Comment: Hi @zhangshuyan0 Would you mind look it again, thanks~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on code in PR #6559: URL: https://github.com/apache/hadoop/pull/6559#discussion_r1505502474 ## hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml: ## @@ -3982,6 +3982,17 @@ + + dfs.datanode.delete.corrupt.replica.from.disk.enable + true Review Comment: Thanks @zhangshuyan0 for your comment. From the DataNode point of view, if already confirmed the meta file or data file is lost. it should be deleted directly from the memory and disk and this is expected behavior. For HDFS-16985 mentioned, if the current cluster deployment adopts the AWS EC2 + EBS solution, can adjust `dfs.datanode.delete.corrupt.replica.from.disk.enable` is false as needed to avoid deleting disk data. So I think it might be better from datanode perspective default to set `dfs.datanode.delete.corrupt.replica.from.disk.enable` to true looking forward to your suggestions again. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
zhangshuyan0 commented on code in PR #6559: URL: https://github.com/apache/hadoop/pull/6559#discussion_r1503587650 ## hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml: ## @@ -3982,6 +3982,17 @@ + + dfs.datanode.delete.corrupt.replica.from.disk.enable + true Review Comment: If the default value is true, there is a risk of block missing according to [HDFS-16985](https://issues.apache.org/jira/browse/HDFS-16985). I suggest setting the default value to false, as block missing is a more serious problem than disk file deletion delay. What's your opion? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-1965674193 Hi @ZanderXu @tomscut @zhangshuyan0 @tasanuma please help me review again this PR when you are free, thanks ~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-1962332533 The failed tests seem not to be related. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
hadoop-yetus commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-1962286525 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 22s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +0 :ok: | xmllint | 0m 1s | | xmllint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 5 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 50m 31s | | trunk passed | | +1 :green_heart: | compile | 0m 41s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | compile | 0m 40s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 0m 40s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 47s | | trunk passed | | +1 :green_heart: | javadoc | 0m 45s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javadoc | 1m 2s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 1m 42s | | trunk passed | | +1 :green_heart: | shadedclient | 20m 41s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 37s | | the patch passed | | +1 :green_heart: | compile | 0m 36s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javac | 0m 36s | | the patch passed | | +1 :green_heart: | compile | 0m 34s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | +1 :green_heart: | javac | 0m 34s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 31s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 37s | | the patch passed | | +1 :green_heart: | javadoc | 0m 32s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 | | +1 :green_heart: | javadoc | 0m 59s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 1m 43s | | the patch passed | | +1 :green_heart: | shadedclient | 20m 30s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 204m 3s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6559/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 30s | | The patch does not generate ASF License warnings. | | | | 310m 8s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.datanode.TestLargeBlockReport | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.protocol.TestBlockListAsLongs | | | hadoop.hdfs.tools.TestDFSAdmin | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6559/5/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6559 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint | | uname | Linux e1573661312f 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / cada170bb5603c7921ae6eae957dff26deb76c5d | | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6559/5/testReport/ | | Max. process+thread count | 4349 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output |
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-1959255049 Update PR to support dynamically configured. Hi @ZanderXu @tomscut please help review it again, thanks~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on code in PR #6559: URL: https://github.com/apache/hadoop/pull/6559#discussion_r1498534965 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java: ## @@ -188,6 +188,10 @@ public class DFSConfigKeys extends CommonConfigurationKeys { public static final long DFS_DN_CACHED_DFSUSED_CHECK_INTERVAL_DEFAULT_MS = 60; + public static final String DFS_DATANODE_DELETE_CORRUPT_REPLICA_FROM_DISK_ENABLE = Review Comment: Thanks @tomscut for your review. I wll support dynamically configured later. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
tomscut commented on code in PR #6559: URL: https://github.com/apache/hadoop/pull/6559#discussion_r1497454155 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java: ## @@ -188,6 +188,10 @@ public class DFSConfigKeys extends CommonConfigurationKeys { public static final long DFS_DN_CACHED_DFSUSED_CHECK_INTERVAL_DEFAULT_MS = 60; + public static final String DFS_DATANODE_DELETE_CORRUPT_REPLICA_FROM_DISK_ENABLE = Review Comment: It would be handy if this could be configured dynamically. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on PR #6559: URL: https://github.com/apache/hadoop/pull/6559#issuecomment-1955978167 The result of the CI is here. The failed tests seem not to be related. https://ci-hadoop.apache.org/blue/organizations/jenkins/hadoop-multibranch/detail/PR-6559/3/tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 commented on code in PR #6559: URL: https://github.com/apache/hadoop/pull/6559#discussion_r1494250286 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java: ## @@ -2400,37 +2407,42 @@ public void invalidate(String bpid, ReplicaInfo block) { } /** * Invalidate a block which is not found on disk. We should remove it from - * memory and notify namenode, but unnecessary to delete the actual on-disk - * block file again. + * memory and notify namenode, will decide whether to delete the actual on-disk block and meta + * file based on {@link DFSConfigKeys#DFS_DATANODE_DELETE_CORRUPT_REPLICA_FROM_DISK_ENABLE}. * * @param bpid the block pool ID. * @param block The block to be invalidated. * @param checkFiles Whether to check data and meta files. */ - public void invalidateMissingBlock(String bpid, Block block, boolean checkFiles) { + public void invalidateMissingBlock(String bpid, Block block, boolean checkFiles) + throws IOException { // The replica seems is on its volume map but not on disk. // We can't confirm here is block file lost or disk failed. -// If block lost: -//deleted local block file is completely unnecessary -// If disk failed: -//deleted local block file here may lead to missing-block -//when it with only 1 replication left now. -// So remove if from volume map notify namenode is ok. +// If checkFiles as true will check block or meta file existence again. +// If deleteCorruptReplicaFromDisk as true will delete the actual on-disk block and meta file, +// otherwise will remove it from volume map and notify namenode. try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl, bpid)) { // Check if this block is on the volume map. ReplicaInfo replica = volumeMap.get(bpid, block); // Double-check block or meta file existence when checkFiles as true. if (replica != null && (!checkFiles || (!replica.blockDataExists() || !replica.metadataExists( { -volumeMap.remove(bpid, block); -invalidate(bpid, replica); +if (deleteCorruptReplicaFromDisk) { + ExtendedBlock extendedBlock = new ExtendedBlock(bpid, block); + datanode + .notifyNamenodeDeletedBlock(extendedBlock, replica.getStorageUuid()); + invalidate(bpid, new Block[] {extendedBlock.getLocalBlock()}); +} else { + volumeMap.remove(bpid, block); Review Comment: Modify patch based on comments. Hi @ZanderXu please help review it again, thanks~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
ZanderXu commented on code in PR #6559: URL: https://github.com/apache/hadoop/pull/6559#discussion_r1493952635 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java: ## @@ -2400,37 +2407,42 @@ public void invalidate(String bpid, ReplicaInfo block) { } /** * Invalidate a block which is not found on disk. We should remove it from - * memory and notify namenode, but unnecessary to delete the actual on-disk - * block file again. + * memory and notify namenode, will decide whether to delete the actual on-disk block and meta + * file based on {@link DFSConfigKeys#DFS_DATANODE_DELETE_CORRUPT_REPLICA_FROM_DISK_ENABLE}. * * @param bpid the block pool ID. * @param block The block to be invalidated. * @param checkFiles Whether to check data and meta files. */ - public void invalidateMissingBlock(String bpid, Block block, boolean checkFiles) { + public void invalidateMissingBlock(String bpid, Block block, boolean checkFiles) + throws IOException { // The replica seems is on its volume map but not on disk. // We can't confirm here is block file lost or disk failed. -// If block lost: -//deleted local block file is completely unnecessary -// If disk failed: -//deleted local block file here may lead to missing-block -//when it with only 1 replication left now. -// So remove if from volume map notify namenode is ok. +// If checkFiles as true will check block or meta file existence again. +// If deleteCorruptReplicaFromDisk as true will delete the actual on-disk block and meta file, +// otherwise will remove it from volume map and notify namenode. try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl, bpid)) { // Check if this block is on the volume map. ReplicaInfo replica = volumeMap.get(bpid, block); // Double-check block or meta file existence when checkFiles as true. if (replica != null && (!checkFiles || (!replica.blockDataExists() || !replica.metadataExists( { -volumeMap.remove(bpid, block); -invalidate(bpid, replica); +if (deleteCorruptReplicaFromDisk) { + ExtendedBlock extendedBlock = new ExtendedBlock(bpid, block); + datanode + .notifyNamenodeDeletedBlock(extendedBlock, replica.getStorageUuid()); + invalidate(bpid, new Block[] {extendedBlock.getLocalBlock()}); Review Comment: `invalidate` this block first, then `notifyNameNodeDeletedBlock`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
Re: [PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
ZanderXu commented on code in PR #6559: URL: https://github.com/apache/hadoop/pull/6559#discussion_r1493941213 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java: ## @@ -2400,37 +2407,42 @@ public void invalidate(String bpid, ReplicaInfo block) { } /** * Invalidate a block which is not found on disk. We should remove it from - * memory and notify namenode, but unnecessary to delete the actual on-disk - * block file again. + * memory and notify namenode, will decide whether to delete the actual on-disk block and meta + * file based on {@link DFSConfigKeys#DFS_DATANODE_DELETE_CORRUPT_REPLICA_FROM_DISK_ENABLE}. * * @param bpid the block pool ID. * @param block The block to be invalidated. * @param checkFiles Whether to check data and meta files. */ - public void invalidateMissingBlock(String bpid, Block block, boolean checkFiles) { + public void invalidateMissingBlock(String bpid, Block block, boolean checkFiles) + throws IOException { // The replica seems is on its volume map but not on disk. // We can't confirm here is block file lost or disk failed. -// If block lost: -//deleted local block file is completely unnecessary -// If disk failed: -//deleted local block file here may lead to missing-block -//when it with only 1 replication left now. -// So remove if from volume map notify namenode is ok. +// If checkFiles as true will check block or meta file existence again. +// If deleteCorruptReplicaFromDisk as true will delete the actual on-disk block and meta file, +// otherwise will remove it from volume map and notify namenode. Review Comment: ``` If checkFiles is true, the existence of the block and metafile will be checked again. If deleteCorruptReplicaFromDisk is true, delete the existing block or metafile directly, otherwise just remove them from the memory volumeMap. ``` ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java: ## @@ -2400,37 +2407,42 @@ public void invalidate(String bpid, ReplicaInfo block) { } /** * Invalidate a block which is not found on disk. We should remove it from - * memory and notify namenode, but unnecessary to delete the actual on-disk - * block file again. + * memory and notify namenode, will decide whether to delete the actual on-disk block and meta + * file based on {@link DFSConfigKeys#DFS_DATANODE_DELETE_CORRUPT_REPLICA_FROM_DISK_ENABLE}. * * @param bpid the block pool ID. * @param block The block to be invalidated. * @param checkFiles Whether to check data and meta files. */ - public void invalidateMissingBlock(String bpid, Block block, boolean checkFiles) { + public void invalidateMissingBlock(String bpid, Block block, boolean checkFiles) + throws IOException { // The replica seems is on its volume map but not on disk. // We can't confirm here is block file lost or disk failed. -// If block lost: -//deleted local block file is completely unnecessary -// If disk failed: -//deleted local block file here may lead to missing-block -//when it with only 1 replication left now. -// So remove if from volume map notify namenode is ok. +// If checkFiles as true will check block or meta file existence again. +// If deleteCorruptReplicaFromDisk as true will delete the actual on-disk block and meta file, +// otherwise will remove it from volume map and notify namenode. try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl, bpid)) { // Check if this block is on the volume map. ReplicaInfo replica = volumeMap.get(bpid, block); // Double-check block or meta file existence when checkFiles as true. if (replica != null && (!checkFiles || (!replica.blockDataExists() || !replica.metadataExists( { -volumeMap.remove(bpid, block); -invalidate(bpid, replica); +if (deleteCorruptReplicaFromDisk) { + ExtendedBlock extendedBlock = new ExtendedBlock(bpid, block); + datanode + .notifyNamenodeDeletedBlock(extendedBlock, replica.getStorageUuid()); + invalidate(bpid, new Block[] {extendedBlock.getLocalBlock()}); +} else { + volumeMap.remove(bpid, block); Review Comment: Please add some comments to describe the necessity of the `else` logic. ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java: ## @@ -2400,37 +2407,42 @@ public void invalidate(String bpid, ReplicaInfo block) { } /** * Invalidate a block which is not found on disk. We should remove it from - * memory and notify namenode, but unnecessary to delete the actual on-disk - *
[PR] HDFS-17352. Add configuration to control whether DN delete this replica from disk when client requests a missing block [hadoop]
haiyang1987 opened a new pull request, #6559: URL: https://github.com/apache/hadoop/pull/6559 ### Description of PR https://issues.apache.org/jira/browse/HDFS-17352 As discussed at https://github.com/apache/hadoop/pull/6464#issuecomment-1902959898 we should add configuration to control whether DN delete this replica from disk when client requests a missing block. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org