[ 
https://issues.apache.org/jira/browse/HDFS-16316?focusedWorklogId=729563&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-729563
 ]

ASF GitHub Bot logged work on HDFS-16316:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Feb/22 10:05
            Start Date: 18/Feb/22 10:05
    Worklog Time Spent: 10m 
      Work Description: ferhui commented on a change in pull request #3861:
URL: https://github.com/apache/hadoop/pull/3861#discussion_r809627226



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java
##########
@@ -540,21 +541,30 @@ private void scan() {
           m++;
           continue;
         }
-        // Block file and/or metadata file exists on the disk
-        // Block exists in memory
-        if (info.getBlockFile() == null) {
-          // Block metadata file exits and block file is missing
-          addDifference(diffRecord, statsRecord, info);
-        } else if (info.getGenStamp() != memBlock.getGenerationStamp()
-            || info.getBlockLength() != memBlock.getNumBytes()) {
-          // Block metadata file is missing or has wrong generation stamp,
-          // or block file length is different than expected
+
+        // Block and meta must be regular file
+        boolean isRegular = FileUtil.isRegularFile(info.getBlockFile(), false) &&

Review comment:
       Thanks for your contribution. I have one question here.
   As far as I know, many companies use fast copy, which means that some block 
files are links.
   Does this check affect them?
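
To illustrate the concern: a minimal JDK-only sketch (plain `java.nio.file`, not Hadoop's `FileUtil`; the assumption here is that `FileUtil.isRegularFile(file, false)` checks the file status without following links, per its `allowLinks=false` argument). Symlinked block files would then fail the check, while hard links would still pass, since a hard link is indistinguishable from the original regular file:

```java
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

public class RegularFileCheck {
    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("blk");
        // Hypothetical block file names, for illustration only.
        Path block = Files.createFile(dir.resolve("blk_1001"));
        Path symlink = Files.createSymbolicLink(dir.resolve("blk_1002"), block);
        Path hardlink = Files.createLink(dir.resolve("blk_1003"), block);

        // Following links (the default): the symlink looks like a regular file.
        System.out.println(Files.isRegularFile(symlink)); // true

        // Without following links (what allowLinks=false would correspond to):
        // the symlinked block file is no longer considered regular.
        System.out.println(Files.isRegularFile(symlink, LinkOption.NOFOLLOW_LINKS)); // false

        // A hard link is just another directory entry for the same inode,
        // so it still counts as a regular file either way.
        System.out.println(Files.isRegularFile(hardlink, LinkOption.NOFOLLOW_LINKS)); // true
    }
}
```

So whether fast-copied blocks are affected would depend on whether the copies are symlinks (rejected by the check) or hard links (still accepted).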




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 729563)
    Time Spent: 5h 40m  (was: 5.5h)

> Improve DirectoryScanner: add regular file check related block
> --------------------------------------------------------------
>
>                 Key: HDFS-16316
>                 URL: https://issues.apache.org/jira/browse/HDFS-16316
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.9.2
>            Reporter: JiangHua Zhu
>            Assignee: JiangHua Zhu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>          Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Something unusual happened in our production environment.
> The DataNode is configured with 11 disks (${dfs.datanode.data.dir}). The used 
> capacity is calculated correctly for 10 of the disks, but the value calculated 
> for the remaining disk is much larger, which is very strange.
> This is about the live view on the NameNode:
>  !screenshot-1.png! 
> This is about the live view on the DataNode:
>  !screenshot-2.png! 
> We can look at the view on linux:
>  !screenshot-3.png! 
> There is a big gap here for '/mnt/dfs/11/data'. This situation should not be 
> allowed to happen.
> I found that there are some abnormal block files.
> There are invalid blk_xxxx.meta files in some subdir directories, causing the 
> computed space to be abnormal.
> Here are some abnormal block files:
>  !screenshot-4.png! 
> Such files should not be treated as normal blocks. They should be actively 
> identified and filtered out, which is good for cluster stability.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
