[ 
https://issues.apache.org/jira/browse/HDFS-16316?focusedWorklogId=728710&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-728710
 ]

ASF GitHub Bot logged work on HDFS-16316:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Feb/22 05:55
            Start Date: 17/Feb/22 05:55
    Worklog Time Spent: 10m 
      Work Description: jianghuazhu commented on a change in pull request #3861:
URL: https://github.com/apache/hadoop/pull/3861#discussion_r808697481



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
##########
@@ -2812,6 +2816,9 @@ public void checkAndUpdate(String bpid, ScanInfo scanInfo)
             + memBlockInfo.getNumBytes() + " to "
             + memBlockInfo.getBlockDataLength());
         memBlockInfo.setNumBytes(memBlockInfo.getBlockDataLength());
+      } else if (!isRegular) {
+        corruptBlock = new Block(memBlockInfo);
+        LOG.warn("Block:{} is not a regular file.", corruptBlock.getBlockId());

Review comment:
       Thanks @tomscut for the comment and review.
   This happens occasionally; I have been monitoring it for a long time and still 
have not found the root cause.
   I suspect it may be related to the Linux environment, since no exception 
occurs while the normal data flow is working. (I will continue to monitor 
this situation.)
   These additional canonical checks prevent the situation from getting worse on 
the cluster, which is good for cluster stability.
   
   When the file is actually cleaned up, the specific path is printed. 
Here is an example from an online cluster:
   ```
   2022-02-15 11:24:12,856 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-xxxx blk_xxxx file /mnt/dfs/11/data/current/BP-xxxx.xxxx.xxxx/current/finalized/subdir0/subdir0/blk_xxxx
   ```
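
   The `isRegular` flag in the patch reflects whether the on-disk block file is 
an ordinary regular file. As a rough illustration only (hypothetical class and 
method names, not the actual `FsDatasetImpl` code), such a check can be 
sketched with `java.nio.file`:

   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;

   public class RegularFileCheck {
     /**
      * Returns true only if the path exists and is a regular file,
      * i.e. not a directory, device, or other special file.
      */
     static boolean isRegularBlockFile(Path blockPath) {
       return Files.isRegularFile(blockPath);
     }

     public static void main(String[] args) throws IOException {
       Path regular = Files.createTempFile("blk_", ".data");
       Path dir = Files.createTempDirectory("subdir");
       // A normal block file passes the check; a directory does not.
       System.out.println(isRegularBlockFile(regular));
       System.out.println(isRegularBlockFile(dir));
       Files.delete(regular);
       Files.delete(dir);
     }
   }
   ```

   In the patch itself, a block whose file fails this kind of check is treated 
as corrupt and reported, rather than being counted toward used capacity.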




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 728710)
    Time Spent: 4.5h  (was: 4h 20m)

> Improve DirectoryScanner: add regular file check related block
> --------------------------------------------------------------
>
>                 Key: HDFS-16316
>                 URL: https://issues.apache.org/jira/browse/HDFS-16316
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.9.2
>            Reporter: JiangHua Zhu
>            Assignee: JiangHua Zhu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Something unusual happened in the online environment.
> The DataNode is configured with 11 disks (${dfs.datanode.data.dir}). The used 
> capacity calculated for 10 of the disks is normal, but the value calculated 
> for the remaining disk is much larger, which is very strange.
> This is about the live view on the NameNode:
>  !screenshot-1.png! 
> This is about the live view on the DataNode:
>  !screenshot-2.png! 
> We can look at the view on linux:
>  !screenshot-3.png! 
> There is a big gap here regarding '/mnt/dfs/11/data'. This situation should 
> be prohibited from happening.
> I found that there are some abnormal block files.
> There are wrong blk_xxxx.meta files in some subdir directories, causing 
> abnormal space accounting.
> Here are some abnormal block files:
>  !screenshot-4.png! 
> Such files should not be treated as normal blocks. They should be actively 
> identified and filtered out, which is good for cluster stability.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
