[jira] [Updated] (HDFS-16420) ec + balancer may cause missing block

qinyuren (Jira) Mon, 10 Jan 2022 02:29:05 -0800


     [ 
https://issues.apache.org/jira/browse/HDFS-16420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


qinyuren updated HDFS-16420:
----------------------------
    Description: 
We hit a problem similar to the one described in HDFS-16297. 

In our cluster, we used {color:#de350b}ec(6+3) + balancer on version 
3.1.0{color}, and a {color:#de350b}missing block{color} occurred. 

We got the info for the block (blk_-9223372036824119008) from fsck: it has only 
5 live replicas and multiple redundant replicas. 
{code:java}
blk_-9223372036824119008_220037616 len=133370338 MISSING! Live_repl=5
blk_-9223372036824119007:DatanodeInfoWithStorage,   
blk_-9223372036824119002:DatanodeInfoWithStorage,    
blk_-9223372036824119001:DatanodeInfoWithStorage,  
blk_-9223372036824119000:DatanodeInfoWithStorage, 
blk_-9223372036824119004:DatanodeInfoWithStorage,  
blk_-9223372036824119004:DatanodeInfoWithStorage, 
blk_-9223372036824119004:DatanodeInfoWithStorage, 
blk_-9223372036824119004:DatanodeInfoWithStorage, 
blk_-9223372036824119004:DatanodeInfoWithStorage, 
blk_-9223372036824119004:DatanodeInfoWithStorage {code}
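
As a rough illustration of how to read this listing, the sketch below counts copies per internal block and checks whether the block group can still be reconstructed. It assumes the HDFS striped-block convention that an internal block ID equals the block group ID plus the block's index (0 to 8 for ec(6+3)), and that any 6 distinct internal blocks are enough for reconstruction. It also takes the fsck listing above at face value. The function name {{summarize_block_group}} is hypothetical.

```python
from collections import Counter

DATA_UNITS = 6    # ec(6+3): number of data internal blocks
PARITY_UNITS = 3  # ec(6+3): number of parity internal blocks


def summarize_block_group(group_id, reported_ids):
    """Count copies per internal-block index and check recoverability.

    Assumption: internal block ID = block group ID + index in the group.
    """
    total = DATA_UNITS + PARITY_UNITS
    copies = Counter(block_id - group_id for block_id in reported_ids)
    live = sorted(i for i in copies if 0 <= i < total)
    missing = [i for i in range(total) if i not in copies]
    return {
        "copies": dict(copies),
        "live": live,
        "missing": missing,
        # Reconstruction needs at least DATA_UNITS distinct internal blocks.
        "recoverable": len(live) >= DATA_UNITS,
    }


GROUP_ID = -9223372036824119008
# Internal block IDs as they appear in the fsck output above.
REPORTED = [
    -9223372036824119007,
    -9223372036824119002,
    -9223372036824119001,
    -9223372036824119000,
] + [-9223372036824119004] * 6

summary = summarize_block_group(GROUP_ID, REPORTED)
print(summary)
```

For the listing above, this reports indices 1, 4, 6, 7 and 8 as live (consistent with Live_repl=5), six copies of index 4, and indices 0, 2, 3 and 5 as missing. With only 5 distinct internal blocks left, the group falls below the 6 needed for reconstruction.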
   

We searched the logs of all datanodes and found that the internal blocks of 
blk_-9223372036824119008 were deleted at almost the same time.

 
{code:java}
08:15:58,550 INFO  impl.FsDatasetAsyncDiskService 
(FsDatasetAsyncDiskService.java:run(333)) - Deleted 
BP-1606066499-xxxx-1606188026755 blk_-9223372036824119008_220037616 URI 
file:/data15/hadoop/hdfs/data/current/BP-1606066499-xxxx-1606188026755/current/finalized/subdir19/subdir9/blk_-9223372036824119008

08:16:21,214 INFO  impl.FsDatasetAsyncDiskService 
(FsDatasetAsyncDiskService.java:run(333)) - Deleted 
BP-1606066499-xxxx-1606188026755 blk_-9223372036824119006_220037616 URI 
file:/data4/hadoop/hdfs/data/current/BP-1606066499-xxxx-1606188026755/current/finalized/subdir19/subdir9/blk_-9223372036824119006

08:16:55,737 INFO  impl.FsDatasetAsyncDiskService 
(FsDatasetAsyncDiskService.java:run(333)) - Deleted 
BP-1606066499-xxxx-1606188026755 blk_-9223372036824119005_220037616 URI 
file:/data2/hadoop/hdfs/data/current/BP-1606066499-xxxx-1606188026755/current/finalized/subdir19/subdir9/blk_-9223372036824119005
{code}
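
A tally of such deletions can be produced with a small script along the following lines. This is only a sketch: it assumes each log entry sits on a single line (the wrapping above comes from the mail formatting) and that the message has the "Deleted <block pool> blk_<id>_<genstamp>" shape shown above. The names {{DELETED_RE}} and {{count_deletions}} are hypothetical.

```python
import re
from collections import Counter

# Captures the block ID from "Deleted <block pool> blk_<id>_<genstamp>".
DELETED_RE = re.compile(r"Deleted\s+\S+\s+(blk_-?\d+)_\d+")


def count_deletions(log_lines):
    """Return how many 'Deleted' messages were logged per block ID."""
    counts = Counter()
    for line in log_lines:
        match = DELETED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


PREFIX = ("INFO  impl.FsDatasetAsyncDiskService "
          "(FsDatasetAsyncDiskService.java:run(333)) - Deleted "
          "BP-1606066499-xxxx-1606188026755 ")
POOL_DIR = ("/hadoop/hdfs/data/current/BP-1606066499-xxxx-1606188026755"
            "/current/finalized/subdir19/subdir9/")

# The three log entries quoted above, joined back into single lines.
SAMPLE = [
    "08:15:58,550 " + PREFIX + "blk_-9223372036824119008_220037616 URI "
    "file:/data15" + POOL_DIR + "blk_-9223372036824119008",
    "08:16:21,214 " + PREFIX + "blk_-9223372036824119006_220037616 URI "
    "file:/data4" + POOL_DIR + "blk_-9223372036824119006",
    "08:16:55,737 " + PREFIX + "blk_-9223372036824119005_220037616 URI "
    "file:/data2" + POOL_DIR + "blk_-9223372036824119005",
]

counts = count_deletions(SAMPLE)
for block_id, num in sorted(counts.items()):
    print(block_id, num)
```

In practice the input would be the concatenated datanode logs restricted to the time window of interest, rather than the three sample lines used here.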
 

The number of deletions of each internal block between 08:15 and 08:17 was as follows:
||internal block||delete num||
|blk_-9223372036824119008|1|
|blk_-9223372036824119006|1|
|blk_-9223372036824119005|1|
|blk_-9223372036824119004|50|
|blk_-9223372036824119003|1|
|blk_-9223372036824119000|1|

 

{color:#ff0000}Between 08:15 and 08:17, we restarted 2 datanodes and triggered 
a full block report immediately.{color}

 

There are 2 questions: 
1. Why are there so many replicas of this block?
2. Why were internal blocks that had only one copy deleted?

Possible reasons for the first question: 
1. We set the full block report period of some datanodes to 168 hours.
2. We performed a namenode HA operation.
3. After the namenode HA operation, the state of the storages became 
{color:#ff0000}stale{color}, and it does not change until the next full block 
report.
4. The balancer copied the replica without deleting it from the source node, 
because the source node had stale storage, and the request was put into 
{color:#ff0000}postponedMisreplicatedBlocks{color}.
5. The balancer kept copying the replica, eventually resulting in multiple 
copies of the same replica.
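
The suspected sequence in the steps above can be sketched with a toy model. This is not the actual BlockManager code: {{ToyNameNode}}, its methods and the datanode names are hypothetical, and the only behaviour modelled is the assumption stated in the list, namely that excess-replica cleanup is postponed while any storage holding the replica is stale.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Toy view of one internal block: datanode name -> storage is stale?"""
    replicas: dict = field(default_factory=dict)
    postponed: bool = False

    def process_extra_redundancy(self, expected=1):
        """Delete excess replicas unless some holder's storage is stale."""
        if any(self.replicas.values()):
            self.postponed = True  # decision deferred, nothing is deleted
            return
        self.postponed = False
        # Which copy survives is arbitrary in this toy model.
        for node in list(self.replicas)[expected:]:
            del self.replicas[node]

    def balancer_move(self, target):
        """The balancer copies the replica to `target`, then cleanup runs."""
        self.replicas[target] = False  # freshly written, not stale
        self.process_extra_redundancy()

    def full_block_report(self, node):
        """A full block report clears the stale flag of that node's storage."""
        self.replicas[node] = False


def simulate(moves):
    """Return the replica count before and after the stale flag is cleared."""
    nn = ToyNameNode(replicas={"dn0": True})  # stale after the HA operation
    for i in range(1, moves + 1):
        nn.balancer_move(f"dn{i}")
    before = len(nn.replicas)
    nn.full_block_report("dn0")
    nn.process_extra_redundancy()
    return before, len(nn.replicas)


print(simulate(3))
```

In this model every balancer move adds a copy while the source storage stays stale, and all excess copies are removed in one pass once a full block report clears the flag. That matches the burst of deletions seen right after the full block reports, but it does not explain the second question.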

!image-2022-01-10-17-31-35-910.png|width=642,height=269!

The {color:#ff0000}rescannedMisreplicatedBlocks{color} set contains a large 
number of blocks to remove.

!image-2022-01-10-17-32-56-981.png|width=745,height=124!

As for the second question, we checked the code of 
{color:#de350b}processExtraRedundancyBlock{color} but did not find any problem.

 



> ec + balancer may cause missing block
> -------------------------------------
>
>                 Key: HDFS-16420
>                 URL: https://issues.apache.org/jira/browse/HDFS-16420
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: qinyuren
>            Priority: Major
>         Attachments: image-2022-01-10-17-31-35-910.png, 
> image-2022-01-10-17-32-56-981.png



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

