[ 
https://issues.apache.org/jira/browse/HDFS-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840294#comment-17840294
 ] 

Ayush Saxena commented on HDFS-17477:
-------------------------------------

Hi [~dannytbecker] 

It looks like TestLargeBlockReport#testBlockReportSucceedsWithLargerLengthLimit 
has been failing since this was committed.

ref:

[https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1564/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestLargeBlockReport/testBlockReportSucceedsWithLargerLengthLimit/]

 

It did fail once in the Jenkins result of this PR as well:

[https://github.com/apache/hadoop/pull/6748#issuecomment-2063042088]

 

But I am not sure whether the test ran in the subsequent build.

 

I tried it locally: with this change applied, the test failed with an OOM; 
after I reverted the change, it passed.

Can you take a look?

> IncrementalBlockReport race condition additional edge cases
> -----------------------------------------------------------
>
>                 Key: HDFS-17477
>                 URL: https://issues.apache.org/jira/browse/HDFS-17477
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: auto-failover, ha, namenode
>    Affects Versions: 3.3.5, 3.3.4, 3.3.6
>            Reporter: Danny Becker
>            Assignee: Danny Becker
>            Priority: Major
>              Labels: pull-request-available
>
> HDFS-17453 fixes a race condition between IncrementalBlockReports (IBR) and 
> the Edit Log Tailer which can cause the Standby NameNode (SNN) to incorrectly 
> mark blocks as corrupt when it transitions to Active. There are a few edge 
> cases that HDFS-17453 does not cover.
> For example:
> 1. SNN1 loads the edits for b1gs1 and b1gs2.
> 2. DN1 reports b1gs1 to SNN1, so it gets queued for later processing.
> 3. DN1 reports b1gs2 to SNN1, so it gets added to the blocks map.
> 4. SNN1 transitions to Active (ANN1).
> 5. ANN1 processes the pending DN message queue and marks DN1->b1gs1 as 
> corrupt because it was still in the queue.
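
The five steps in the quoted description can be replayed with a small toy model. This is only a sketch to make the ordering concrete: the class, its fields, and the skipSuperseded flag are all hypothetical names, it does not use any HDFS classes, and it does not claim to reflect how the NameNode decides to queue a report or how the HDFS-17477 patch actually resolves the problem.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the scenario above. None of these names exist in HDFS. */
final class StaleQueuedReportSketch {
  /** One replica report from a datanode: block id plus generation stamp. */
  static final class Report {
    final String dn;
    final String block;
    final long genStamp;

    Report(String dn, String block, long genStamp) {
      this.dn = dn;
      this.block = block;
      this.genStamp = genStamp;
    }

    String key() {
      return dn + "/" + block;
    }
  }

  // Highest generation stamp seen in the edit log, per block.
  private final Map<String, Long> latestGenStamp = new HashMap<>();
  // Generation stamp last applied for each "datanode/block" replica.
  private final Map<String, Long> blocksMap = new HashMap<>();
  // Reports deferred while in standby state.
  private final Queue<Report> pending = new ArrayDeque<>();
  // Replicas marked corrupt, as "datanode/block".
  private final Set<String> corrupt = new TreeSet<>();

  void loadEdit(String block, long genStamp) {
    latestGenStamp.merge(block, genStamp, Long::max);
  }

  void queueReport(Report r) {
    pending.add(r);
  }

  void applyReport(Report r) {
    blocksMap.put(r.key(), r.genStamp);
  }

  /** Drains the pending queue, as in step 5 of the description. */
  void transitionToActive(boolean skipSuperseded) {
    Report r;
    while ((r = pending.poll()) != null) {
      Long applied = blocksMap.get(r.key());
      // Hypothetical mitigation: ignore a queued report if the same
      // datanode already had a newer report applied for this block.
      if (skipSuperseded && applied != null && applied > r.genStamp) {
        continue;
      }
      if (r.genStamp < latestGenStamp.getOrDefault(r.block, 0L)) {
        corrupt.add(r.key());
      } else {
        applyReport(r);
      }
    }
  }

  /** Replays steps 1-5 and returns the replicas marked corrupt. */
  static Set<String> run(boolean skipSuperseded) {
    StaleQueuedReportSketch nn = new StaleQueuedReportSketch();
    nn.loadEdit("b1", 1);                        // step 1: edits for b1gs1
    nn.loadEdit("b1", 2);                        //         and b1gs2
    nn.queueReport(new Report("DN1", "b1", 1));  // step 2: b1gs1 is queued
    nn.applyReport(new Report("DN1", "b1", 2));  // step 3: b1gs2 is applied
    nn.transitionToActive(skipSuperseded);       // steps 4 and 5
    return nn.corrupt;
  }

  public static void main(String[] args) {
    System.out.println("without check: " + run(false));
    System.out.println("with check:    " + run(true));
  }
}
```

In this model, draining the queue without the check marks DN1/b1 corrupt even though DN1 already reported the newer generation stamp, which is the outcome step 5 describes. With the hypothetical check enabled, the stale queued entry is skipped and nothing is marked corrupt.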



