[ 
https://issues.apache.org/jira/browse/HDFS-16557?focusedWorklogId=780179&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780179
 ]

ASF GitHub Bot logged work on HDFS-16557:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Jun/22 02:29
            Start Date: 10/Jun/22 02:29
    Worklog Time Spent: 10m 
      Work Description: tomscut commented on PR #4219:
URL: https://github.com/apache/hadoop/pull/4219#issuecomment-1151852402

   > Thanks @tomscut , after tracing the code, I think we cannot add 
`elis.isInProgress()`.
   > 
   > And I will explain my ideas through questions and answers. **Question one: 
Why was INVALID_TXID considered in the original code?**
   > 
   > * The CheckForGaps method checks whether the streams contain continuous 
txids from fromTxId to toAtLeastTxid.
   > * lastTxId equal to INVALID_TXID means the stream is in progress.
   > * toAtLeastTxid may be an abnormal value, like Long.MAX_VALUE, so the 
CheckForGaps method only needs to cover the latest in-progress segment.
   > 
   > **Question two: What is the difference between INVALID_TXID and 
isInProgress()?**
   > 
   > * Before [SBN READ] was introduced, lastTxId equal to INVALID_TXID meant 
the stream was in progress, and a stream in progress meant its lastTxId was 
INVALID_TXID.
   > * After [SBN READ] was introduced, lastTxId equal to INVALID_TXID still 
means the stream is in progress, but a stream in progress no longer implies 
that its lastTxId is INVALID_TXID, because of getJournaledEdits.
   > * So if we add `elis.isInProgress()` in CheckForGaps, it cannot cover the 
last segment being written, which actually contains the latest edit.
   > 
   > Please correct me if anything is wrong.
   
   Thanks @ZanderXu for your comment. Please refer to the stack.
   
![image](https://user-images.githubusercontent.com/55134131/172977547-16c0bf94-8586-4f41-be8e-ce1e4dd41eae.png)
   
   When we set `dfs.ha.tail-edits.in-progress=true`, the txID can be read by 
getJournaledEdits (so there is actually no gap), but a gap exception is still 
thrown.
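   
   As a rough illustration of the point under discussion, the following 
self-contained Java sketch models a gap check with simplified, hypothetical 
types. `GapCheckSketch`, `Stream`, and `hasGap` are not Hadoop classes or 
methods, and the `INVALID_TXID` value is an assumption. It only shows how 
deciding "open-ended" from `lastTxId == INVALID_TXID` alone can differ from 
also consulting an in-progress flag; it is not the actual Hadoop code.

```java
import java.util.List;

// Simplified model of a gap check over edit log streams (illustrative only).
class GapCheckSketch {
    // Stand-in for HdfsServerConstants.INVALID_TXID; the value is an assumption.
    static final long INVALID_TXID = -12345L;

    // Hypothetical stand-in for EditLogInputStream: only the fields the check needs.
    static final class Stream {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress;

        Stream(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    // Returns true if the streams cannot be shown to cover fromTxId..toAtLeastTxId.
    // When useInProgressFlag is false, only lastTxId == INVALID_TXID marks a
    // stream as open-ended; when true, the inProgress flag is consulted as well.
    static boolean hasGap(List<Stream> streams, long fromTxId,
                          long toAtLeastTxId, boolean useInProgressFlag) {
        long txId = fromTxId;
        for (Stream s : streams) {
            if (txId > toAtLeastTxId) {
                return false; // already covered the requested range
            }
            if (s.firstTxId > txId) {
                break; // the next stream starts too late: real gap
            }
            boolean openEnded = s.lastTxId == INVALID_TXID
                || (useInProgressFlag && s.inProgress);
            if (openEnded) {
                return false; // end of stream unknown, so it may reach toAtLeastTxId
            }
            txId = s.lastTxId + 1;
        }
        return txId <= toAtLeastTxId;
    }

    public static void main(String[] args) {
        // One finalized segment, then an in-progress stream whose lastTxId is
        // already known (as can happen when edits come from getJournaledEdits).
        List<Stream> streams = List.of(
            new Stream(1, 100, false),
            new Stream(101, 150, true));
        long toAtLeastTxId = 200; // hypothetical upper bound beyond the known edits

        System.out.println("INVALID_TXID only:   gap = "
            + hasGap(streams, 1, toAtLeastTxId, false));
        System.out.println("with isInProgress(): gap = "
            + hasGap(streams, 1, toAtLeastTxId, true));
    }
}
```

   In this model the first call reports a gap even though the in-progress 
stream is contiguous with the finalized segment, while the second call does 
not. Whether this matches the real failure depends on the stack shown above.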
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 780179)
    Time Spent: 3h 10m  (was: 3h)

> BootstrapStandby failed because of checking gap for inprogress 
> EditLogInputStream
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-16557
>                 URL: https://issues.apache.org/jira/browse/HDFS-16557
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Tao Li
>            Assignee: Tao Li
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-04-22-17-17-14-577.png, 
> image-2022-04-22-17-17-14-618.png, image-2022-04-22-17-17-23-113.png, 
> image-2022-04-22-17-17-32-487.png
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The lastTxId of an in-progress EditLogInputStream isn't necessarily 
> HdfsServerConstants.INVALID_TXID. We can determine its status directly via 
> EditLogInputStream#isInProgress.
> We introduced [SBN READ] and set 
> {color:#ff0000}{{dfs.ha.tail-edits.in-progress=true}}{color}. Then, during 
> bootstrapStandby, the in-progress EditLogInputStream is misjudged, 
> resulting in a gap check failure, which causes bootstrapStandby to fail.
> hdfs namenode -bootstrapStandby
> !image-2022-04-22-17-17-32-487.png|width=766,height=161!
> !image-2022-04-22-17-17-14-577.png|width=598,height=187!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
