[ 
https://issues.apache.org/jira/browse/HDFS-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14806:
-------------------------------
    Attachment: HDFS-14806.002.patch

> Bootstrap standby may fail if used in-progress tailing
> ------------------------------------------------------
>
>                 Key: HDFS-14806
>                 URL: https://issues.apache.org/jira/browse/HDFS-14806
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.3.0
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>            Priority: Major
>         Attachments: HDFS-14806.001.patch, HDFS-14806.002.patch
>
>
> One issue we went across was that if in-progress tailing is enabled, 
> bootstrap standby could fail.
> When in-progress tailing is enabled, Bootstrap uses the RPC mechanism to get 
> edits. There is a config {{dfs.ha.tail-edits.qjm.rpc.max-txns}} that sets an 
> upper bound on how many txnid can be included in one RPC call. The default is 
> 5000. Meaning bootstraping NN (say NN1) can only pull at most 5000 edits from 
> JN. However, as part of bootstrap, NN1 queries another NN (say NN2) for NN2's 
> current transactionID, NN2 may return a state that is > 5000 txnid from NN1's 
> current image. But NN1 can only see 5000 more txnid from JNs. At this point 
> NN1 goes panic, because txnid retuned by JNs is behind NN2's returned state, 
> bootstrap then fail.
> Essentially, bootstrap standby can fail if both of two following conditions 
> are met:
>  # in-progress tailing is enabled AND
>  # the boostraping NN is too far (>5000 txid)  behind 
> Increasing the value of {{dfs.ha.tail-edits.qjm.rpc.max-txns}} to some super 
> large value allowed bootstrap to continue. But this is hardly the ideal 
> solution.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to