[ https://issues.apache.org/jira/browse/HDFS-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923716#comment-16923716 ]
Erik Krogen commented on HDFS-14806:
------------------------------------

Nice work!! Very elegant :)

> Bootstrap standby may fail if used in-progress tailing
> ------------------------------------------------------
>
>                 Key: HDFS-14806
>                 URL: https://issues.apache.org/jira/browse/HDFS-14806
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.3.0
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>            Priority: Major
>         Attachments: HDFS-14806.001.patch, HDFS-14806.002.patch,
> HDFS-14806.003.patch
>
>
> One issue we came across was that if in-progress tailing is enabled,
> bootstrap standby could fail.
> When in-progress tailing is enabled, bootstrap uses the RPC mechanism to get
> edits. There is a config {{dfs.ha.tail-edits.qjm.rpc.max-txns}} that sets an
> upper bound on how many transactions can be included in one RPC call. The
> default is 5000, meaning the bootstrapping NN (say NN1) can pull at most 5000
> edits from the JNs. However, as part of bootstrap, NN1 queries another NN
> (say NN2) for NN2's current transaction ID, and NN2 may return a state that is
> more than 5000 txids ahead of NN1's current image. But NN1 can only see 5000
> more txids from the JNs. At this point NN1 panics, because the txid returned
> by the JNs is behind NN2's returned state, and bootstrap then fails.
> Essentially, bootstrap standby can fail if both of the following conditions
> are met:
> # in-progress tailing is enabled AND
> # the bootstrapping NN is too far (>5000 txids) behind
> Increasing the value of {{dfs.ha.tail-edits.qjm.rpc.max-txns}} to some very
> large value allowed bootstrap to continue. But this is hardly the ideal
> solution.
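
The failure condition described in the issue can be sketched as a small model. This is an illustration only, not the actual Hadoop bootstrap code: the class name BootstrapGapSketch, both method names, and all parameters are hypothetical. It assumes that, with in-progress tailing enabled, the bootstrapping NN sees at most rpcMaxTxns transactions past its current image, and that the cap does not apply when in-progress tailing is disabled.

```java
// Illustrative model of the failure condition; NOT the real Hadoop code.
// All names here are hypothetical.
final class BootstrapGapSketch {

    /** Default of dfs.ha.tail-edits.qjm.rpc.max-txns as stated in the issue. */
    static final long DEFAULT_RPC_MAX_TXNS = 5000L;

    /**
     * Highest txid the bootstrapping NN (NN1) can see from the JNs.
     * Assumption: with in-progress tailing, edits arrive over RPC and are
     * capped at rpcMaxTxns past NN1's image; otherwise no cap applies.
     */
    static long highestVisibleTxId(long imageTxId, long journalTxId,
                                   boolean inProgressTailing, long rpcMaxTxns) {
        if (!inProgressTailing) {
            return journalTxId;
        }
        return Math.min(journalTxId, imageTxId + rpcMaxTxns);
    }

    /**
     * Bootstrap can proceed only if the edits visible to NN1 reach the
     * transaction ID reported by the other NN (NN2).
     */
    static boolean bootstrapCanProceed(long imageTxId, long remoteNnTxId,
                                       long journalTxId,
                                       boolean inProgressTailing,
                                       long rpcMaxTxns) {
        long visible = highestVisibleTxId(
                imageTxId, journalTxId, inProgressTailing, rpcMaxTxns);
        return visible >= remoteNnTxId;
    }

    public static void main(String[] args) {
        // NN1's image is at txid 1000; NN2 and the JNs are at txid 7000.
        // The gap is 6000, which exceeds the default cap of 5000.
        System.out.println(bootstrapCanProceed(
                1000L, 7000L, 7000L, true, DEFAULT_RPC_MAX_TXNS));
        // Raising the cap, the workaround mentioned in the issue, closes the gap.
        System.out.println(bootstrapCanProceed(
                1000L, 7000L, 7000L, true, 1_000_000L));
    }
}
```

In this model, raising the cap only moves the threshold: a standby that is further behind than the new value would hit the same condition again, which matches the issue's remark that a larger setting is hardly the ideal solution.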