[ https://issues.apache.org/jira/browse/HDFS-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Junping Du updated HDFS-9119:
-----------------------------
    Target Version/s:   (was: 2.8.0)

> Discrepancy between edit log tailing interval and RPC timeout for transitionToActive
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-9119
>                 URL: https://issues.apache.org/jira/browse/HDFS-9119
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.7.1
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-9119.00.patch
>
>
> {{EditLogTailer}} on the standby NameNode (SbNN) tails edits from the active
> NameNode (ANN) every 2 minutes, but the {{transitionToActive}} RPC call has a
> timeout of 1 minute.
> If the active NameNode is under a very intensive metadata workload (in
> particular, many {{AddOp}} and {{MkDir}} operations creating new files and
> directories), the standby NameNode may be unable to replay the edits
> accumulated over the 2-minute tailing interval within the 1-minute timeout
> window. If that happens, the FailoverController will time out and give up
> trying to transition the standby to active. The old ANN will then resume
> adding more edits. When the SbNN finally finishes catching up on the edits
> and tries to become active, it will crash.
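
The timing argument in the description can be checked with a little arithmetic. The following is only a rough sketch, not code from the attached patch: the class {{FailoverTimingCheck}}, its method names, and the edit and replay rates are hypothetical, and it assumes edits accumulate and are replayed at constant rates. Only the 2-minute tailing interval and the 1-minute RPC timeout come from the report.

```java
import java.time.Duration;

/** Hypothetical back-of-the-envelope model; not part of HDFS. */
public class FailoverTimingCheck {

    /**
     * Worst-case time for the standby to replay the edits that piled up
     * during one full tailing interval, assuming constant rates.
     */
    static Duration worstCaseCatchUp(Duration tailInterval,
                                     double editsPerSec,
                                     double replayPerSec) {
        if (replayPerSec <= 0) {
            throw new IllegalArgumentException("replayPerSec must be positive");
        }
        // Edits accumulated since the standby last tailed the log.
        double backlog = tailInterval.toMillis() / 1000.0 * editsPerSec;
        // Time needed to apply that backlog, rounded up to whole milliseconds.
        long millis = (long) Math.ceil(backlog / replayPerSec * 1000.0);
        return Duration.ofMillis(millis);
    }

    /** True if the catch-up finishes before the RPC timeout expires. */
    static boolean fitsInTimeout(Duration catchUp, Duration rpcTimeout) {
        return catchUp.compareTo(rpcTimeout) <= 0;
    }

    public static void main(String[] args) {
        Duration tailInterval = Duration.ofMinutes(2); // from the report
        Duration rpcTimeout = Duration.ofMinutes(1);   // from the report
        double editsPerSec = 30_000;  // assumed rate of new edits on the ANN
        double replayPerSec = 40_000; // assumed replay rate on the SbNN

        Duration catchUp = worstCaseCatchUp(tailInterval, editsPerSec, replayPerSec);
        System.out.println("Worst-case catch-up: " + catchUp.getSeconds() + " s");
        System.out.println("Fits in RPC timeout: " + fitsInTimeout(catchUp, rpcTimeout));
    }
}
```

Under this simplified model, the catch-up fits in the timeout only while the edit rate stays below half the replay rate, because the tailing interval is twice as long as the RPC timeout.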