[ https://issues.apache.org/jira/browse/HDFS-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781318#comment-16781318 ]
Ekanth Sethuramalingam commented on HDFS-14317:
-----------------------------------------------

{quote}That's not true, they both accept time units. You can do something like "10ms" and it will parse it properly. This is only in Hadoop 3+
{quote}
Thanks [~xkrogen] for pointing this out. However, as I looked deeper in the code, the value is converted to seconds which will lose the precision.
{code:java}
logRollPeriodMs = conf.getTimeDuration(
    DFSConfigKeys.DFS_HA_LOGROLL_PERIOD_KEY,
    DFSConfigKeys.DFS_HA_LOGROLL_PERIOD_DEFAULT, TimeUnit.SECONDS) * 1000;
{code}
and
{code:java}
sleepTimeMs = conf.getTimeDuration(
    DFSConfigKeys.DFS_HA_TAILEDITS_PERIOD_KEY,
    DFSConfigKeys.DFS_HA_TAILEDITS_PERIOD_DEFAULT, TimeUnit.SECONDS) * 1000;
{code}
I guess I'll leave it as it is. I'll upload a new patch in a bit.

> Standby does not trigger edit log rolling when in-progress edit log tailing is enabled
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-14317
>                 URL: https://issues.apache.org/jira/browse/HDFS-14317
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0
>            Reporter: Ekanth Sethuramalingam
>            Assignee: Ekanth Sethuramalingam
>            Priority: Critical
>         Attachments: HDFS-14317.001.patch
>
>
> The standby uses the following method to check if it is time to trigger edit
> log rolling on active.
> {code}
>   /**
>    * @return true if the configured log roll period has elapsed.
>    */
>   private boolean tooLongSinceLastLoad() {
>     return logRollPeriodMs >= 0 &&
>       (monotonicNow() - lastLoadTimeMs) > logRollPeriodMs;
>   }
> {code}
> In doTailEdits(), lastLoadTimeMs is updated when standby is able to
> successfully tail any edits
> {code}
>   if (editsLoaded > 0) {
>     lastLoadTimeMs = monotonicNow();
>   }
> {code}
> The default configuration for {{dfs.ha.log-roll.period}} is 120 seconds and
> {{dfs.ha.tail-edits.period}} is 60 seconds.
> With in-progress edit log tailing enabled, tooLongSinceLastLoad() will
> almost never return true, because lastLoadTimeMs is refreshed on every
> tailing pass that loads edits. As a result, edit logs are not rolled for a
> long time, until {{dfs.namenode.edit.log.autoroll.multiplier.threshold}}
> takes effect.
> [In our deployment, this resulted in in-progress edit logs getting deleted.
> The sequence of events is that standby was able to checkpoint twice while the
> in-progress edit log was growing on active. When the
> NNStorageRetentionManager decided to clean up old checkpoints and edit logs,
> it cleaned up the in-progress edit log from active and QJM (as the txnid on
> the in-progress edit log was older than the 2 most recent checkpoints),
> resulting in irrecoverably losing a few minutes' worth of metadata].
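
For illustration only, here is a minimal plain-Java sketch of the two effects discussed in this thread. It does not use Hadoop classes: TailEditsSketch, viaSecondsToMs and countRollTriggers are hypothetical names, the clock is simulated, and the sketch assumes getTimeDuration truncates to whole seconds the same way TimeUnit.convert does, which is what the comment above describes.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch class; not part of Hadoop.
class TailEditsSketch {

  // Mirrors the pattern quoted in the comment: the duration is converted to
  // whole seconds first and only then scaled back to milliseconds.
  static long viaSecondsToMs(long value, TimeUnit unit) {
    return TimeUnit.SECONDS.convert(value, unit) * 1000;
  }

  // Same predicate as tooLongSinceLastLoad() in the issue description, with
  // the clock and the fields passed in as parameters.
  static boolean tooLongSinceLastLoad(long nowMs, long lastLoadTimeMs,
      long logRollPeriodMs) {
    return logRollPeriodMs >= 0 && (nowMs - lastLoadTimeMs) > logRollPeriodMs;
  }

  // Simulated tailer loop: wait one tail period, check whether a roll should
  // be triggered, then tail. When editsLoadedEveryPass is true, every pass
  // refreshes lastLoadTimeMs (assumed behaviour with in-progress tailing
  // while the active keeps writing). When false, no edits are ever loaded,
  // which is purely illustrative.
  static int countRollTriggers(long tailPeriodMs, long logRollPeriodMs,
      int passes, boolean editsLoadedEveryPass) {
    long nowMs = 0;
    long lastLoadTimeMs = 0;
    int triggers = 0;
    for (int i = 0; i < passes; i++) {
      nowMs += tailPeriodMs;
      if (tooLongSinceLastLoad(nowMs, lastLoadTimeMs, logRollPeriodMs)) {
        triggers++;
      }
      if (editsLoadedEveryPass) {
        lastLoadTimeMs = nowMs;
      }
    }
    return triggers;
  }

  public static void main(String[] args) {
    // Sub-second values are truncated by the seconds round trip.
    System.out.println(viaSecondsToMs(10, TimeUnit.MILLISECONDS));   // 0
    System.out.println(viaSecondsToMs(1500, TimeUnit.MILLISECONDS)); // 1000
    // Defaults from the description: tail every 60 s, roll period 120 s.
    System.out.println(countRollTriggers(60_000, 120_000, 10, true));  // 0
    System.out.println(countRollTriggers(60_000, 120_000, 10, false)); // 8
  }
}
```

With the default periods, the elapsed time seen by the check is always one tail period (60 s), which never exceeds the 120 s roll period, so the simulated standby never asks for a roll as long as each pass loads edits.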