[ https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845512#comment-17845512 ]
ASF GitHub Bot commented on HDFS-13603:
---------------------------------------

CloudZY commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1597249343


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##########
@@ -601,18 +604,22 @@ public void run() {
           NameNode.LOG.error("Cannot warm up EDEKs.", e);
           throw e;
         }
-        try {
-          Thread.sleep(retryInterval);
-        } catch (InterruptedException ie) {
-          NameNode.LOG.info("EDEKCacheLoader interrupted during retry.");
-          break;
+
+        if (!success) {
+          try {
+            Thread.sleep(retryInterval);
+          } catch (InterruptedException ie) {
+            NameNode.LOG.info("EDEKCacheLoader interrupted during retry.");
+            break;
+          }
+          retryCount++;
         }
-        sinceLastLog += retryInterval;

Review Comment:
   With this line removed, sinceLastLog is never updated after its initial assignment. Shall we add the update back, or remove its usages completely as Simba said?


> Warmup NameNode EDEK thread retries continuously if there's an invalid key
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-13603
>                 URL: https://issues.apache.org/jira/browse/HDFS-13603
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: encryption, namenode
>    Affects Versions: 2.8.0
>            Reporter: Antony Jay
>            Priority: Major
>              Labels: pull-request-available
>
> https://issues.apache.org/jira/browse/HDFS-9405 adds a background thread to
> pre-warm the EDEK cache.
> However, this fails and retries continuously if key retrieval fails for one
> encryption zone. In our use case, we have temporarily removed keys for certain
> encryption zones. Currently the NameNode and KMS logs fill up with errors
> from the background thread retrying the warmup forever.
> The pre-warm thread should
> * Continue to refresh other encryption zones even if it fails for one
> * Retry only if it fails for all encryption zones, which will be the
> case when the KMS is down.
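
For illustration only, the two requirements listed in the issue could be sketched roughly as follows. This is not the code from PR #6774. The class EdekWarmupSketch, the method warmUp, the warmUpOneKey callback and the maxRetries bound are all hypothetical names chosen for this example. The sketch assumes that a pass counts as successful when at least one key could be warmed up.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch; none of these names come from the Hadoop code base. */
final class EdekWarmupSketch {

  private EdekWarmupSketch() {
  }

  /**
   * Tries every key once per pass and keeps going when a single key fails.
   * The whole pass is retried only when every key failed, which is what one
   * would expect to see while the key provider is unreachable.
   *
   * @param keyNames        keys to warm up (stand-ins for encryption zone keys)
   * @param warmUpOneKey    assumed callback, returns true if the key was warmed up
   * @param maxRetries      assumed upper bound on the number of passes
   * @param retryIntervalMs pause between passes, in milliseconds
   * @return the number of passes that were executed
   */
  static int warmUp(List<String> keyNames, Predicate<String> warmUpOneKey,
      int maxRetries, long retryIntervalMs) {
    int passes = 0;
    int retryCount = 0;
    boolean success = false;
    while (!success && retryCount < maxRetries) {
      passes++;
      int failed = 0;
      for (String key : keyNames) {
        // A failure for one key must not stop the warm-up of the others.
        if (!warmUpOneKey.test(key)) {
          failed++;
        }
      }
      // Only "every key failed" is treated as a reason to retry.
      success = keyNames.isEmpty() || failed < keyNames.size();
      if (!success) {
        try {
          Thread.sleep(retryIntervalMs);
        } catch (InterruptedException ie) {
          // Restore the interrupt flag and stop retrying.
          Thread.currentThread().interrupt();
          break;
        }
        retryCount++;
      }
    }
    return passes;
  }
}
```

With one invalid key among several valid ones, this sketch finishes after a single pass. When every key fails, it stops after maxRetries passes instead of looping forever.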