[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-06-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851721#comment-17851721
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

simbadzina commented on PR #6860:
URL: https://github.com/apache/hadoop/pull/6860#issuecomment-2145635848

   I've merged https://github.com/apache/hadoop/pull/6860. Could you add an 
empty commit on this PR so that the tests run against it merged with the 
latest trunk?




> Warmup NameNode EDEK thread retries continuously if there's an invalid key 
> ---
>
> Key: HDFS-13603
> URL: https://issues.apache.org/jira/browse/HDFS-13603
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: encryption, namenode
> Affects Versions: 2.8.0
> Reporter: Antony Jay
> Priority: Major
> Labels: pull-request-available
>
> https://issues.apache.org/jira/browse/HDFS-9405 adds a background thread to 
> pre-warm the EDEK cache. 
> However, this thread fails and retries continuously if key retrieval fails 
> for one encryption zone. In our use case, we have temporarily removed keys 
> for certain encryption zones. Currently the NameNode and KMS logs fill up 
> with errors from the background thread retrying warmup forever.
> The pre-warm thread should
>  * Continue to refresh other encryption zones even if it fails for one
>  * Retry only if it fails for all encryption zones, which will be the 
> case when the KMS is down.
>  
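
A minimal Java sketch of the policy requested above, assuming a hypothetical `KeyWarmer` callback in place of the real KMS call. None of these class or method names come from Hadoop; they only illustrate "keep going after one failure, retry only when everything failed".

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names come from Hadoop. */
class EdekWarmupPolicySketch {

  /** Stand-in for whatever call fills the EDEK cache for one key. */
  interface KeyWarmer {
    void warm(String keyName) throws IOException;
  }

  /**
   * Tries every key once and returns the names that failed.
   * A failure for one key never stops the loop.
   */
  static List<String> warmAll(List<String> keyNames, KeyWarmer warmer) {
    List<String> failed = new ArrayList<>();
    for (String keyName : keyNames) {
      try {
        warmer.warm(keyName);
      } catch (IOException e) {
        failed.add(keyName);
      }
    }
    return failed;
  }

  /** Retry only when every key failed, for example because the KMS is down. */
  static boolean shouldRetry(List<String> keyNames, List<String> failed) {
    return !keyNames.isEmpty() && failed.size() == keyNames.size();
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("zoneA-key", "removed-key", "zoneB-key");
    // Simulate one encryption zone whose key was removed.
    KeyWarmer warmer = name -> {
      if (name.equals("removed-key")) {
        throw new IOException("key not found: " + name);
      }
    };
    List<String> failed = warmAll(keys, warmer);
    System.out.println("failed=" + failed + " retry=" + shouldRetry(keys, failed));
  }
}
```

With one removed key among three, the other two zones are still warmed and no retry is scheduled; a retry is only signalled when the failed list covers every key.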



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-06-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851710#comment-17851710
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

simbadzina merged PR #6860:
URL: https://github.com/apache/hadoop/pull/6860







[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-31 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851242#comment-17851242
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

hadoop-yetus commented on PR #6860:
URL: https://github.com/apache/hadoop/pull/6860#issuecomment-2143229394

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 47s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  36m 36s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 34s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  18m 54s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 28s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 26s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 57s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 30s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   1m 41s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 41s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 55s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  17m 55s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 25s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 49s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 25s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m 32s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   3m 51s |  |  hadoop-kms in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  4s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 265m 43s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6860/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6860 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 300504947fed 5.15.0-107-generic #117-Ubuntu SMP Fri Apr 26 12:26:49 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f4d522279f51cd932e2c80c8ed7b1efc389dd8d0 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6860/1/testReport/ |
   | Max. process+thread count | 3134 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms U: hadoop-common-project |
   | Console output | https://ci-h

[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-31 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851218#comment-17851218
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 opened a new pull request, #6860:
URL: https://github.com/apache/hadoop/pull/6860

   … queues for keys.
   
   
   JIRA = [HDFS-13603](https://issues.apache.org/jira/browse/HDFS-13603)
   ### Description of PR
   Continue warming up the remaining keys when cache warmup fails for a key, 
and throw an IOException at the end if warmup failed for any key.
   
   These are the hadoop-common changes from PR 
https://github.com/apache/hadoop/pull/6774
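
A rough Java sketch of the behaviour this description states: try every key, remember which ones failed, and throw a single `IOException` at the end. The `QueueLoader` interface and the class name are hypothetical stand-ins, not the actual `ValueQueue` internals.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; this is not the real ValueQueue class. */
class PartialWarmupSketch {

  /** Stand-in for loading the queue of one key, which may fail. */
  interface QueueLoader {
    void load(String keyName) throws Exception;
  }

  /**
   * Attempts to initialize the queue of every key. Failures are
   * collected, and one IOException is thrown after the loop if any
   * key could not be initialized.
   */
  static void initializeQueuesForKeys(QueueLoader loader, String... keyNames)
      throws IOException {
    List<String> failedKeys = new ArrayList<>();
    Exception lastException = null;

    for (String keyName : keyNames) {
      try {
        loader.load(keyName);
      } catch (Exception e) {
        // Remember the failure but keep warming up the other keys.
        failedKeys.add(keyName);
        lastException = e;
      }
    }

    if (!failedKeys.isEmpty()) {
      throw new IOException(
          "Failed to initialize queues for keys: " + failedKeys, lastException);
    }
  }
}
```

Catching `Exception` broadly keeps the sketch short; real code would more likely catch only the loader's declared exception type.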
   
   
   ### How was this patch tested?
   A new unit test is added:
   ```
   mvn test -Dtest=TestValueQueue
   [INFO] Running org.apache.hadoop.crypto.key.TestValueQueue
   [INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
13.077 s - in org.apache.hadoop.crypto.key.TestValueQueue
   ```
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   







[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-31 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851206#comment-17851206
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

simbadzina commented on PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#issuecomment-2142956485

   Could you please carve out the changes in hadoop-common into a separate 
PR? I suspect these changes may be what is causing the CI tests to hang and fail.
   
   Besides the CI tests, keeping the changes contained within a module is 
cleaner for maintainability and modularity.







[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850502#comment-17850502
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

simbadzina commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1619472954


##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestValueQueue.java:
##
@@ -111,6 +119,40 @@ public void testWarmUp() throws Exception {
 vq.shutdown();
   }
 
+  /**

Review Comment:
   [ERROR] 
/Users/sdzinama/dev/hadooptree/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestValueQueue.java:122:
 First sentence should end with a period. [JavadocStyle]
   








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850503#comment-17850503
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

simbadzina commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1619473416


##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestValueQueue.java:
##
@@ -111,6 +119,40 @@ public void testWarmUp() throws Exception {
 vq.shutdown();
   }
 
+  /**
+   * Verifies that Queue is initialized (Warmed-up) for partial keys
+   */
+  @Test(timeout = 3)
+  public void testPartialWarmUp() throws Exception {
+MockFiller filler = new MockFiller();
+ValueQueue vq =
+new ValueQueue<>(10, 0.5f, 3, 1,
+SyncGenerationPolicy.ALL, filler);
+
+@SuppressWarnings("unchecked")
+LoadingCache> kq =
+(LoadingCache>)
+FieldUtils.getField(ValueQueue.class, "keyQueues", true).get(vq);
+
+LoadingCache> kqSpy = 
spy(kq);

Review Comment:
   [ERROR] 
/Users/sdzinama/dev/hadooptree/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestValueQueue.java:137:
 Line is longer than 100 characters (found 110). [LineLength]








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850487#comment-17850487
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

simbadzina commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1597113868


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/ValueQueue.java:
##
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry,
    * Initializes the Value Queues for the provided keys by calling the
    * fill Method with "numInitValues" values
    * @param keyNames Array of key Names
-   * @throws ExecutionException executionException.
+   * @throws IOException if no successful initialization for any key
    */
-  public void initializeQueuesForKeys(String... keyNames)
-      throws ExecutionException {
+  public void initializeQueuesForKeys(String... keyNames) throws IOException {
+    int successfulInitializations = 0;
+    ExecutionException lastException = null;
+
     for (String keyName : keyNames) {
-      keyQueues.get(keyName);
+      try {
+        keyQueues.get(keyName);
+        successfulInitializations++;
+      } catch (ExecutionException e) {
+        lastException = e;
+      }
+    }
+
+    if (keyNames.length > 0 && successfulInitializations == 0) {
+      throw new IOException("Failed to initialize any queue for the provided keys.", lastException);

Review Comment:
   It seems you've made warmup a best-effort operation. If so, there should be 
no need to throw an exception here. Just logging a warning should be enough.








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849480#comment-17849480
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614786474


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/ValueQueue.java:
##
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry,
    * Initializes the Value Queues for the provided keys by calling the
    * fill Method with "numInitValues" values
    * @param keyNames Array of key Names
-   * @throws ExecutionException executionException.
+   * @throws IOException if no successful initialization for any key

Review Comment:
   Makes sense. This keeps the contract of throwing an exception if any key 
isn't initialized.
   Updated the Javadoc.








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849479#comment-17849479
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614787242


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/ValueQueue.java:
##
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry,
    * Initializes the Value Queues for the provided keys by calling the
    * fill Method with "numInitValues" values
    * @param keyNames Array of key Names
-   * @throws ExecutionException executionException.
+   * @throws IOException if no successful initialization for any key
    */
-  public void initializeQueuesForKeys(String... keyNames)
-      throws ExecutionException {
+  public void initializeQueuesForKeys(String... keyNames) throws IOException {
+    int successfulInitializations = 0;
+    ExecutionException lastException = null;
+
     for (String keyName : keyNames) {
-      keyQueues.get(keyName);
+      try {
+        keyQueues.get(keyName);
+        successfulInitializations++;
+      } catch (ExecutionException e) {
+        lastException = e;
+      }
+    }
+
+    if (keyNames.length > 0 && successfulInitializations == 0) {
+      throw new IOException("Failed to initialize any queue for the provided keys.", lastException);

Review Comment:
   Updated to throw an IOException if any key isn't initialized.
   
   The exception is used to indicate that warmup should be retried. 
   Since this function and its callers return void, I would rather keep the 
IOException. Otherwise, it's hard to update the `success` flag 
in 
[.../hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java](https://github.com/apache/hadoop/pull/6774/files/10d763a5ff514541ef1eea11d70bf5173374a5d1#diff-092663652ffe33b10e51bfa062724d70e6334a9a14e64db35c9854805e09da14)
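
To illustrate the point about the `success` flag, here is a small Java sketch of a caller that treats an `IOException` from a void warmup call as the signal to retry. The `Warmup` interface, the `maxRetries` bound, and the method name are assumptions made for this example; this is not the code in `FSDirEncryptionZoneOp`.

```java
import java.io.IOException;

/** Hypothetical sketch of a retry loop driven by an IOException. */
class WarmupRetrySketch {

  /** Stand-in for a void warmup call that reports failure by throwing. */
  interface Warmup {
    void run() throws IOException;
  }

  /**
   * Runs the warmup until it succeeds or maxRetries attempts are used.
   * Returns the final value of the success flag.
   */
  static boolean runWithRetries(Warmup warmup, int maxRetries, long retryIntervalMs)
      throws InterruptedException {
    boolean success = false;
    int attempts = 0;
    while (!success && attempts < maxRetries) {
      attempts++;
      try {
        warmup.run();
        // No exception means the warmup call completed.
        success = true;
      } catch (IOException e) {
        // The exception is the only failure signal a void method can give.
        if (attempts < maxRetries) {
          Thread.sleep(retryIntervalMs);
        }
      }
    }
    return success;
  }
}
```

If the warmup method swallowed the failure and only logged a warning, this loop would have no way to tell a failed attempt from a successful one without changing the return type.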








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849478#comment-17849478
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614787242


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/ValueQueue.java:
##
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry,
    * Initializes the Value Queues for the provided keys by calling the
    * fill Method with "numInitValues" values
    * @param keyNames Array of key Names
-   * @throws ExecutionException executionException.
+   * @throws IOException if no successful initialization for any key
    */
-  public void initializeQueuesForKeys(String... keyNames)
-      throws ExecutionException {
+  public void initializeQueuesForKeys(String... keyNames) throws IOException {
+    int successfulInitializations = 0;
+    ExecutionException lastException = null;
+
     for (String keyName : keyNames) {
-      keyQueues.get(keyName);
+      try {
+        keyQueues.get(keyName);
+        successfulInitializations++;
+      } catch (ExecutionException e) {
+        lastException = e;
+      }
+    }
+
+    if (keyNames.length > 0 && successfulInitializations == 0) {
+      throw new IOException("Failed to initialize any queue for the provided keys.", lastException);

Review Comment:
   The exception is used to indicate that warmup should be retried. 
   Since this function and its callers return void, I would rather keep the 
IOException. Otherwise, it's hard to update the `success` flag 
in 
[.../hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java](https://github.com/apache/hadoop/pull/6774/files/10d763a5ff514541ef1eea11d70bf5173374a5d1#diff-092663652ffe33b10e51bfa062724d70e6334a9a14e64db35c9854805e09da14)








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849477#comment-17849477
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614786474


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/ValueQueue.java:
##
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry,
    * Initializes the Value Queues for the provided keys by calling the
    * fill Method with "numInitValues" values
    * @param keyNames Array of key Names
-   * @throws ExecutionException executionException.
+   * @throws IOException if no successful initialization for any key

Review Comment:
   Ack. This keeps the contract of throwing an exception if any key isn't 
initialized.
   Updated the Javadoc.








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849474#comment-17849474
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614787242


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/ValueQueue.java:
##
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry,
    * Initializes the Value Queues for the provided keys by calling the
    * fill Method with "numInitValues" values
    * @param keyNames Array of key Names
-   * @throws ExecutionException executionException.
+   * @throws IOException if no successful initialization for any key
    */
-  public void initializeQueuesForKeys(String... keyNames)
-      throws ExecutionException {
+  public void initializeQueuesForKeys(String... keyNames) throws IOException {
+    int successfulInitializations = 0;
+    ExecutionException lastException = null;
+
     for (String keyName : keyNames) {
-      keyQueues.get(keyName);
+      try {
+        keyQueues.get(keyName);
+        successfulInitializations++;
+      } catch (ExecutionException e) {
+        lastException = e;
+      }
+    }
+
+    if (keyNames.length > 0 && successfulInitializations == 0) {
+      throw new IOException("Failed to initialize any queue for the provided keys.", lastException);

Review Comment:
   This exception indicates that initialization failed for every key and that the 
warm-up should be retried. 
   Since this function and its callers return void, I'm keeping 
the IOException. Otherwise, it's hard to update the `success` flag in 
[.../hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java](https://github.com/apache/hadoop/pull/6774/files/10d763a5ff514541ef1eea11d70bf5173374a5d1#diff-092663652ffe33b10e51bfa062724d70e6334a9a14e64db35c9854805e09da14)
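
The contract described in this comment can be sketched in isolation. This is only an illustration under stated assumptions, not the Hadoop code: `BestEffortInit`, `KeyLoader`, and `initializeAll` are hypothetical names standing in for `ValueQueue.initializeQueuesForKeys` and its per-key cache load, and the sketch returns a success count (the method in the PR returns void) so the behavior is easy to observe.

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class BestEffortInit {

  /** Hypothetical stand-in for the per-key cache load (keyQueues.get in the PR). */
  public interface KeyLoader {
    void load(String keyName) throws ExecutionException;
  }

  /**
   * Tries every key and keeps going after a failure. Throws only when
   * at least one key was given and none could be initialized.
   *
   * @return the number of keys initialized successfully (illustration only)
   */
  public static int initializeAll(KeyLoader loader, String... keyNames)
      throws IOException {
    int successes = 0;
    ExecutionException last = null;
    for (String keyName : keyNames) {
      try {
        loader.load(keyName);
        successes++;
      } catch (ExecutionException e) {
        last = e; // remember the most recent failure, continue with the next key
      }
    }
    if (keyNames.length > 0 && successes == 0) {
      throw new IOException(
          "Failed to initialize any queue for the provided keys.", last);
    }
    return successes;
  }

  public static void main(String[] args) {
    // Assumed failure mode: keys whose name starts with "bad" cannot be loaded.
    KeyLoader loader = name -> {
      if (name.startsWith("bad")) {
        throw new ExecutionException(new IOException("missing key " + name));
      }
    };
    try {
      System.out.println("initialized: "
          + initializeAll(loader, "good1", "bad1", "good2"));
      initializeAll(loader, "bad1", "bad2"); // every key fails, so this throws
    } catch (IOException e) {
      System.out.println("caller should retry: " + e.getMessage());
    }
  }
}
```

A partial failure is therefore invisible to the caller, and only a total failure (for example, the KMS being down) reaches the retry loop as an `IOException`.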



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##
@@ -601,18 +604,22 @@ public void run() {
           NameNode.LOG.error("Cannot warm up EDEKs.", e);
           throw e;
         }
-        try {
-          Thread.sleep(retryInterval);
-        } catch (InterruptedException ie) {
-          NameNode.LOG.info("EDEKCacheLoader interrupted during retry.");
-          break;
+
+        if (!success) {
+          try {
+            Thread.sleep(retryInterval);
+          } catch (InterruptedException ie) {
+            NameNode.LOG.info("EDEKCacheLoader interrupted during retry.");
+            break;
+          }
+          retryCount++;
         }
-        sinceLastLog += retryInterval;

Review Comment:
   removed. 








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849473#comment-17849473
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614786685


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirEncryptionZoneOp.java:
##
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdfs.server.namenode;
+
+import java.io.IOException;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.crypto.key.KeyProviderCryptoExtension;
+import org.apache.hadoop.hdfs.server.common.HdfsServerConstants.NamenodeRole;
+
+import org.junit.Test;
+
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.doThrow;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.times;
+import static org.mockito.Mockito.verify;
+
+public class TestFSDirEncryptionZoneOp {
+
+  @Test
+  public void testWarmUpEdekCacheRetries() throws IOException {
+    NameNode.initMetrics(new Configuration(), NamenodeRole.NAMENODE);
+
+    final int initialDelay = 100;
+    final int retryInterval = 100;
+    final int maxRetries = 2;
+
+    KeyProviderCryptoExtension kpMock = mock(KeyProviderCryptoExtension.class);
+
+    doThrow(new IOException())
+        .doThrow(new IOException())
+        .doAnswer(invocation -> null)
+        .when(kpMock).warmUpEncryptedKeys(any());
+
+    FSDirEncryptionZoneOp.EDEKCacheLoader loader =
+        new FSDirEncryptionZoneOp.EDEKCacheLoader(new String[] {"edek1", "edek2"}, kpMock,
+            initialDelay, retryInterval, maxRetries);
+
+    loader.run();
+
+    verify(kpMock, times(maxRetries)).warmUpEncryptedKeys(any());
+  }
+}

Review Comment:
   It's tricky to test this here, so I added a unit test in 
[TestValueQueue.java](https://github.com/apache/hadoop/pull/6774/commits/93e9cb1d8e2a06f27eef7f48ffcd1914a1b0b409#diff-91a27146ffcf47f09845b86f96aaa41123d46f5174dd5915e5e82b3466c3bc0f)








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849472#comment-17849472
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614786507


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##
@@ -580,15 +583,15 @@ public void run() {
       final int logCoolDown = 1; // periodically print error log (if any)
       int sinceLastLog = logCoolDown; // always print the first failure
       boolean success = false;
+      int retryCount = 0;
       IOException lastSeenIOE = null;
       long warmUpEDEKStartTime = monotonicNow();
-      while (true) {
+
+      while (!success && retryCount < maxRetries) {
         try {
           kp.warmUpEncryptedKeys(keyNames);
-          NameNode.LOG
-              .info("Successfully warmed up {} EDEKs.", keyNames.length);
+          NameNode.LOG.info("Successfully warmed up {} EDEKs.", keyNames.length);
           success = true;
-          break;
         } catch (IOException ioe) {
           lastSeenIOE = ioe;
           if (sinceLastLog >= logCoolDown) {

Review Comment:
   Good catch, removed them since the logs are bounded now.








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849470#comment-17849470
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614786495


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##
@@ -537,12 +537,12 @@ static boolean isInAnEZ(final FSDirectory fsd, final INodesInPath iip)
    * then launch up a separate thread to warm them up.
    */
   static void warmUpEdekCache(final ExecutorService executor,
-      final FSDirectory fsd, final int delay, final int interval) {
+      final FSDirectory fsd, final int delay, final int interval, final int maxRetries) {

Review Comment:
   Added the javadoc.








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849469#comment-17849469
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614786474


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/ValueQueue.java:
##
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry,
    * Initializes the Value Queues for the provided keys by calling the
    * fill Method with "numInitValues" values
    * @param keyNames Array of key Names
-   * @throws ExecutionException executionException.
+   * @throws IOException if no successful initialization for any key

Review Comment:
   updated. 








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849471#comment-17849471
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614786507


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##
@@ -580,15 +583,15 @@ public void run() {
       final int logCoolDown = 1; // periodically print error log (if any)
       int sinceLastLog = logCoolDown; // always print the first failure
       boolean success = false;
+      int retryCount = 0;
       IOException lastSeenIOE = null;
       long warmUpEDEKStartTime = monotonicNow();
-      while (true) {
+
+      while (!success && retryCount < maxRetries) {
         try {
           kp.warmUpEncryptedKeys(keyNames);
-          NameNode.LOG
-              .info("Successfully warmed up {} EDEKs.", keyNames.length);
+          NameNode.LOG.info("Successfully warmed up {} EDEKs.", keyNames.length);
           success = true;
-          break;
         } catch (IOException ioe) {
           lastSeenIOE = ioe;
           if (sinceLastLog >= logCoolDown) {

Review Comment:
   Good catch, removed them.








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845512#comment-17845512
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

CloudZY commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1597249343


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##
@@ -601,18 +604,22 @@ public void run() {
           NameNode.LOG.error("Cannot warm up EDEKs.", e);
           throw e;
         }
-        try {
-          Thread.sleep(retryInterval);
-        } catch (InterruptedException ie) {
-          NameNode.LOG.info("EDEKCacheLoader interrupted during retry.");
-          break;
+
+        if (!success) {
+          try {
+            Thread.sleep(retryInterval);
+          } catch (InterruptedException ie) {
+            NameNode.LOG.info("EDEKCacheLoader interrupted during retry.");
+            break;
+          }
+          retryCount++;
         }
-        sinceLastLog += retryInterval;

Review Comment:
   With this line removed, `sinceLastLog` is never updated after its initial 
assignment. Shall we add the update back, or remove its usages completely as Simba said?








[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845461#comment-17845461
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

simbadzina commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1597092684


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/ValueQueue.java:
##
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry,
    * Initializes the Value Queues for the provided keys by calling the
    * fill Method with "numInitValues" values
    * @param keyNames Array of key Names
-   * @throws ExecutionException executionException.
+   * @throws IOException if no successful initialization for any key

Review Comment:
   The wording here is confusing. One way to read this is that if any key fails to 
initialize, then an exception will be thrown. But IIUC an exception will be thrown 
only if all keys fail to initialize.



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##
@@ -537,12 +537,12 @@ static boolean isInAnEZ(final FSDirectory fsd, final INodesInPath iip)
    * then launch up a separate thread to warm them up.
    */
   static void warmUpEdekCache(final ExecutorService executor,
-      final FSDirectory fsd, final int delay, final int interval) {
+      final FSDirectory fsd, final int delay, final int interval, final int maxRetries) {

Review Comment:
   Can you add a comment to the function documentation to indicate that the 
warm-up is best effort?



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##
@@ -580,15 +583,15 @@ public void run() {
       final int logCoolDown = 1; // periodically print error log (if any)
       int sinceLastLog = logCoolDown; // always print the first failure
       boolean success = false;
+      int retryCount = 0;
       IOException lastSeenIOE = null;
       long warmUpEDEKStartTime = monotonicNow();
-      while (true) {
+
+      while (!success && retryCount < maxRetries) {
         try {
           kp.warmUpEncryptedKeys(keyNames);
-          NameNode.LOG
-              .info("Successfully warmed up {} EDEKs.", keyNames.length);
+          NameNode.LOG.info("Successfully warmed up {} EDEKs.", keyNames.length);
           success = true;
-          break;
         } catch (IOException ioe) {
           lastSeenIOE = ioe;
           if (sinceLastLog >= logCoolDown) {

Review Comment:
   `sinceLastLog` is no longer really used now. You can just print the failure 
since the retry count is limited.
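
The suggestion above (log every failure, because the loop is bounded) can be sketched as a standalone retry loop. This is a minimal illustration, not the code in `FSDirEncryptionZoneOp`: `BoundedRetry`, `WarmUp`, and `warmUpWithRetries` are hypothetical names, `System.err` stands in for `NameNode.LOG`, and restoring the interrupt flag is an addition of this sketch.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedRetry {

  /** Hypothetical stand-in for kp.warmUpEncryptedKeys(keyNames). */
  public interface WarmUp {
    void run() throws IOException;
  }

  /**
   * Runs the task until it succeeds or maxRetries attempts have failed.
   *
   * @return true if some attempt succeeded, false if the loop gave up
   */
  public static boolean warmUpWithRetries(WarmUp task, int maxRetries,
      long retryIntervalMs) {
    boolean success = false;
    int retryCount = 0;
    while (!success && retryCount < maxRetries) {
      try {
        task.run();
        success = true;
      } catch (IOException ioe) {
        // At most maxRetries failures can occur, so each one is logged
        // directly and no cool-down counter is needed.
        System.err.println("Failed to warm up (attempt " + (retryCount + 1)
            + "): " + ioe.getMessage());
      }
      if (!success) {
        try {
          Thread.sleep(retryIntervalMs);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt(); // keep the interrupt visible
          break;
        }
        retryCount++;
      }
    }
    return success;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Assumed scenario: the first two attempts fail, the third succeeds.
    WarmUp flaky = () -> {
      if (calls.incrementAndGet() < 3) {
        throw new IOException("KMS unavailable");
      }
    };
    boolean ok = warmUpWithRetries(flaky, 5, 10L);
    System.out.println("success=" + ok + " attempts=" + calls.get());
  }
}
```

Because the number of log lines is capped by `maxRetries`, the `sinceLastLog` bookkeeping has nothing left to protect against.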



##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirEncryptionZoneOp.java:
##
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdfs.server.namenode;
+
+import java.io.IOException;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.crypto.key.KeyProviderCryptoExtension;
+import org.apache.hadoop.hdfs.server.common.HdfsServerConstants.NamenodeRole;
+
+import org.junit.Test;
+
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.doThrow;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.times;
+import static org.mockito.Mockito.verify;
+
+public class TestFSDirEncryptionZoneOp {
+
+  @Test
+  public void testWarmUpEdekCacheRetries() throws IOException {
+    NameNode.initMetrics(new Configuration(), NamenodeRole.NAMENODE);
+
+    final int initialDelay = 100;
+    final int retryInterval = 100;
+    final int maxRetries = 2;
+
+    KeyProviderCryptoExtension kpMock = mock(KeyProviderCryptoExtension.class);
+
+    doThrow(new IOException())
+        .doThrow(new IOException())
+        .doAnswer(invocation -> null)
+        .when(kpMock).warmUpEncryptedKeys(any());
+
+    FSDirEncryptionZoneOp.EDEKCacheLoader loader =
+        new FSDirEncryptionZoneOp.EDEKCacheLoader(new String[] {"edek1", "edek2"}, kpMoc

[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844782#comment-17844782
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

hadoop-yetus commented on PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#issuecomment-2101381146

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 02s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 00s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m 01s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 18s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  92m 35s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  41m 14s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   6m 25s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   4m 36s | [/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/3/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt) |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |  15m 58s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 179m 02s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  12m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  37m 59s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  37m 59s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   6m 14s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   4m 50s | [/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/3/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |  16m 21s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 188m 11s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 54s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 568m 35s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6774 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | MINGW64_NT-10.0-17763 144c7ee9888d 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 10d763a5ff514541ef1eea11d70bf5173374a5d1 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/3/testReport/ |
   | modules | C: hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms hadoop-hdfs-project/hadoop-hdfs U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/3/console |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841777#comment-17841777
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

hadoop-yetus commented on PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#issuecomment-2081665080

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 02s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m 01s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   6m 23s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  | 130m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  61m 01s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   9m 30s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   6m 49s | [/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/2/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt) |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |  23m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 256m 54s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  18m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  56m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  56m 38s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 01s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   9m 10s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   6m 55s | [/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/2/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |  22m 55s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 263m 03s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  asflicense  |   8m 50s | [/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/2/artifact/out/results-asflicense.txt) |  The patch generated 1 ASF License warnings.  |
   |  |   | 815m 43s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6774 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | MINGW64_NT-10.0-17763 5f2bdf72e508 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / f0e0386cb4e59ba263e6254215945b971a7d1bd0 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/2/testReport/ |
   | modules | C: hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms hadoop-hdfs-project/hadoop-hdfs U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/2/console |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-04-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841436#comment-17841436
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

hadoop-yetus commented on PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#issuecomment-2080426655

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 01s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m 01s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 16s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  91m 46s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  40m 45s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   6m 01s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   4m 26s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/1/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |  15m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 172m 44s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 18s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  12m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  38m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  38m 15s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   6m 12s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   4m 35s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/1/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |  15m 13s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 183m 04s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  asflicense  |   5m 52s | 
[/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/1/artifact/out/results-asflicense.txt)
 |  The patch generated 1 ASF License warnings.  |
   |  |   | 556m 36s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6774 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | MINGW64_NT-10.0-17763 26e5efede4c6 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / f0e0386cb4e59ba263e6254215945b971a7d1bd0 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/1/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common 
hadoop-common-project/hadoop-kms hadoop-hdfs-project/hadoop-hdfs U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6774/1/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-04-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841432#comment-17841432
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

hadoop-yetus commented on PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#issuecomment-2080423727

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 52s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  37m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m  7s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  18m 11s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   4m 43s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 57s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 20s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   6m 53s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  41m 33s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 28s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  18m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 37s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 53s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m  1s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   7m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  41m 41s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m  2s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 50s |  |  hadoop-kms in the patch passed. 
 |
   | -1 :x: |  unit  | 263m 51s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6774/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  asflicense  |   1m  8s | 
[/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6774/1/artifact/out/results-asflicense.txt)
 |  The patch generated 1 ASF License warnings.  |
   |  |   | 549m 30s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestRollingUpgrade |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6774/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6774 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 50f4cb272f31 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f0e0386cb4e59ba263e6254215945b971a7d1bd0 |
   | Default Java | Private Bui

[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2024-04-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841384#comment-17841384
 ] 

ASF GitHub Bot commented on HDFS-13603:
---

yzhang559 opened a new pull request, #6774:
URL: https://github.com/apache/hadoop/pull/6774

   
   
   
   ### Description of PR
   JIRA = HDFS-13603
   The EDEK cache warm-up thread should not abort the warm-up of the other keys 
if it encounters an invalid key. 
   We have observed infinite retries to KMS when one of the encryption keys is 
not available. 
   
   Change it to
   - Continue warming up the other keys when one key fails, and throw an 
IOException only if the cache warm-up fails for all keys. 
   - Retry only if the warm-up fails for all keys, and add a config for the 
retry limit. 
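
A minimal sketch of the behavior described in the two bullets above, not the code from this PR. The class `EdekWarmupSketch`, the `KeyWarmer` interface, and the `maxRetries` and `retryIntervalMs` parameters are hypothetical names chosen for illustration; the real change lives in the NameNode encryption-zone code and the KMS client cache.

```java
import java.io.IOException;
import java.util.List;

final class EdekWarmupSketch {

  /** Hypothetical stand-in for the call that fills the EDEK cache for one key. */
  interface KeyWarmer {
    void warmUp(String keyName) throws IOException;
  }

  /**
   * Makes one pass over all keys. A failure for one key does not stop the
   * pass; an IOException is thrown only if no key could be warmed up.
   * Returns the number of keys that were warmed up successfully.
   */
  static int warmUpOnce(List<String> keyNames, KeyWarmer warmer)
      throws IOException {
    int succeeded = 0;
    IOException lastFailure = null;
    for (String keyName : keyNames) {
      try {
        warmer.warmUp(keyName);
        succeeded++;
      } catch (IOException e) {
        lastFailure = e; // remember the failure, keep going with other keys
      }
    }
    if (succeeded == 0 && lastFailure != null) {
      throw new IOException(
          "EDEK warm-up failed for all " + keyNames.size() + " keys",
          lastFailure);
    }
    return succeeded;
  }

  /**
   * Retries only when a whole pass fails (for example, KMS is down), and
   * gives up after maxRetries additional passes. Returns true if any pass
   * warmed up at least one key (or there was nothing to warm up).
   */
  static boolean warmUpWithRetries(List<String> keyNames, KeyWarmer warmer,
      int maxRetries, long retryIntervalMs) throws InterruptedException {
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        warmUpOnce(keyNames, warmer);
        return true;
      } catch (IOException e) {
        if (attempt < maxRetries) {
          Thread.sleep(retryIntervalMs); // back off before the next pass
        }
      }
    }
    return false;
  }
}
```

With one valid and one invalid key, `warmUpWithRetries` returns after a single pass; with only invalid keys it makes `maxRetries + 1` passes and then stops instead of retrying forever.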
   
   
   ### How was this patch tested?
   Added unit test TestFSDirEncryptionZoneOp for retry behavior
   
   Related unit tests 
   ```
   mvn test 
-Dtest=TestEncryptionZones,TestEncryptionZonesWithKMS,TestFSDirEncryptionZoneOp
   
   [INFO] Running org.apache.hadoop.hdfs.TestEncryptionZones
   [INFO] Tests run: 44, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
137.217 s - in org.apache.hadoop.hdfs.TestEncryptionZones
   [INFO] Running org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
   [INFO] Tests run: 47, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
187.815 s - in org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
   [INFO] Running 
org.apache.hadoop.hdfs.server.namenode.TestFSDirEncryptionZoneOp
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.331 
s - in org.apache.hadoop.hdfs.server.namenode.TestFSDirEncryptionZoneOp
   
   mvn test -Dtest=TestValueQueue
   [INFO] Running org.apache.hadoop.crypto.key.TestValueQueue
   [INFO] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
11.893 s - in org.apache.hadoop.crypto.key.TestValueQueue
   
   ```
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation? NA
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? NA
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files? NA
   
   










[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2019-08-07 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902248#comment-16902248
 ] 

Wei-Chiu Chuang commented on HDFS-13603:


We see this in our internal tests as well. I might spend some time to dig into 
this when I get a chance.







[jira] [Commented] (HDFS-13603) Warmup NameNode EDEK thread retries continuously if there's an invalid key

2018-05-23 Thread Antony Jay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487460#comment-16487460
 ] 

Antony Jay commented on HDFS-13603:
---

Exception stack trace

 

[2018-05-23 14:20:44,952] WARN [http-16000-11] (server.KMS) -hdfs- User hdfs/node0032.hostname.com@domain (auth:KERBEROS) request GET https://node0038.hostname.com:16000/kms/v1/key/encrypt_keyname/_eek?num_keys=150&eek_op=generate caused exception.
java.io.IOException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.NullPointerException: No KeyVersion exists for key 'encrypt_keyname'
 at org.apache.hadoop.crypto.key.kms.server.KMS.generateEncryptedKeys(KMS.java:517)
 at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
 at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
 at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
 at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
 at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
 at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
 at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
 at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
 at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
 at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
 at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
 at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
 at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
 at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
 at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.hadoop.crypto.key.kms.server.KMSMDCFilter.doFilter(KMSMDCFilter.java:84)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:631)
 at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:301)
 at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:579)
 at org.apache.hadoop.crypto.key.kms.server.KMSAuthenticationFilter.doFilter(KMSAuthenticationFilter.java:130)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.NullPointerException: No KeyVersion exists for key 'encrypt_keyname'
 at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:289)
 at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276)
 at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111)
 at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:132)
 at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2381)
 at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2351)
 at com.google.common.cache.LocalCache$Segment.lockedGe