[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2024-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868160#comment-17868160
 ] 

ASF GitHub Bot commented on YARN-11178:
---

dotzborro commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-2245967534

   This bug seems to have been solved by 
https://github.com/apache/hadoop/pull/5629 
(https://issues.apache.org/jira/browse/YARN-11489), after which the futures are 
held in a `LinkedBlockingQueue`. Elements are taken from the queue using the 
`take` method, which is documented as follows:
   
   > Retrieves and removes the head of this queue, waiting if necessary until 
an element becomes available.
   
   I confirmed in our environments that after patching 3.3.6 with this fix the 
CPU utilization is back to normal. I believe that 
https://issues.apache.org/jira/browse/YARN-11178 can be closed.
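
   A minimal sketch of why a blocking `take` avoids the busy loop, assuming a 
tracker that holds its pending futures in a `LinkedBlockingQueue`. The class and 
method names below are hypothetical and this is not the code from 
https://github.com/apache/hadoop/pull/5629; it only illustrates that the thread 
parks inside `take` while the queue is empty instead of spinning.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a tracker thread; not taken from the Hadoop sources.
final class BlockingTrackerSketch {
  // Pending renewal futures, held in a blocking queue rather than a map.
  private final LinkedBlockingQueue<Future<String>> futures = new LinkedBlockingQueue<>();

  /** Queues the future of a submitted task so the tracker can pick it up. */
  void submit(ExecutorService pool, Callable<String> task) {
    futures.add(pool.submit(task));
  }

  /** Blocks until a future is queued, then waits up to timeoutMs for its result. */
  String trackOne(long timeoutMs) throws Exception {
    // take() parks the calling thread while the queue is empty, so no CPU is burned.
    Future<String> future = futures.take();
    return future.get(timeoutMs, TimeUnit.MILLISECONDS);
  }

  public static void main(String[] args) throws Exception {
    BlockingTrackerSketch tracker = new BlockingTrackerSketch();
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      tracker.submit(pool, () -> "renewed");
      System.out.println(tracker.trackOne(1_000)); // prints "renewed"
    } finally {
      pool.shutdown();
    }
  }
}
```

   With an empty queue, `trackOne` simply does not return until something is 
submitted, which is the behaviour the quoted Javadoc sentence describes.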




> Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-11178
>                 URL: https://issues.apache.org/jira/browse/YARN-11178
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, security
>    Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
>         Environment: Hadoop 3.3.3 with Kerberos, Ranger 2.1.0, Hive 2.3.7 and Spark 3.0.3
>            Reporter: Lennon Chin
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: YARN-11178.CPU idling busy 100% before optimized.png, 
> YARN-11178.CPU normal after optimized.png, YARN-11178.CPU profile for idling 
> busy 100% before optimized.html, YARN-11178.CPU profile for idling busy 100% 
> before optimized.png, YARN-11178.CPU profile for normal after optimized.html, 
> YARN-11178.CPU profile for normal after optimized.png
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
>
> The DelegationTokenRenewerPoolTracker thread wastes CPU by spinning in an 
> empty polling loop when there is no delegation token renewer event task in 
> the futures map:
> {code:java}
> // org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.DelegationTokenRenewerPoolTracker#run
> @Override
> public void run() {
>   // this while true loop is busy when the `futures` is empty
>   while (true) {
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in case of timeout
>         if (future != null && !future.isDone() && !future.isCancelled()) {
>           future.cancel(true);
>           futures.remove(evt);
>           if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
>             renewalTimer.schedule(
>                 getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
>                 tokenRenewerThreadRetryInterval);
>           } else {
>             LOG.info(
>                 "Exhausted max retry attempts {} in token renewer "
>                     + "thread for {}",
>                 tokenRenewerThreadRetryMaxAttempts, evt.getApplicationId());
>           }
>         }
>       } catch (Exception e) {
>         LOG.info("Problem in submitting renew tasks in token renewer "
>             + "thread.", e);
>       }
>     }
>   }
> }{code}
> A better way to avoid the busy loop is to wait for some time when the 
> `futures` map is empty. In addition, when a renewer task is done or cancelled, 
> we should remove its future from the `futures` map to avoid a memory leak:
> {code:java}
> @Override
> public void run() {
>   while (true) {
>     // waiting for some time when futures map is empty
>     if (futures.isEmpty()) {
>       synchronized (this) {
>         try {
>           // waiting for tokenRenewerThreadTimeout milliseconds
>           long waitingTimeMs = Math.min(1, Math.max(500, tokenRenewerThreadTimeout));
>           LOG.info("Delegation token renewer pool is empty, waiting for {} ms.", waitingTimeMs);
>           wait(waitingTimeMs);
>         } catch (InterruptedException e) {
>           LOG.warn("Delegation token renewer pool tracker waiting interrupt occurred.");
>           Thread.currentThread().interrupt();
>         }
>       }
>       if (futures.isEmpty()) {
>         continue;
>       }
>     }
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in c

[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2024-06-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854691#comment-17854691
 ] 

ASF GitHub Bot commented on YARN-11178:
---

pstrzelczak commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-2165145077

   As already mentioned, this issue keeps one vCPU permanently occupied by the 
busy loop. This makes adoption of the 3.3 line difficult in production 
environments. Can it be fixed?





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-06-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732335#comment-17732335
 ] 

ASF GitHub Bot commented on YARN-11178:
---

gp1314 commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1590396294

   Thanks to LennonChin for his contributions and ideas, but why was this pull 
request closed? I don't think the problem has been fixed. Can I continue the work?





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-06-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730391#comment-17730391
 ] 

ASF GitHub Bot commented on YARN-11178:
---

LennonChin closed pull request #4435: YARN-11178. Avoid CPU busy idling and 
resource wasting in DelegationTokenRenewerPoolTracker thread
URL: https://github.com/apache/hadoop/pull/4435





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709969#comment-17709969
 ] 

ASF GitHub Bot commented on YARN-11178:
---

hadoop-yetus commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1501121315

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  35m 39s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 44s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 33s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 57s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   1m 52s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  1s |  |  trunk passed  |
   | +1 :green_heart: |  spotbugs  |   6m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 35s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   8m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 41s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m  7s |  |  the patch passed  |
   | +1 :green_heart: |  xmllint  |   0m  0s |  |  No new issues.  |
   | +1 :green_heart: |  javadoc  |   2m 32s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   6m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 40s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 16s |  |  hadoop-yarn-api in the patch passed.  |
   | +1 :green_heart: |  unit  |   5m 50s |  |  hadoop-yarn-common in the patch passed.  |
   | -1 :x: |  unit  | 100m 27s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/5/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) |  hadoop-yarn-server-resourcemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 278m 47s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4435 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 18558f9c 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 84d4d555f62e29fdc19947869d02ffd33b9ac4c5 |
   | Default Java | Red Hat, Inc.-1.8.0_362-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/5/testReport/ |
   | Max. process+thread count | 943 (vs. ulimit of 5500) |
   | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/5/console |
   | versions | git=2.9.5 maven=3.6.3 spotbugs=4.2.2 xmllint=20901 |
   | Powered by | Apache Ye

[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709951#comment-17709951
 ] 

ASF GitHub Bot commented on YARN-11178:
---

Hexiaoqiao commented on code in PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#discussion_r1161241461


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java:
##
@@ -819,6 +819,15 @@ public static boolean isAclEnabled(Configuration conf) {
   public static final int DEFAULT_RM_DT_RENEWER_THREAD_RETRY_MAX_ATTEMPTS =
   10;
 
+  /**
+   * The setting is used to control delegation token renewer thread perform backoff
+   * waiting when there are no renewer event futures to avoid CPU busy idling.
+   */
+  public static final String RM_DT_RENEWER_THREAD_IDLE_BACKOFF_MS =
+  RM_PREFIX + "delegation-token-renewer.thread-idle-backoff-ms";
+  public static final long DEFAULT_RM_DT_RENEWER_THREAD_IDLE_BACKOFF_MS =
+  TimeUnit.SECONDS.toMillis(3); // 3 Seconds

Review Comment:
   I suggest changing `TimeUnit.SECONDS.toMillis(3)` to 18 directly, to keep 
the same unit as the key/value name says (MS/ms) and avoid ambiguity for end 
users.
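
   For reference, a small sketch of the two spellings being compared. The class 
and constant names are hypothetical; the only fact relied on is that 
`TimeUnit.SECONDS.toMillis(3)` evaluates to 3000, so a plain millisecond literal 
carrying the same value as the default in the patch would be `3000L`.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical constants, only to compare the two ways of writing a 3-second default.
final class BackoffDefaultSketch {
  // Default expressed through a unit conversion, as in the patch under review.
  static final long VIA_TIME_UNIT_MS = TimeUnit.SECONDS.toMillis(3);
  // The same duration written directly in the unit the key name advertises (ms).
  static final long VIA_LITERAL_MS = 3000L;

  public static void main(String[] args) {
    System.out.println(VIA_TIME_UNIT_MS);                   // prints 3000
    System.out.println(VIA_TIME_UNIT_MS == VIA_LITERAL_MS); // prints true
  }
}
```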



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml:
##
@@ -1148,6 +1148,16 @@
     <value>60s</value>
   </property>
 
+  <property>
+    <description>
+      The setting is used to control each RM DelegationTokenRenewer thread perform backoff
+      waiting when there are no renewer event futures to avoid CPU busy idling.
+      the default value is 3 seconds.
+    </description>
+    <name>yarn.resourcemanager.delegation-token-renewer.thread-idle-backoff-ms</name>
+    <value>3s</value>

Review Comment:
   +1 as mentioned above. We should keep the same unit for both the key and 
the value of the configuration.
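
   To illustrate the ambiguity, here is a plain-Java sketch of a parser that 
accepts both a suffixed value such as `3s` and a bare millisecond number. The 
class and method are hypothetical stand-ins and do not use any Hadoop API 
(Hadoop's `Configuration` has its own `getTimeDuration` for suffixed values, as 
far as I know). It shows that a key ending in `-ms` with the value `3s` and the 
same key with the value `3000` can resolve to the same duration, which is what 
makes the mixed form easy to misread.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; a stand-in for suffix-aware duration parsing, not a Hadoop API.
final class DurationSuffixSketch {
  /** Parses "3s", "3000ms" or a bare "3000" (taken as milliseconds) into milliseconds. */
  static long toMillis(String raw) {
    String v = raw.trim().toLowerCase(Locale.ROOT);
    if (v.endsWith("ms")) {
      // Check "ms" before "s", since every "ms" value also ends with "s".
      return Long.parseLong(v.substring(0, v.length() - 2).trim());
    }
    if (v.endsWith("s")) {
      long seconds = Long.parseLong(v.substring(0, v.length() - 1).trim());
      return TimeUnit.SECONDS.toMillis(seconds);
    }
    return Long.parseLong(v);
  }

  public static void main(String[] args) {
    System.out.println(toMillis("3s"));     // prints 3000
    System.out.println(toMillis("3000ms")); // prints 3000
    System.out.println(toMillis("3000"));   // prints 3000
  }
}
```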






[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-04-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709672#comment-17709672
 ] 

ASF GitHub Bot commented on YARN-11178:
---

gp1314 commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1500270398

   TestDelegationTokenRenewer.TestTokenThreadTimeout needs 
RM_DT_RENEWER_THREAD_IDLE_BACKOFF_MS to be set to less than 
RM_DT_RENEWER_THREAD_TIMEOUT.
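
   One way to express that constraint, as a sketch only: cap the idle backoff 
so that it always stays below the renewer thread timeout. The helper below is 
hypothetical and not part of the PR. Its two parameters stand for the values 
read from RM_DT_RENEWER_THREAD_IDLE_BACKOFF_MS and RM_DT_RENEWER_THREAD_TIMEOUT, 
both assumed to be in milliseconds.

```java
// Hypothetical guard; not taken from the PR or from the Hadoop sources.
final class BackoffGuardSketch {
  /** Returns an idle backoff that is strictly smaller than the thread timeout. */
  static long effectiveBackoffMs(long idleBackoffMs, long threadTimeoutMs) {
    if (idleBackoffMs < 1 || threadTimeoutMs < 2) {
      throw new IllegalArgumentException(
          "backoff must be >= 1 ms and timeout must be >= 2 ms");
    }
    // Never sleep as long as the timeout itself, so timeouts are still noticed in time.
    return Math.min(idleBackoffMs, threadTimeoutMs - 1);
  }

  public static void main(String[] args) {
    System.out.println(effectiveBackoffMs(3000, 1000)); // prints 999
    System.out.println(effectiveBackoffMs(500, 1000));  // prints 500
  }
}
```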





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-04-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709574#comment-17709574
 ] 

ASF GitHub Bot commented on YARN-11178:
---

slfan1989 commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1499906514

   > I had the same problem. Hadoop 3.3.5 was released and still has the same problem.
   
   We will continue to follow up on this PR. @LennonChin Can we rebuild this PR?




> Avoid CPU busy idling and resource wasting in 
> DelegationTokenRenewerPoolTracker thread
> --------------------------------------------------------------------------
>
>                 Key: YARN-11178
>                 URL: https://issues.apache.org/jira/browse/YARN-11178
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, security
>    Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
>         Environment: Hadoop 3.3.3 with Kerberos, Ranger 2.1.0, Hive 2.3.7 and 
> Spark 3.0.3
>            Reporter: Lennon Chin
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: YARN-11178.CPU idling busy 100% before optimized.png, 
> YARN-11178.CPU normal after optimized.png, YARN-11178.CPU profile for idling 
> busy 100% before optimized.html, YARN-11178.CPU profile for idling busy 100% 
> before optimized.png, YARN-11178.CPU profile for normal after optimized.html, 
> YARN-11178.CPU profile for normal after optimized.png
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The DelegationTokenRenewerPoolTracker thread wastes CPU by spinning in an 
> empty polling loop whenever there are no delegation token renewer event 
> tasks in the `futures` map:
> {code:java}
> // org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.DelegationTokenRenewerPoolTracker#run
> @Override
> public void run() {
>   // this while true loop is busy when the `futures` is empty
>   while (true) {
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in case of timeout
>         if (future != null && !future.isDone() && !future.isCancelled()) {
>           future.cancel(true);
>           futures.remove(evt);
>           if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
>             renewalTimer.schedule(
>                 getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
>                 tokenRenewerThreadRetryInterval);
>           } else {
>             LOG.info(
>                 "Exhausted max retry attempts {} in token renewer "
>                     + "thread for {}",
>                 tokenRenewerThreadRetryMaxAttempts, evt.getApplicationId());
>           }
>         }
>       } catch (Exception e) {
>         LOG.info("Problem in submitting renew tasks in token renewer "
>             + "thread.", e);
>       }
>     }
>   }
> }{code}
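
To make the cost of the loop quoted above concrete, here is a minimal, self-contained sketch. It is not Hadoop code: the class name `BusySpinSketch`, the method `blockingCalls`, and the `String` key type are hypothetical stand-ins chosen for illustration. It runs the same nested-loop shape for a fixed number of outer passes and counts how often the body reaches `Future.get`. With an empty map that count is zero, so nothing in the loop ever blocks and an unbounded outer `while (true)` would simply spin.

```java
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BusySpinSketch {

    /**
     * Runs `outerPasses` passes of a tracker-style loop over `futures` and
     * returns how many times the loop body reached the (potentially blocking)
     * Future.get call. The String key is a hypothetical stand-in for the
     * event type used in the quoted code.
     */
    static int blockingCalls(Map<String, Future<?>> futures, int outerPasses) {
        int calls = 0;
        for (int pass = 0; pass < outerPasses; pass++) {
            // With an empty map this inner loop body never executes,
            // so the outer loop has no point at which it can block.
            for (Map.Entry<String, Future<?>> entry : futures.entrySet()) {
                calls++;
                try {
                    entry.getValue().get(10, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    // Ignored in this sketch; the quoted code retries or logs here.
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop counting.
                    Thread.currentThread().interrupt();
                    return calls;
                }
            }
        }
        return calls;
    }
}
```

Calling `blockingCalls` with an empty `ConcurrentHashMap` returns 0 regardless of the number of passes, which is the situation the report describes: every pass is pure CPU work with no wait.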
> A better way to avoid the busy loop is to wait for some time when the 
> `futures` map is empty. In addition, once a renewer task is done or 
> cancelled, its future should be removed from the `futures` map to avoid a 
> memory leak:
> {code:java}
> @Override
> public void run() {
>   while (true) {
>     // waiting for some time when futures map is empty
>     if (futures.isEmpty()) {
>       synchronized (this) {
>         try {
>           // waiting for tokenRenewerThreadTimeout milliseconds
>           long waitingTimeMs = Math.min(1, Math.max(500, 
>               tokenRenewerThreadTimeout));
>           LOG.info("Delegation token renewer pool is empty, waiting for {} ms.",
>               waitingTimeMs);
>           wait(waitingTimeMs);
>         } catch (InterruptedException e) {
>           LOG.warn("Delegation token renewer pool tracker waiting interrupt occurred.");
>           Thread.currentThread().interrupt();
>         }
>       }
>       if (futures.isEmpty()) {
>         continue;
>       }
>     }
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in case of timeout
>         if (future != null && !future.isDone() && !future.isCancelled()) {
>           future.cancel(true);
>           futures.remove(evt);
>           if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
>             renewalTimer.schedule(
>                 getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
>                 tokenRenewerThreadRetryInterval);
>           } else {
> LOG.i

[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-04-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17707794#comment-17707794
 ] 

ASF GitHub Bot commented on YARN-11178:
---

gp1314 commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1493809925

   I had the same problem. Hadoop 3.3.5 was released and still has the same problem.





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-02-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694275#comment-17694275
 ] 

ASF GitHub Bot commented on YARN-11178:
---

LennonChin commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1447479319

   > @LennonChin Is this junit test error related to this pr?
   
   Looks like it is, I'll check it out.
   I'm a little busy lately, sorry...





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-02-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691531#comment-17691531
 ] 

ASF GitHub Bot commented on YARN-11178:
---

slfan1989 commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1438320061

   @LennonChin Is this junit test error related to this pr? 





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-02-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17683638#comment-17683638
 ] 

ASF GitHub Bot commented on YARN-11178:
---

slfan1989 commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1414736328

   @LennonChin Thank you for your contribution! Is the unit test error caused 
by this pr?





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-02-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17683636#comment-17683636
 ] 

ASF GitHub Bot commented on YARN-11178:
---

slfan1989 commented on code in PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#discussion_r1095306416


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java:
##
@@ -996,6 +996,22 @@ public void run() {
 @Override
 public void run() {
   while (true) {
+if (futures.isEmpty()) {
+  synchronized (this) {
+try {
+  // waiting for tokenRenewerThreadTimeout milliseconds
+  long waitingTimeMs = Math.min(1, Math.max(500, 
tokenRenewerThreadTimeout));

Review Comment:
   > @slfan1989 what do you suggest increasing them to?
   
   Sorry for the late reply. I would prefer these values to be configurable 
rather than hardcoded.
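
One possible shape for a configurable bound is sketched below, under stated assumptions. The class `IdleWaitSketch`, the method `idleWaitMs`, and the parameters `minWaitMs` and `maxWaitMs` are hypothetical and do not come from the PR; in Hadoop the two bounds would presumably be read from configuration, but no key names are assumed here. Note also that the expression as it appears in the diff above, `Math.min(1, Math.max(500, tokenRenewerThreadTimeout))`, evaluates to 1 for every timeout, because the inner `Math.max` is always at least 500.

```java
public class IdleWaitSketch {

    /**
     * Clamps the idle wait to the range [minWaitMs, maxWaitMs].
     * Both bounds are hypothetical parameters for this sketch; they stand in
     * for values that would be supplied by configuration.
     */
    static long idleWaitMs(long timeoutMs, long minWaitMs, long maxWaitMs) {
        if (minWaitMs <= 0 || minWaitMs > maxWaitMs) {
            throw new IllegalArgumentException(
                "expected 0 < minWaitMs <= maxWaitMs, got "
                    + minWaitMs + " and " + maxWaitMs);
        }
        // Lower bound first, then upper bound: the result never leaves the range.
        return Math.min(maxWaitMs, Math.max(minWaitMs, timeoutMs));
    }
}
```

With bounds of 500 and 10000 (illustrative values only), a timeout of 60000 is clamped to 10000, a timeout of 100 is raised to 500, and a timeout of 3000 is returned unchanged.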






[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-01-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17655368#comment-17655368
 ] 

ASF GitHub Bot commented on YARN-11178:
---

hadoop-yetus commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1373425771

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: | reexec | 0m 39s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
   | +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: | mvndep | 15m 8s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 26m 8s | | trunk passed |
   | +1 :green_heart: | compile | 10m 5s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | compile | 9m 16s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | checkstyle | 1m 45s | | trunk passed |
   | +1 :green_heart: | mvnsite | 3m 13s | | trunk passed |
   | -1 :x: | javadoc | 1m 1s | [/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/4/artifact/out/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | hadoop-yarn-server-resourcemanager in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. |
   | +1 :green_heart: | javadoc | 2m 35s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | spotbugs | 6m 12s | | trunk passed |
   | +1 :green_heart: | shadedclient | 21m 53s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 2m 19s | | the patch passed |
   | +1 :green_heart: | compile | 9m 39s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | javac | 9m 39s | | the patch passed |
   | +1 :green_heart: | compile | 8m 35s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | javac | 8m 35s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 1m 35s | | the patch passed |
   | +1 :green_heart: | mvnsite | 2m 56s | | the patch passed |
   | -1 :x: | javadoc | 0m 56s | [/patch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/4/artifact/out/patch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | hadoop-yarn-server-resourcemanager in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. |
   | +1 :green_heart: | javadoc | 2m 32s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | spotbugs | 6m 18s | | the patch passed |
   | +1 :green_heart: | shadedclient | 22m 15s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 1m 12s | | hadoop-yarn-api in the patch passed. |
   | +1 :green_heart: | unit | 5m 43s | | hadoop-yarn-common in the patch passed. |
   | -1 :x: | unit | 102m 26s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/4/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-server-resourcemanager in the patch passed. |
   | +1 :green_heart: |  asflicens

[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-01-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654922#comment-17654922
 ] 

ASF GitHub Bot commented on YARN-11178:
---

hadoop-yetus commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1372103081

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 36s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 58s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m 47s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   8m 29s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m 48s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 20s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m  2s | [/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/3/artifact/out/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) |  hadoop-yarn-server-resourcemanager in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.  |
   | +1 :green_heart: |  javadoc  |   2m 47s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 12s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 27s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m 43s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   9m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 34s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   8m 34s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 39s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m  1s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 56s | [/patch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/3/artifact/out/patch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) |  hadoop-yarn-server-resourcemanager in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.  |
   | +1 :green_heart: |  javadoc  |   2m 34s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 35s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 11s |  |  hadoop-yarn-api in the patch passed.  |
   | +1 :green_heart: |  unit  |   5m 44s |  |  hadoop-yarn-common in the patch passed.  |
   | -1 :x: |  unit  |  99m 49s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4435/3/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) |  hadoop-yarn-server-resourcemanager in the patch passed.  |
   | +1 :green_heart: |  asflicens

[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-01-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654817#comment-17654817
 ] 

ASF GitHub Bot commented on YARN-11178:
---

LennonChin commented on code in PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#discussion_r1062171496


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java:
##
@@ -996,6 +996,22 @@ public void run() {
 @Override
 public void run() {
   while (true) {
+if (futures.isEmpty()) {
+  synchronized (this) {
+try {
+  // waiting for tokenRenewerThreadTimeout milliseconds
+  long waitingTimeMs = Math.min(1, Math.max(500, tokenRenewerThreadTimeout));

Review Comment:
   > @slfan1989 what do you suggest increasing them to?
   
   I think he means to add a configuration to control the waiting time.
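
One way to read that suggestion is to take the lower and upper bounds of the idle wait from configuration instead of hard-coding them. The following is only a minimal sketch under that assumption: the class `IdleWaitBounds` and the parameters `minWaitMs` and `maxWaitMs` are hypothetical names that do not appear in the PR. For comparison, `asPrinted` evaluates the expression exactly as it is printed in the hunk above, which yields 1 for any timeout because the outer `Math.min` is capped at 1.

```java
final class IdleWaitBounds {
  private IdleWaitBounds() {
  }

  /**
   * Clamps timeoutMs into [minWaitMs, maxWaitMs].
   * Both bounds are hypothetical settings, not names from the PR.
   */
  static long clamp(long timeoutMs, long minWaitMs, long maxWaitMs) {
    if (minWaitMs > maxWaitMs) {
      throw new IllegalArgumentException("minWaitMs must not exceed maxWaitMs");
    }
    return Math.min(maxWaitMs, Math.max(minWaitMs, timeoutMs));
  }

  /** The expression exactly as printed in the quoted hunk. */
  static long asPrinted(long timeoutMs) {
    return Math.min(1, Math.max(500, timeoutMs));
  }

  public static void main(String[] args) {
    // The bounds 500 and 10_000 are illustrative values only.
    System.out.println(clamp(60_000L, 500L, 10_000L));
    System.out.println(clamp(100L, 500L, 10_000L));
    System.out.println(asPrinted(60_000L));
  }
}
```

With bounds like these, a very large timeout is capped at the upper bound and a very small one is raised to the lower bound, so the tracker thread neither spins nor sleeps for an unbounded time.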





> Avoid CPU busy idling and resource wasting in 
> DelegationTokenRenewerPoolTracker thread
> --
>
> Key: YARN-11178
> URL: https://issues.apache.org/jira/browse/YARN-11178
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, security
>Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
> Environment: Hadoop 3.3.3 with Kerberos, Ranger 2.1.0, Hive 2.3.7 and 
> Spark 3.0.3
>Reporter: Lennon Chin
>Priority: Minor
>  Labels: pull-request-available
> Attachments: YARN-11178.CPU idling busy 100% before optimized.png, 
> YARN-11178.CPU normal after optimized.png, YARN-11178.CPU profile for idling 
> busy 100% before optimized.html, YARN-11178.CPU profile for idling busy 100% 
> before optimized.png, YARN-11178.CPU profile for normal after optimized.html, 
> YARN-11178.CPU profile for normal after optimized.png
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The DelegationTokenRenewerPoolTracker thread wastes CPU by spinning in an 
> empty polling loop whenever the `futures` map contains no delegation token 
> renewer event tasks:
> {code:java}
> // org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.DelegationTokenRenewerPoolTracker#run
> @Override
> public void run() {
>   // this while true loop is busy when the `futures` is empty
>   while (true) {
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in case of timeout
>         if (future != null && !future.isDone() && !future.isCancelled()) {
>           future.cancel(true);
>           futures.remove(evt);
>           if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
>             renewalTimer.schedule(
>                 getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
>                 tokenRenewerThreadRetryInterval);
>           } else {
>             LOG.info(
>                 "Exhausted max retry attempts {} in token renewer "
>                     + "thread for {}",
>                 tokenRenewerThreadRetryMaxAttempts, evt.getApplicationId());
>           }
>         }
>       } catch (Exception e) {
>         LOG.info("Problem in submitting renew tasks in token renewer "
>             + "thread.", e);
>       }
>     }
>   }
> }{code}
> A better approach is to wait for a while when the `futures` map is empty, 
> which avoids the busy loop. In addition, once a renewer task is done or 
> cancelled, its future should be removed from the `futures` map to avoid a 
> memory leak:
> {code:java}
> @Override
> public void run() {
>   while (true) {
>     // waiting for some time when futures map is empty
>     if (futures.isEmpty()) {
>       synchronized (this) {
>         try {
>           // waiting for tokenRenewerThreadTimeout milliseconds
>           long waitingTimeMs = Math.min(1, Math.max(500, tokenRenewerThreadTimeout));
>           LOG.info("Delegation token renewer pool is empty, waiting for {} ms.", waitingTimeMs);
>           wait(waitingTimeMs);
>         } catch (InterruptedException e) {
>           LOG.warn("Delegation token renewer pool tracker waiting interrupt occurred.");
>           Thread.currentThread().interrupt();
>         }
>       }
>       if (futures.isEmpty()) {
>         continue;
>       }
>     }
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, 

[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-01-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654812#comment-17654812
 ] 

ASF GitHub Bot commented on YARN-11178:
---

LennonChin commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1371858309

   I have pushed two commits that address your suggestions. @slfan1989, please take a look.





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2023-01-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17653797#comment-17653797
 ] 

ASF GitHub Bot commented on YARN-11178:
---

dineshchitlangia commented on code in PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#discussion_r1060289109


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java:
##
@@ -996,6 +996,22 @@ public void run() {
 @Override
 public void run() {
   while (true) {
+if (futures.isEmpty()) {
+  synchronized (this) {
+try {
+  // waiting for tokenRenewerThreadTimeout milliseconds
+  long waitingTimeMs = Math.min(1, Math.max(500, tokenRenewerThreadTimeout));

Review Comment:
   @slfan1989 what do you suggest increasing them to?






[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2022-12-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651263#comment-17651263
 ] 

ASF GitHub Bot commented on YARN-11178:
---

slfan1989 commented on code in PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#discussion_r1055408239


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java:
##
@@ -1023,6 +1039,16 @@ public void run() {
 LOG.info("Problem in submitting renew tasks in token renewer "
 + "thread.", e);
   }
+  if (future.isDone() || future.isCancelled()) {
+try {
+  futures.remove(evt);
+  LOG.info("Removed done or cancelled renewer tasks of {}" +
+  " in token renewer thread.", evt.getApplicationId());

Review Comment:
   Indentation: usually 5 characters.
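
A side note on the cleanup in the hunk above: `Future.isDone()` already returns true for a task that was cancelled, so a single check covers both cases. Below is a minimal sketch of that cleanup, assuming the futures live in a `ConcurrentHashMap`; the class `CompletedFuturePruner` and the string keys are stand-ins for illustration and are not the PR's code.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;

final class CompletedFuturePruner {
  private CompletedFuturePruner() {
  }

  /**
   * Removes every entry whose future has completed normally, failed, or been
   * cancelled. The returned count is exact only when no other thread modifies
   * the map during the call.
   */
  static <K> int prune(Map<K, Future<?>> futures) {
    int before = futures.size();
    // isDone() is also true for cancelled futures, so one predicate suffices.
    futures.values().removeIf(Future::isDone);
    return before - futures.size();
  }

  public static void main(String[] args) {
    Map<String, Future<?>> futures = new ConcurrentHashMap<>();
    futures.put("done", CompletableFuture.completedFuture("ok"));
    CompletableFuture<String> cancelled = new CompletableFuture<>();
    cancelled.cancel(true);
    futures.put("cancelled", cancelled);
    futures.put("pending", new CompletableFuture<String>());
    System.out.println(prune(futures) + " removed, remaining " + futures.keySet());
  }
}
```

Removing through the `values()` view keeps the iteration and the removal in one pass, which avoids a second lookup by key for each finished task.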






[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2022-12-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651261#comment-17651261
 ] 

ASF GitHub Bot commented on YARN-11178:
---

slfan1989 commented on code in PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#discussion_r1055407270


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java:
##
@@ -996,6 +996,22 @@ public void run() {
 @Override
 public void run() {
   while (true) {
+if (futures.isEmpty()) {
+  synchronized (this) {
+try {
+  // waiting for tokenRenewerThreadTimeout milliseconds
+  long waitingTimeMs = Math.min(1, Math.max(500, tokenRenewerThreadTimeout));

Review Comment:
   500, 1 Should we increase the configuration?






[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2022-12-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651248#comment-17651248
 ] 

ASF GitHub Bot commented on YARN-11178:
---

slfan1989 commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1362744741

   @LennonChin can we rebase again?





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2022-11-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640327#comment-17640327
 ] 

ASF GitHub Bot commented on YARN-11178:
---

cainbit commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1329946107

   > @slfan1989 @dineshchitlangia @brumi1024 @9uapaw @ashutoshcipher Could you 
please help review the code?
   
   This issue means version 3.3.4 cannot run in a production environment 
without manual patching.
   Could you share the current review status of this PR, and whether it can be 
incorporated into version 3.3.5?




> Avoid CPU busy idling and resource wasting in 
> DelegationTokenRenewerPoolTracker thread
> --
>
> Key: YARN-11178
> URL: https://issues.apache.org/jira/browse/YARN-11178
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager, security
> Affects Versions: 3.3.1, 3.3.2, 3.3.3, 3.3.4
> Environment: Hadoop 3.3.3 with Kerberos, Ranger 2.1.0, Hive 2.3.7 and 
> Spark 3.0.3
> Reporter: Lennon Chin
> Priority: Minor
> Labels: pull-request-available
> Attachments: YARN-11178.CPU idling busy 100% before optimized.png, 
> YARN-11178.CPU normal after optimized.png, YARN-11178.CPU profile for idling 
> busy 100% before optimized.html, YARN-11178.CPU profile for idling busy 100% 
> before optimized.png, YARN-11178.CPU profile for normal after optimized.html, 
> YARN-11178.CPU profile for normal after optimized.png
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> The DelegationTokenRenewerPoolTracker thread wastes CPU by spinning in an 
> empty polling loop whenever the futures map contains no delegation token 
> renewer event tasks:
> {code:java}
> // org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.DelegationTokenRenewerPoolTracker#run
> @Override
> public void run() {
>   // this while true loop is busy when the `futures` is empty
>   while (true) {
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in case of timeout
>         if (future != null && !future.isDone() && !future.isCancelled()) {
>           future.cancel(true);
>           futures.remove(evt);
>           if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
>             renewalTimer.schedule(
>                 getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
>                 tokenRenewerThreadRetryInterval);
>           } else {
>             LOG.info(
>                 "Exhausted max retry attempts {} in token renewer "
>                     + "thread for {}",
>                 tokenRenewerThreadRetryMaxAttempts, evt.getApplicationId());
>           }
>         }
>       } catch (Exception e) {
>         LOG.info("Problem in submitting renew tasks in token renewer "
>             + "thread.", e);
>       }
>     }
>   }
> }{code}
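
A minimal, self-contained sketch of why the loop above spins; it is not the Hadoop code. The class `BusyLoopDemo`, the method `countPasses`, and the `String` key type are assumptions made for illustration. When the map is empty, the body of the inner `for` never runs, so nothing blocks and the outer loop repeats as fast as the CPU allows.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;

// Hypothetical demo class; names are illustrative, not taken from Hadoop.
class BusyLoopDemo {

  /** Counts how many passes the outer loop makes over an empty map in durationMs. */
  static long countPasses(long durationMs) {
    Map<String, Future<?>> futures = new ConcurrentHashMap<>();
    long deadline = System.nanoTime() + durationMs * 1_000_000L;
    long passes = 0;
    // Stands in for `while (true)`; bounded by a deadline so the demo terminates.
    while (System.nanoTime() < deadline) {
      for (Map.Entry<String, Future<?>> entry : futures.entrySet()) {
        // Never reached while the map is empty, so no blocking call is made.
      }
      passes++;
    }
    return passes;
  }

  public static void main(String[] args) {
    System.out.println("passes in 100 ms: " + countPasses(100));
  }
}
```

The exact count depends on the machine, but each pass does no useful work, which matches the 100% CPU symptom reported in this issue.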
> A better approach is to wait for some time when the `futures` map is empty. 
> In addition, when a renewer task is done or cancelled, its future should be 
> removed from the `futures` map to avoid a memory leak:
> {code:java}
> @Override
> public void run() {
>   while (true) {
>     // waiting for some time when futures map is empty
>     if (futures.isEmpty()) {
>       synchronized (this) {
>         try {
>           // waiting for tokenRenewerThreadTimeout milliseconds
>           long waitingTimeMs = Math.min(1,
>               Math.max(500, tokenRenewerThreadTimeout));
>           LOG.info("Delegation token renewer pool is empty, waiting for {} ms.",
>               waitingTimeMs);
>           wait(waitingTimeMs);
>         } catch (InterruptedException e) {
>           LOG.warn("Delegation token renewer pool tracker waiting interrupt occurred.");
>           Thread.currentThread().interrupt();
>         }
>       }
>       if (futures.isEmpty()) {
>         continue;
>       }
>     }
>     for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
>         .entrySet()) {
>       DelegationTokenRenewerEvent evt = entry.getKey();
>       Future<?> future = entry.getValue();
>       try {
>         future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
>       } catch (TimeoutException e) {
>         // Cancel thread and retry the same event in case of timeout
>         if (future != null && !future.isDone() && !future.isCancelled()) {
>           future.cancel(true);
>           futures.remove(evt);
>           if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
>             renewalTimer.sche

[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2022-11-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640322#comment-17640322
 ] 

ASF GitHub Bot commented on YARN-11178:
---

cainbit commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1329936324

   > > @slfan1989
   > > I'm using Hadoop 3.3.4 with Kerberos, and the issue occurs: the 
DelegationTokenRenewer uses 100% CPU.
   > > I used jstack to check the threads and found that the 
DelegationTokenRenewer thread causes this issue, as @LennonChin said.
   > 
   > The issue still exists, but the PR has not been merged, and I am not sure 
why. You can apply this patch manually to solve the problem.
   > 
   > After applying the patch, we have been running it in our production 
environment for half a year, and no problems have been found so far.
   
   Thank you for your work on this issue. I added the code that checks for an 
empty "futures" map, and it works well.
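
   The empty-"futures" check mentioned above could look roughly like the following standalone sketch. It is not the patch from PR #4435: the class `IdleAwareTracker`, its methods, the `String` keys, and the timing values are all assumptions chosen for illustration. The tracker waits on its own monitor while the map is empty, and `track` wakes it as soon as work arrives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names and structure are illustrative, not Hadoop's.
class IdleAwareTracker implements Runnable {
  private final Map<String, Future<?>> futures = new ConcurrentHashMap<>();
  private final AtomicLong passes = new AtomicLong();
  private final long idleWaitMs;    // must be > 0; wait(0) would block indefinitely
  private final long taskTimeoutMs;

  IdleAwareTracker(long idleWaitMs, long taskTimeoutMs) {
    this.idleWaitMs = idleWaitMs;
    this.taskTimeoutMs = taskTimeoutMs;
  }

  /** Registers a future and wakes the tracker if it is idle. */
  void track(String id, Future<?> future) {
    futures.put(id, future);
    synchronized (this) {
      notifyAll();
    }
  }

  long passes() { return passes.get(); }

  int pending() { return futures.size(); }

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      passes.incrementAndGet();
      if (futures.isEmpty()) {
        synchronized (this) {
          try {
            // Re-check under the lock so a concurrent track() is not missed.
            if (futures.isEmpty()) {
              wait(idleWaitMs);
            }
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        }
        continue;
      }
      for (Map.Entry<String, Future<?>> entry : futures.entrySet()) {
        Future<?> future = entry.getValue();
        try {
          future.get(taskTimeoutMs, TimeUnit.MILLISECONDS);
          futures.remove(entry.getKey());   // finished: drop it to avoid a leak
        } catch (TimeoutException e) {
          future.cancel(true);              // timed out: cancel and drop
          futures.remove(entry.getKey());
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        } catch (Exception e) {
          futures.remove(entry.getKey());   // failed or cancelled: drop
        }
      }
    }
  }

  /** Runs an idle tracker for observeMs and returns the number of loop passes. */
  static long idlePasses(long idleWaitMs, long observeMs) {
    IdleAwareTracker tracker = new IdleAwareTracker(idleWaitMs, 1000);
    Thread thread = new Thread(tracker, "tracker-demo");
    thread.setDaemon(true);
    thread.start();
    try {
      Thread.sleep(observeMs);
      thread.interrupt();
      thread.join(1000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return tracker.passes();
  }
}
```

   With an idle wait of 100 ms, `IdleAwareTracker.idlePasses(100, 350)` should report only a handful of passes rather than the very large number an unguarded loop would make. Retry scheduling after a timeout is left out to keep the sketch short.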





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2022-11-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639793#comment-17639793
 ] 

ASF GitHub Bot commented on YARN-11178:
---

LennonChin commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1328565915

   > @slfan1989 
   > 
   > I'm using Hadoop 3.3.4 with Kerberos, and the issue occurs: the 
DelegationTokenRenewer uses 100% CPU.
   > 
   > I used jstack to check the threads and found that the 
DelegationTokenRenewer thread causes this issue, as @LennonChin said.
   
   The issue still exists, but the PR has not been merged, and I am not sure 
why. You can apply this patch manually to solve the problem.
   





[jira] [Commented] (YARN-11178) Avoid CPU busy idling and resource wasting in DelegationTokenRenewerPoolTracker thread

2022-11-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639572#comment-17639572
 ] 

ASF GitHub Bot commented on YARN-11178:
---

cainbit commented on PR #4435:
URL: https://github.com/apache/hadoop/pull/4435#issuecomment-1328167253

   I can reproduce the issue and I would like to know the current status of 
this PR. Thanks.



