[ 
https://issues.apache.org/jira/browse/GEODE-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17237561#comment-17237561
 ] 

ASF GitHub Bot commented on GEODE-8623:
---------------------------------------

Bill commented on a change in pull request #5743:
URL: https://github.com/apache/geode/pull/5743#discussion_r528897228



##########
File path: geode-common/src/main/java/org/apache/geode/internal/Retry.java
##########
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.internal;
+
+import static java.util.concurrent.TimeUnit.NANOSECONDS;
+
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.function.Predicate;
+import java.util.function.Supplier;
+
+import org.apache.geode.annotations.VisibleForTesting;
+
+/**
+ * Utility class for retrying operations.
+ */
+public class Retry {
+
+  interface Timer {
+    long nanoTime();
+
+    void sleep(long sleepTimeInNano) throws InterruptedException;
+  }
+
+  static class SteadyTimer implements Timer {
+    @Override
+    public long nanoTime() {
+      return System.nanoTime();
+    }
+
+    @Override
+    public void sleep(long sleepTimeInNano) throws InterruptedException {
+      long millis = NANOSECONDS.toMillis(sleepTimeInNano);
+      // avoid throwing IllegalArgumentException
+      if (millis > 0) {
+        Thread.sleep(millis);
+      }
+    }
+  }
+
+  private static final SteadyTimer steadyClock = new SteadyTimer();
+
+  /**
+   * Try the supplier function until the predicate is true or timeout occurs.
+   *
+   * @param timeout to retry for
+   * @param timeoutUnit the unit for timeout
+   * @param interval time between each try
+   * @param intervalUnit the unit for interval
+   * @param supplier to execute until predicate is true or times out
+   * @param predicate to test for retry
+   * @param <T> type of return value
+   * @return value from supplier after it passes predicate or times out.
+   */
+  public static <T> T tryFor(long timeout, TimeUnit timeoutUnit,
+      long interval, TimeUnit intervalUnit,
+      Supplier<T> supplier,
+      Predicate<T> predicate) throws TimeoutException, InterruptedException {
+    return tryFor(timeout, timeoutUnit, interval, intervalUnit, supplier, predicate, steadyClock);
+  }
+
+  @VisibleForTesting
+  static <T> T tryFor(long timeout, TimeUnit timeoutUnit,
+      long interval, TimeUnit intervalUnit,
+      Supplier<T> supplier,
+      Predicate<T> predicate,
+      Timer timer) throws TimeoutException, InterruptedException {
+    long until = timer.nanoTime() + NANOSECONDS.convert(timeout, timeoutUnit);
+    long intervalNano = NANOSECONDS.convert(interval, intervalUnit);
+
+    T value;
+    for (;;) {
+      value = supplier.get();
+      if (predicate.test(value)) {
+        return value;
+      } else {
+        // if there is still more time left after we sleep for interval period, then sleep and retry
+        // otherwise break out and throw TimeoutException
+        if ((timer.nanoTime() + intervalNano) < until) {

Review comment:
   I think a user of the `Retry` class would (rightly) expect `tryFor()` to
   keep trying as long as `timeout` has not yet been reached.

   Computing `sleepNanos` gives good results when `interval` is greater than
   (`timeout` - supplier time - predicate time). Without `sleepNanos`, that
   scenario results in _no retries at all_, which I think most users of the
   `Retry` class would find counterintuitive.

   Failing to compute `sleepNanos` introduces an error: `tryFor()` can give up
   as much as one `interval` before `timeout` has elapsed. That error scales
   with `interval`, so the relative error (error / `timeout`) grows as
   `interval` grows relative to `timeout`.

   Computing `sleepNanos` is simple and cheap. It gives better accuracy and
   meets user expectations in all cases. Unless you foresee some problem
   computing `sleepNanos`, I still recommend doing it.
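
A minimal sketch of what computing `sleepNanos` could look like, assuming it means the smaller of the interval and the time remaining before the deadline. The class name `RetrySketch` is hypothetical and this is not the code in the PR; the `Timer` interface is copied from the diff above so that the loop can be driven by a fake clock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch, not the PR's code: sleep for min(interval, time remaining).
class RetrySketch {

  // Same shape as the Timer interface in the diff under review.
  interface Timer {
    long nanoTime();

    void sleep(long sleepTimeInNano) throws InterruptedException;
  }

  static <T> T tryFor(long timeout, TimeUnit timeoutUnit,
      long interval, TimeUnit intervalUnit,
      Supplier<T> supplier,
      Predicate<T> predicate,
      Timer timer) throws TimeoutException, InterruptedException {
    long until = timer.nanoTime() + TimeUnit.NANOSECONDS.convert(timeout, timeoutUnit);
    long intervalNano = TimeUnit.NANOSECONDS.convert(interval, intervalUnit);

    for (;;) {
      T value = supplier.get();
      if (predicate.test(value)) {
        return value;
      }
      // Subtracting two nanoTime readings is the overflow-safe way to compare them.
      long remaining = until - timer.nanoTime();
      if (remaining <= 0) {
        throw new TimeoutException();
      }
      // Assumed meaning of sleepNanos: never sleep past the deadline.
      long sleepNanos = Math.min(intervalNano, remaining);
      timer.sleep(sleepNanos);
    }
  }
}
```

With a 1 second timeout and a 10 second interval, this loop sleeps for the remaining second and tries the supplier once more at the deadline, instead of timing out after a single attempt. One caveat under the same assumptions: a timer that truncates to milliseconds, like `SteadyTimer` in the diff, would not sleep at all during the final sub-millisecond stretch, so the loop would spin briefly until the deadline passes.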




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Timing between DNS and Geode startup can result in permanent unknown host exceptions.
> -------------------------------------------------------------------------------------
>
>                 Key: GEODE-8623
>                 URL: https://issues.apache.org/jira/browse/GEODE-8623
>             Project: Geode
>          Issue Type: Bug
>    Affects Versions: 1.9.0, 1.9.1, 1.10.0, 1.9.2, 1.11.0, 1.12.0, 1.13.0, 1.14.0, 1.13.1
>            Reporter: Jacob Barrett
>            Priority: Minor
>              Labels: pull-request-available
>
> In a managed environment where local hostname DNS entries and the startup of
> Geode happen concurrently, it is possible for Geode to fail name resolution in
> the local hostname caching. If it fails to resolve the local hostname when
> loading the caching utility class, then any service dependent on this name
> will fail without a chance for recovery.
> {code}
> [error 2020/09/30 19:50:21.644 UTC <main> tid=0x1] Jmx manager could not be started because java.net.UnknownHostException
> org.apache.geode.management.ManagementException: java.net.UnknownHostException
>       at org.apache.geode.management.internal.ManagementAgent.startAgent(ManagementAgent.java:133)
>       at org.apache.geode.management.internal.SystemManagementService.startManager(SystemManagementService.java:432)
>       at org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheCreation(ManagementAdapter.java:181)
>       at org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:127)
>       at org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2063)
>       at org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
>       at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1239)
>       at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:219)
>       at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:171)
>       at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
>       at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
>       at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:887)
>       at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:803)
>       at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:732)
>       at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:251)
> Caused by: java.net.UnknownHostException
>       at org.apache.geode.internal.net.SocketCreator.getLocalHost(SocketCreator.java:285)
>       at org.apache.geode.management.internal.ManagementAgent.configureAndStart(ManagementAgent.java:310)
>       at org.apache.geode.management.internal.ManagementAgent.startAgent(ManagementAgent.java:131)
>       ... 14 more
> [error 2020/09/30 19:50:21.724 UTC <main> tid=0x1] org.apache.geode.management.ManagementException: java.net.UnknownHostException
> Exception in thread "main" org.apache.geode.management.ManagementException: java.net.UnknownHostException
>       at org.apache.geode.management.internal.ManagementAgent.startAgent(ManagementAgent.java:133)
>       at org.apache.geode.management.internal.SystemManagementService.startManager(SystemManagementService.java:432)
>       at org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheCreation(ManagementAdapter.java:181)
>       at org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:127)
>       at org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2063)
>       at org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
>       at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1239)
>       at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:219)
>       at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:171)
>       at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
>       at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
>       at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:887)
>       at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:803)
>       at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:732)
>       at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:251)
> Caused by: java.net.UnknownHostException
>       at org.apache.geode.internal.net.SocketCreator.getLocalHost(SocketCreator.java:285)
>       at org.apache.geode.management.internal.ManagementAgent.configureAndStart(ManagementAgent.java:310)
>       at org.apache.geode.management.internal.ManagementAgent.startAgent(ManagementAgent.java:131)
>       ... 14 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
