wchevreuil commented on a change in pull request #2255:
URL: https://github.com/apache/hbase/pull/2255#discussion_r475670615



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
##########
@@ -587,12 +617,25 @@ private void initialize() {
 
   @Override
   public void startup() {
-    // mark we are running now
-    this.sourceRunning = true;
-    initThread = new Thread(this::initialize);
-    Threads.setDaemonThreadRunning(initThread,
-      Thread.currentThread().getName() + ".replicationSource," + this.queueId,
-      this::uncaughtException);
+    //Flag that signalizes uncaught error happening while starting up the source
+    // and a retry should be attempted
+    AtomicBoolean retryStartup = new AtomicBoolean(false);
+    retryStartup.set(true);
+    do {
+      if(retryStartup.get()) {
+        retryStartup.set(false);
+        // mark we are running now
+        this.sourceRunning = true;
+        initThread = new Thread(this::initialize);
+        Threads.setDaemonThreadRunning(initThread,
+          Thread.currentThread().getName() + ".replicationSource," + this.queueId,
+          (t,e) -> {
+          sourceRunning = false;
+          uncaughtException(t, e, null, null);
+          retryStartup.set(true);
+        });
+      }
+    } while (!this.sourceRunning);

Review comment:
       On second thought, we can't simply reuse this boolean (`sourceRunning`) 
as the loop condition. If startup fails, we risk reaching this check before 
the exception handler has set it to false, so the loop would exit without 
retrying. I'm bringing back the original _startupOngoing_ in the next commit.
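
A minimal sketch of the idea, assuming a dedicated flag that only a successful `initialize()` clears. This is not the code from the follow-up commit: the class `StartupRetrySketch`, the `failuresBeforeSuccess` parameter and the sleep interval are hypothetical, and plain `Thread` calls stand in for `Threads.setDaemonThreadRunning`. The loop waits on `startupOngoing` rather than `sourceRunning`, so it cannot exit in the window between starting the init thread and the handler running.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for ReplicationSource; illustration only, not HBase code.
class StartupRetrySketch {
  private volatile boolean sourceRunning = false;
  // Stays true from the launch of an attempt until initialize() succeeds.
  private final AtomicBoolean startupOngoing = new AtomicBoolean(false);
  // Set by the exception handler to request another attempt.
  private final AtomicBoolean retryStartup = new AtomicBoolean(false);
  private final AtomicInteger attempts = new AtomicInteger();
  // Number of simulated failures before initialize() succeeds (assumption).
  private final int failuresBeforeSuccess;

  StartupRetrySketch(int failuresBeforeSuccess) {
    this.failuresBeforeSuccess = failuresBeforeSuccess;
  }

  private void initialize() {
    if (attempts.incrementAndGet() <= failuresBeforeSuccess) {
      throw new IllegalStateException("simulated initialization failure");
    }
    // Only a successful initialization ends the startup phase.
    startupOngoing.set(false);
  }

  void startup() throws InterruptedException {
    retryStartup.set(true);
    do {
      if (retryStartup.compareAndSet(true, false)) {
        startupOngoing.set(true);
        sourceRunning = true;
        Thread initThread = new Thread(this::initialize);
        initThread.setDaemon(true);
        initThread.setUncaughtExceptionHandler((t, e) -> {
          sourceRunning = false;
          retryStartup.set(true);
        });
        initThread.start();
      }
      // Avoid a hot spin while the init thread is working.
      Thread.sleep(5);
      // sourceRunning is already true here even if the attempt is about to
      // fail, so it cannot serve as the loop condition; startupOngoing can.
    } while (startupOngoing.get());
  }

  boolean isSourceRunning() {
    return sourceRunning;
  }

  int getAttempts() {
    return attempts.get();
  }

  public static void main(String[] args) throws InterruptedException {
    StartupRetrySketch source = new StartupRetrySketch(2);
    source.startup();
    System.out.println("running=" + source.isSourceRunning()
        + ", attempts=" + source.getAttempts());
  }
}
```

With two simulated failures, `startup()` returns only after the third attempt succeeds, and `sourceRunning` is true at that point.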





