jadami10 commented on code in PR #11347:
URL: https://github.com/apache/pinot/pull/11347#discussion_r1294538817


##########
pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java:
##########
@@ -555,6 +555,15 @@ public static class Server {
     // Use 10 seconds by default so high volume stream are able to catch up.
     // This is also the default in the case a user misconfigures this by 
setting to <= 0.
     public static final int DEFAULT_STARTUP_REALTIME_MIN_FRESHNESS_MS = 10000;
+    // The timeouts above determine how long servers will poll their status 
before giving up.
+    // This configuration determines what we do when we give up. By default, 
we will mark the
+    // server as healthy and start the query server. If this is set to true, 
we instead throw
+    // an exception and exit the server. This is useful if you want to ensure 
that the server
+    // is always fully ready before accepting queries. But note that this can 
cause the server
+    // to never be healthy if there is some reason that it can never reach a 
GOOD status.
+    public static final String CONFIG_OF_EXIT_SERVER_ON_INCOMPLETE_STARTUP =
+        "pinot.server.starter.exitServerOnStartupStatusFailure";

Review Comment:
   done



##########
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/BaseServerStarter.java:
##########
@@ -478,8 +478,9 @@ private void startupServiceStatusCheck(long endTimeMs) {
     long checkIntervalMs = 
_serverConf.getProperty(Server.CONFIG_OF_STARTUP_SERVICE_STATUS_CHECK_INTERVAL_MS,
         Server.DEFAULT_STARTUP_SERVICE_STATUS_CHECK_INTERVAL_MS);
 
+    Status serviceStatus = ServiceStatus.getServiceStatus(_instanceId);

Review Comment:
   I did it this way so we'd have the initial status in log message below if 
for some reason endTimeMs was immediate. but i guess that seems highly unlikely 
since it's added to current time, and null is valid in the log



##########
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/BaseServerStarter.java:
##########
@@ -501,6 +502,16 @@ private void startupServiceStatusCheck(long endTimeMs) {
       }
     }
 
+    boolean exitServerOnIncompleteStartup = 
_serverConf.getProperty(Server.CONFIG_OF_EXIT_SERVER_ON_INCOMPLETE_STARTUP,
+        Server.DEFAULT_EXIT_SERVER_ON_INCOMPLETE_STARTUP);
+    if (exitServerOnIncompleteStartup) {
+      String errorMessage = String.format("Service status %s has not turned 
GOOD within %dms: %s. Exiting server.",
+          serviceStatus, System.currentTimeMillis() - startTimeMs, 
ServiceStatus.getStatusDescription());
+      LOGGER.error(errorMessage);
+      // Stop the server so that it will be removed from the Helix cluster
+      stop();

Review Comment:
   that's fair



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to