mluvin-stripe opened a new issue, #17599:
URL: https://github.com/apache/pinot/issues/17599

   ## Problem
   When using the “pause ingestion based on resource utilization” feature 
([docs](https://docs.pinot.apache.org/operators/operating-pinot/pause-ingestion-based-on-resource-utilization)),
a restarted controller's cache of server disk utilization stays empty until 
the ResourceUtilizationChecker periodic task runs. There’s a config 
`controller.resource.utilization.checker.initial.delay` that we can set to zero 
seconds so the checker starts populating the cache immediately, but the 
controller can still start serving requests before the cache is populated, 
since it doesn’t wait for the checker to finish before marking itself as 
ready.
   
   This is a problem for minion-based offline segment generation 
([code](https://github.com/apache/pinot/blob/b4081d6003347020cc5e38eb3c60638c6aa8f1de/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java#L332-L337))
 and offline segment uploads (new feature proposed in 
https://github.com/apache/pinot/issues/17557), since the disk utilization check 
will return UNDETERMINED if the controller’s disk utilization cache isn’t yet 
populated – so the segment creation/upload is allowed to proceed, even if the 
disk threshold has already been breached.
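
To illustrate the gap described above, here is a minimal sketch. It is not Pinot's actual code: `DiskUtilizationCacheSketch`, `CheckResult.PASS`/`FAIL`, the method names, and the 0.9 threshold are all hypothetical stand-ins. Only the UNDETERMINED result and the "proceed unless the check fails" behavior come from the description above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the result of the controller's disk utilization check.
enum CheckResult { PASS, FAIL, UNDETERMINED }

// Hypothetical model of the controller-side cache; empty right after a restart.
class DiskUtilizationCacheSketch {
  // Server instance id -> fraction of disk used (0.0 to 1.0).
  private final Map<String, Double> _diskUsageByServer = new ConcurrentHashMap<>();
  private final double _threshold;

  DiskUtilizationCacheSketch(double threshold) {
    _threshold = threshold;
  }

  // Called by the periodic checker once it has fetched usage from a server.
  void update(String server, double usedFraction) {
    _diskUsageByServer.put(server, usedFraction);
  }

  CheckResult check(String server) {
    Double used = _diskUsageByServer.get(server);
    if (used == null) {
      // No data yet: the check cannot tell whether the threshold is breached.
      return CheckResult.UNDETERMINED;
    }
    return used > _threshold ? CheckResult.FAIL : CheckResult.PASS;
  }

  // Only an explicit FAIL blocks the request, so UNDETERMINED lets it through.
  boolean isSegmentCreationAllowed(String server) {
    return check(server) != CheckResult.FAIL;
  }
}
```

With an empty cache, `isSegmentCreationAllowed` returns true for every server, including one that would fail the check as soon as its usage is recorded.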
   
   ## Solution
   I propose adding an opt-in config 
`controller.resource.utilization.checker.waitDuringStartup` that ensures the 
disk utilization cache is populated before marking the controller as ready. 
This way, the controller is immediately ready to correctly reject segment 
creation/upload requests after starting up.
   
   I was thinking of adding another serviceStatusCallback (like [this 
one](https://github.com/apache/pinot/blob/b4081d6003347020cc5e38eb3c60638c6aa8f1de/pinot-controller/src/main/java/org/apache/pinot/controller/BaseControllerStarter.java#L758)) 
that checks whether the disk utilization cache has been populated yet, and 
doesn’t return GOOD until it is.
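
A rough sketch of what such a callback could look like, under stated assumptions. `StartupStatus`, `StartupStatusCallback`, and `DiskUtilizationCacheReadyCallback` are hypothetical stand-ins defined here so the example is self-contained. They are not Pinot's real service status types, and the `BooleanSupplier` is an assumed hook for "has the checker populated the cache yet?".

```java
import java.util.function.BooleanSupplier;

// Stand-in for the controller's service status values; illustrative only.
enum StartupStatus { STARTING, GOOD }

// Stand-in for a service status callback interface; illustrative only.
interface StartupStatusCallback {
  StartupStatus getServiceStatus();

  String getStatusDescription();
}

// Reports STARTING until the disk utilization cache is populated (opt-in).
class DiskUtilizationCacheReadyCallback implements StartupStatusCallback {
  private final boolean _waitDuringStartup;
  private final BooleanSupplier _cachePopulated;
  // Latched so the callback never flips back to STARTING once it has reported GOOD.
  private volatile boolean _ready;

  DiskUtilizationCacheReadyCallback(boolean waitDuringStartup, BooleanSupplier cachePopulated) {
    _waitDuringStartup = waitDuringStartup;
    _cachePopulated = cachePopulated;
  }

  @Override
  public StartupStatus getServiceStatus() {
    // With the config disabled, keep today's behavior and never block readiness.
    if (!_waitDuringStartup || _ready) {
      return StartupStatus.GOOD;
    }
    if (_cachePopulated.getAsBoolean()) {
      _ready = true;
      return StartupStatus.GOOD;
    }
    return StartupStatus.STARTING;
  }

  @Override
  public String getStatusDescription() {
    return getServiceStatus() == StartupStatus.GOOD
        ? "Disk utilization cache ready"
        : "Waiting for disk utilization cache to be populated";
  }
}
```

Latching the ready state is a design choice in this sketch: it keeps a later cache refresh from marking an already-serving controller as not ready. Whether a timeout is needed for the case where the checker never completes is left open here.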


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

