Kinga Marton created YUNIKORN-465:
-------------------------------------

             Summary: scheduler health check REST API
                 Key: YUNIKORN-465
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-465
             Project: Apache YuniKorn
          Issue Type: Bug
            Reporter: Kinga Marton
            Assignee: Kinga Marton

We need to build a health check REST API for the scheduler. This is needed for chaos monkey tests: the validation script can call the API periodically to verify the scheduler state.

We should leverage scheduler metrics to do the validation. Things to validate include:
# Negative resources on node/app/cluster
# Consistency of the data, e.g. the sum of the allocated resources of all apps = the allocated resource in the partition
# Critical errors logged in the metrics (things that should not happen but did)
# ...
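
The validations listed above could be sketched roughly as follows. This is only an illustration of the checks, not the scheduler's implementation: the snapshot layout (`cluster`, `nodes`, `apps`, `partition_allocated`, `critical_errors`) and all function names are hypothetical, and the real data would come from the scheduler metrics.

```python
from dataclasses import dataclass, asdict


@dataclass
class CheckResult:
    name: str
    succeeded: bool
    detail: str = ""


def check_negative_resources(scope, resources):
    """Fail if any resource quantity in the given scope is below zero."""
    negatives = {name: qty for name, qty in resources.items() if qty < 0}
    detail = f"negative quantities: {negatives}" if negatives else ""
    return CheckResult(f"no negative resources ({scope})", not negatives, detail)


def check_allocation_consistency(app_allocations, partition_allocated):
    """Fail if the per-app allocations do not add up to the partition total."""
    totals = {}
    for allocation in app_allocations:
        for name, qty in allocation.items():
            totals[name] = totals.get(name, 0) + qty
    mismatched = sorted(
        name
        for name in set(totals) | set(partition_allocated)
        if totals.get(name, 0) != partition_allocated.get(name, 0)
    )
    detail = f"mismatched resource types: {mismatched}" if mismatched else ""
    return CheckResult("app allocations match partition", not mismatched, detail)


def check_critical_errors(error_count):
    """Fail if the metrics recorded any critical error."""
    detail = f"{error_count} critical error(s) recorded" if error_count else ""
    return CheckResult("no critical errors", error_count == 0, detail)


def run_health_checks(snapshot):
    """Run every check against a (hypothetical) metrics snapshot dict."""
    results = [check_negative_resources("cluster", snapshot["cluster"])]
    for node_id, resources in snapshot["nodes"].items():
        results.append(check_negative_resources(f"node {node_id}", resources))
    for app_id, resources in snapshot["apps"].items():
        results.append(check_negative_resources(f"app {app_id}", resources))
    results.append(
        check_allocation_consistency(
            snapshot["apps"].values(), snapshot["partition_allocated"]
        )
    )
    results.append(check_critical_errors(snapshot["critical_errors"]))
    return {
        "healthy": all(r.succeeded for r in results),
        "checks": [asdict(r) for r in results],
    }
```

A REST handler could serialize the returned dictionary as JSON, so that the chaos monkey validation script only needs to poll one endpoint and inspect the `healthy` flag, falling back to the individual `checks` entries when it is false.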