Kinga Marton created YUNIKORN-465:
-------------------------------------

             Summary: scheduler health check REST API
                 Key: YUNIKORN-465
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-465
             Project: Apache YuniKorn
          Issue Type: Bug
            Reporter: Kinga Marton
            Assignee: Kinga Marton


We need to build a health check REST API for the scheduler.
This is needed for chaos monkey tests: the validation script can call the API 
periodically to verify the scheduler state.
We should leverage scheduler metrics to do the validation. Things to validate 
include:
 # Negative resources on a node, app, or cluster
 # Consistency of the data, e.g. the sum of the allocated resources of all apps 
must equal the allocated resource in the partition
 # Critical errors logged in the metrics (things that should not happen but did)
 # ...
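
As a rough illustration of the checks listed above, here is a minimal Go sketch. 
It is not the scheduler's actual code: the Snapshot, Resource, CheckResult and 
HealthReport types, the CheckHealth and HealthHandler functions, and the choice 
of HTTP 503 for an unhealthy state are all assumptions made for this example. 
It only shows how the three validations could be combined into one report that 
a REST handler returns as JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// Resource maps a resource name (e.g. "memory") to a quantity.
// All types in this sketch are hypothetical, not YuniKorn's own.
type Resource map[string]int64

// Snapshot is an assumed point-in-time view of the scheduler state.
type Snapshot struct {
	NodeAllocated      map[string]Resource // allocated resources per node
	AppAllocated       map[string]Resource // allocated resources per app
	PartitionAllocated Resource            // allocated resources in the partition
	CriticalErrors     int64               // count of "should not happen" errors
}

// CheckResult is the outcome of a single validation.
type CheckResult struct {
	Name   string `json:"name"`
	Passed bool   `json:"passed"`
	Detail string `json:"detail,omitempty"`
}

// HealthReport aggregates all validations.
type HealthReport struct {
	Healthy bool          `json:"healthy"`
	Checks  []CheckResult `json:"checks"`
}

// countNegative counts the negative quantities in a set of resources.
func countNegative(byID map[string]Resource) int {
	n := 0
	for _, res := range byID {
		for _, qty := range res {
			if qty < 0 {
				n++
			}
		}
	}
	return n
}

// equal reports whether two resources hold the same quantities.
func equal(a, b Resource) bool {
	for name, qty := range a {
		if b[name] != qty {
			return false
		}
	}
	for name, qty := range b {
		if a[name] != qty {
			return false
		}
	}
	return true
}

// CheckHealth runs the three validations from the list above.
func CheckHealth(s Snapshot) HealthReport {
	report := HealthReport{Healthy: true}
	add := func(name string, passed bool, detail string) {
		if passed {
			detail = ""
		}
		report.Checks = append(report.Checks,
			CheckResult{Name: name, Passed: passed, Detail: detail})
		report.Healthy = report.Healthy && passed
	}

	// 1. Negative resources on node/app/cluster.
	neg := countNegative(s.NodeAllocated) + countNegative(s.AppAllocated) +
		countNegative(map[string]Resource{"partition": s.PartitionAllocated})
	add("negative resources", neg == 0, fmt.Sprintf("%d negative quantities", neg))

	// 2. Sum of app allocations must equal the partition allocation.
	sum := Resource{}
	for _, res := range s.AppAllocated {
		for name, qty := range res {
			sum[name] += qty
		}
	}
	add("consistency", equal(sum, s.PartitionAllocated),
		"sum of app allocations differs from partition allocation")

	// 3. Critical errors recorded in the metrics.
	add("critical errors", s.CriticalErrors == 0,
		fmt.Sprintf("%d critical errors recorded", s.CriticalErrors))
	return report
}

// HealthHandler serves the report as JSON; 503 signals an unhealthy state.
func HealthHandler(snapshot func() Snapshot) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		report := CheckHealth(snapshot())
		w.Header().Set("Content-Type", "application/json")
		if !report.Healthy {
			w.WriteHeader(http.StatusServiceUnavailable)
		}
		_ = json.NewEncoder(w).Encode(report)
	}
}
```

A validation script could then poll whatever path the handler is registered 
under and treat any non-200 response as a failed chaos test run. Returning the 
individual check results, rather than a single boolean, makes it easier to see 
which invariant was violated.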



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org
