Chaoran Yu created YUNIKORN-829:
-----------------------------------

             Summary: Produce metrics on queue-level resource utilization
                 Key: YUNIKORN-829
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-829
             Project: Apache YuniKorn
          Issue Type: New Feature
          Components: core - scheduler, shim - kubernetes
            Reporter: Chaoran Yu


YuniKorn already has metrics on the resources requested and allocated for each
queue, but we have no visibility into how much of the allocated resources is
actually being used. Take Spark as an example: an under-optimized job may
request 1 TB of total executor memory while its processing logic uses only
100 GB. As a result, other jobs might not be able to fit in the queue. A metric
that shows real utilization would help members of a queue better understand
their job characteristics and optimize their jobs.

The K8s metrics server exposes metrics on real utilization. YK may be able to
aggregate them to arrive at queue-level stats. This is a K8s-specific
solution, though.
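
To make the aggregation idea concrete, here is a minimal sketch of how per-pod
numbers could roll up into a queue-level utilization ratio. It is only an
illustration, not a proposed implementation: the PodSample shape, the
queue_utilization function, and the queue names are all hypothetical, and the
sample values are made up to mirror the Spark example above. How each pod is
mapped to its queue and where the usage numbers come from are left open here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PodSample:
    """One pod's allocated and measured memory, in bytes (hypothetical shape)."""
    queue: str            # queue the pod runs in, e.g. "root.spark" (assumed known)
    allocated_bytes: int  # memory allocated to the pod by the scheduler
    used_bytes: int       # memory the pod actually uses, per the metrics source


def queue_utilization(samples):
    """Return {queue: used / allocated} for every queue with a nonzero allocation."""
    allocated = defaultdict(int)
    used = defaultdict(int)
    for s in samples:
        allocated[s.queue] += s.allocated_bytes
        used[s.queue] += s.used_bytes
    # Skip queues with nothing allocated to avoid dividing by zero.
    return {q: used[q] / allocated[q] for q in allocated if allocated[q] > 0}


GB = 10**9
# Made-up samples: root.spark holds 1 TB in total but uses only 100 GB.
samples = [
    PodSample("root.spark", 500 * GB, 40 * GB),
    PodSample("root.spark", 500 * GB, 60 * GB),
    PodSample("root.batch", 200 * GB, 150 * GB),
]
print(queue_utilization(samples))
```

A ratio per queue (rather than raw used bytes) keeps the metric comparable
across queues of different sizes, though both could be exported.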



