[ https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637915#comment-16637915 ]
Wilfred Spiegelenburg commented on YARN-8842: --------------------------------------------- Thank you for splitting this of from the pre-emption jira. I tried to apply the patch on top of YARN-8750 but it did not work... Beside that I went through the code as a diff and already have a couple of remarks: * You are always creating a new QueueMetricsForCustomResources. In the case that we just have two resources (mem and vcores) we will never use it and all metrics will stay zero. We do have the full overhead of making all the calls. From a performance perspective would it not be better to determine if we have custom resources configured in the constructor and set a flag to update the custom metrics or not? We can then make updating the custom resource metrics conditional which would keep the performance impact in most clusters to a minimum. * In {{updatePreemptedSecondsForCustomResources()}} you directly set the parent metrics but you do not take into account that there might be multiple levels above the queue you are working on. We need to call {{parent.updatePreemptedSecondsForCustomResources()}} * In {{getAllocatedResources()}} you're using BuilderUtils if you have no custom resources and a new resource object directly if you have. We should either use BuilderUtils in both cases or neither for consistency. * Again {{getAllocatedResources()}} when we have custom resource types (even if none are allocated) I would expect a metrics object back showing all the resource types not just memory and vcores. There should also be no need for {{isThereAnyAllocatedResource}} the Resource handles that itself. * In the test cases I could not find any multiple queue levels tests. root -> parent -> leaf. At least we should check that case with the root queue metrics to catch the issue in the {{updatePreemptedSecondsForCustomResources}} case from above. I'll dig further into the tests when we have a patch that applies. > Update QueueMetrics with custom resource values > ------------------------------------------------ > > Key: YARN-8842 > URL: https://issues.apache.org/jira/browse/YARN-8842 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: Szilard Nemeth > Assignee: Szilard Nemeth > Priority: Major > Attachments: YARN-8842.001.patch > > > This is the 2nd dependent jira of YARN-8059. > As updating the metrics is an independent step from handling preemption, this > jira only deals with the queue metrics update of custom resources. > The following metrics should be updated: > * allocated resources > * available resources > * pending resources > * reserved resources > * aggregate seconds preempted -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org