Yunikorn metrics improvements

Peter Bacsko Fri, 28 Jan 2022 06:08:57 -0800

Hi all,

Recently I started working on enhancing the YuniKorn (YK) metrics. This simply
means collecting more data (counters, statistics, distributions, etc.) that
helps with debugging and troubleshooting various issues, mostly
performance-related. I created YUNIKORN-1049
<https://issues.apache.org/jira/browse/YUNIKORN-1049> for this purpose.


I had an idea that looking at what Hadoop YARN exposes as metrics (
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Metrics.html)
could be a good starting point because of the inherent similarities between
the two projects. Obviously YARN is more than just a scheduler, but it's
still useful as an input.

I documented my findings. It was originally an internal document, which I
have now made public under YUNIKORN-1050
<https://issues.apache.org/jira/browse/YUNIKORN-1050>. It's nowhere near
complete, so feel free to take a look and add your suggestions/comments
(there are already some from Wilfred, Craig, Sunil). It's viewable by
everyone, but suggesting and editing are restricted, so just ask if you
need access.

I didn't want to create too many subtasks under the JIRA, so I only created
some generic ones; those can be broken down later if deemed necessary.

Thanks,
Peter
