If I understand the current behavior correctly, the "Load Avg" for, say, the 1-min
load is simply the sum of the 1-min loads on all nodes divided by the total
number of processors on all nodes.
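To make that arithmetic concrete, here is a minimal sketch of the calculation as I understand it. This is not the actual frontend code; the function name and the sample numbers are made up for illustration, and I am assuming the result is shown as a percentage.

```python
def cluster_load_percent(node_loads, node_cpus):
    """Sum of per-node loads divided by total CPUs, as a percentage.

    node_loads: 1-min load reported by each node (hypothetical input)
    node_cpus:  number of processors on each node (hypothetical input)
    """
    total_cpus = sum(node_cpus)
    if total_cpus == 0:
        # No processors reported; avoid dividing by zero.
        return 0.0
    return 100.0 * sum(node_loads) / total_cpus


# Three nodes with 2, 4, and 2 CPUs and 1-min loads of 2.0, 4.0, and 1.0.
print(cluster_load_percent([2.0, 4.0, 1.0], [2, 4, 2]))  # 87.5
```

With roughly one running job per unit of load, 87.5 here would mean about 7 of the 8 processors are busy.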
Personally, I find this value useful because it is a relatively good indication
of how much of our cluster is being utilized by user-submitted jobs. The 1-min
load is (for the most part) roughly equal to the number of running user jobs on
the nodes. The averages for the 5-min and 15-min loads, however, are not so
useful.
I would vote to make it a config option, but any solution that allows me to keep
the current behaviour would be fine.
-- Rick
--------------------------
Rick Mohr
Systems Developer
Ohio Supercomputer Center
On Thu, 15 Dec 2005, Jason A. Smith wrote:
I was thinking that if some people would prefer the current behavior, it
could either be made into an admin config option or somehow user
selectable on the web frontend. A third option, like you said, would be
to display both. Does anyone have any thoughts on which is the best choice
and how to do it?
~Jason
On Wed, 2005-12-14 at 18:22 -0700, Ian Cunningham wrote:
Jason,
The page should explain that the load averages given are historic. I am
not sure if the average of the 15 min average is meaningful for long
periods of time. Also, it's possible that there should be values given for
both current and historic.
Thanks for your work,
Ian
Jason A. Smith wrote:
The Avg Load percentages on the ganglia web frontend currently show the
latest measured values for the grid/cluster. When looking at historical
data, these numbers can be misleading when compared to the graphs right
next to them. I created a patch which changes this behavior by using
rrdtool to calculate the average loads over the displayed time range
instead of the latest value (see attachment). Any comments or suggestions?
~Jason
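
For anyone following the thread, the difference between the two behaviors can be sketched as below. This is not the attached patch and it does not call rrdtool; it is a plain-Python illustration that assumes the samples have already been fetched as (timestamp, value) pairs, with NaN marking intervals where no data was recorded. The function names and sample values are hypothetical.

```python
import math


def latest_value(samples):
    """Current behavior: the most recent known (non-NaN) value."""
    for _, value in reversed(samples):
        if not math.isnan(value):
            return value
    return float("nan")


def range_average(samples, start, end):
    """Proposed behavior: mean of known values within [start, end]."""
    values = [
        v for t, v in samples
        if start <= t <= end and not math.isnan(v)
    ]
    if not values:
        return float("nan")
    return sum(values) / len(values)


# Hypothetical load percentages sampled once a minute; one sample is missing.
samples = [(0, 10.0), (60, 20.0), (120, float("nan")), (180, 90.0)]

print(latest_value(samples))           # 90.0
print(range_average(samples, 0, 180))  # 40.0
```

A graph of these samples would look mostly idle, so the range average of 40.0 matches it far better than the latest value of 90.0, which is the mismatch described above.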