If I understand the current behavior correctly, the "Load Avg" for, say, the 1-min load is simply the sum of the 1-min loads on all nodes divided by the total number of processors on all nodes.
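
To make that concrete, here is a minimal Python sketch of the calculation as I understand it. It is not taken from the ganglia source; the node names, load values, and the function name `cluster_load_fraction` are made up for illustration.

```python
# Hypothetical per-node samples: node name -> (1-min load, number of CPUs).
# These values are invented for illustration, not real cluster data.
nodes = {
    "node01": (3.8, 4),
    "node02": (2.1, 4),
    "node03": (0.1, 2),
}


def cluster_load_fraction(nodes):
    """Return the sum of per-node loads divided by the total CPU count."""
    total_load = sum(load for load, _ in nodes.values())
    total_cpus = sum(cpus for _, cpus in nodes.values())
    if total_cpus == 0:
        # No processors reported; avoid dividing by zero.
        return 0.0
    return total_load / total_cpus


if __name__ == "__main__":
    # Print the cluster-wide figure as a percentage.
    print(f"{cluster_load_fraction(nodes):.0%}")
```

With one job per processor, a figure near 100% would mean the cluster is roughly full, which is why the 1-min number tracks the count of running jobs.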

Personally, I find this value useful because it is a relatively good indication of how much of our cluster is being utilized by user-submitted jobs. The 1-min load is (for the most part) roughly equal to the number of running user jobs on the nodes. The averages for the 5-min and 15-min loads, however, are not so useful.

I would vote to make it a config option, but any solution that allows me to keep the current behaviour would be fine.

-- Rick

--------------------------
Rick Mohr
Systems Developer
Ohio Supercomputer Center

On Thu, 15 Dec 2005, Jason A. Smith wrote:

I was thinking that if some people would prefer the current behavior, it
could either be made into an admin config option or somehow user
selectable on the web frontend.  A third option, like you said, would be
to display both.  Does anyone have any thoughts on which is the best
choice and how to do it?

~Jason


On Wed, 2005-12-14 at 18:22 -0700, Ian Cunningham wrote:
Jason,

The page should explain that the load averages given are historic. I am
not sure if the average of the 15 min average is meaningful for long
periods of time. Also, it's possible that there should be values given for
both current and historic.

Thanks for your work,
Ian

Jason A. Smith wrote:

The Avg Load percentages on the ganglia web frontend currently show the
latest measured values for the grid/cluster.  When looking at historical
data, these numbers can be misleading when compared to the graphs right
next to them.  I created a patch which changes this behavior by using
rrdtool to calculate the average loads over the displayed time range
instead of the latest value; see attachment.  Any comments or suggestions?
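
As a rough illustration of the difference between the two numbers, here is a minimal Python sketch. It is not the patch itself, which uses rrdtool; the sample data and the function names `latest_value` and `average_over_range` are hypothetical, and unknown samples are represented as `None`.

```python
# Hypothetical (timestamp, load fraction) samples; None marks an unknown value.
samples = [(0, 0.2), (60, 0.4), (120, None), (180, 0.9)]


def latest_value(samples):
    """Return the most recent known value (roughly what is shown today)."""
    for _, value in reversed(samples):
        if value is not None:
            return value
    return None


def average_over_range(samples, start, end):
    """Return the mean of known values with start <= timestamp <= end."""
    values = [v for t, v in samples if start <= t <= end and v is not None]
    if not values:
        return None
    return sum(values) / len(values)


if __name__ == "__main__":
    print("latest: ", latest_value(samples))
    print("average:", average_over_range(samples, 0, 180))
```

In this made-up data the latest value is well above the average over the range, which is the kind of mismatch with the graphs that I mean.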

~Jason



