An excellent analysis! I also like your solution. A couple of extra floating point operations per graph won't make a bit of difference in the integer-heavy PHP code.

I will incorporate your suggestion in the webfrontend CVS.

Federico

On Monday, March 31, 2003, at 04:22 AM, Phil Radden wrote:

I recently noticed that sometimes my 'month' graphs were being plotted
very blockily - with very few data points - at certain times. After a bit
of diagnostic work using rrdtool fetch, I discovered that for a period of
maybe thirty minutes each day, it would return only 30ish data points,
instead of 240.  There then followed a mind-bending discussion on the
rrdtool developers list which you're welcome to dig out!

It turns out that, by design, rrdtool will favour 'coverage over
resolution' - so if your 'year' rra provides better coverage for the
requested interval than the 'month' rra, the year one will be used, even
though the month one is much higher resolution.

The (default) RRD definition used by ganglia is
  RRA:AVERAGE:0.5:1:240
  RRA:AVERAGE:0.5:24:240
  RRA:AVERAGE:0.5:168:240
  RRA:AVERAGE:0.5:672:240
  RRA:AVERAGE:0.5:5760:370
with a step of 15, and the problem comes because 5760 is not a multiple
of 672. rrdtool populates the rra in chunks, and will commit a new value
to, say, the 'month' rra, when (time % 672x15)==0.
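As a rough check of the numbers above, here is a small Python sketch (Python
rather than PHP purely for illustration) that works out, for each RRA, how
many seconds one row covers and how long the whole RRA is. The labels 'hour',
'day' and so on are my own shorthand, not part of the RRD definition, and the
helper names are hypothetical.

```python
STEP = 15  # seconds per primary data point, as stated in the post

# (label, pdp_per_row, rows) copied from the RRA lines quoted above.
# The labels are illustrative shorthand only.
RRAS = [
    ("hour", 1, 240),
    ("day", 24, 240),
    ("week", 168, 240),
    ("month", 672, 240),
    ("year", 5760, 370),
]


def row_seconds(pdp_per_row, step=STEP):
    """Seconds covered by one consolidated row of an RRA."""
    return pdp_per_row * step


def span_seconds(pdp_per_row, rows, step=STEP):
    """Total seconds covered by all rows of an RRA."""
    return row_seconds(pdp_per_row, step) * rows


for label, pdp, rows in RRAS:
    print(label, row_seconds(pdp), span_seconds(pdp, rows))

# 5760 is not a multiple of 672, which is the root of the problem.
print("5760 % 672 =", 5760 % 672)
```

The 'month' row length comes out as 10080 seconds and the 'month' span as
2419200 seconds (28 days), which matches the figures used later in the post.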

However, sometimes,
  (time % 672x15) > (time % 5760x15)
meaning there is more 'uncommitted' time in the 'month' rra than in the
'year' rra, so the 'year' rra is more up to date.

And the catch comes because graph.php always requests time periods ending
NOW.  So, sometimes the 'year' rra gets closer to now than the 'month'
rra, and so it uses that instead and you get an ugly graph.  Hurrah!
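The comparison above can be sketched in a few lines of Python. This is only a
simplified model of the inequality in the post, not rrdtool's actual RRA
selection code, and the function names are hypothetical.

```python
STEP = 15
MONTH_ROW = 672 * STEP   # 10080 seconds per 'month' row
YEAR_ROW = 5760 * STEP   # 86400 seconds per 'year' row


def uncommitted(now, row_seconds):
    """Seconds elapsed since the last row boundary for this RRA."""
    return now % row_seconds


def year_is_fresher(now):
    """True when the 'year' RRA has less uncommitted time than 'month'."""
    return uncommitted(now, MONTH_ROW) > uncommitted(now, YEAR_ROW)


def seconds_year_fresher(day_index):
    """Count the seconds in one 86400-second day where 'year' is fresher."""
    start = day_index * YEAR_ROW
    return sum(1 for t in range(start, start + YEAR_ROW) if year_is_fresher(t))


# Just after a 'year' boundary the 'month' RRA is still mid-row.
print(year_is_fresher(86400 + 100))
# Once the 'month' RRA has committed again, it is the fresher one.
print(year_is_fresher(86400 + 5000))
```

In this model the length of the bad window depends on where the 'year'
boundary falls inside the current 'month' row, so it varies from day to day.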

There are two approaches to fixing this. The first is to change the RRD
definition, so each RRA uses a time period which is a multiple of the
previous RRAs.  This is a pain for those of us with many thousands of
RRDs!

My suggested approach is to amend graph.php so that, rather than 'now' as
the endpoint, it uses the most recent exact multiple of the sample time
period, such as 10080 (672x15) in the 'month' case.  This is guaranteed
to be covered by the highest resolution rra, so that will therefore be
used. You only need to apply this correction for 'month' plots given the
current RRD definition, but I've done all of them - it stops you getting
that blank final column :)

(My graph.php is locally modified, so here's example code rather than a
patch)

==============================================================
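// Pick the row length in seconds (pdp_per_row x 15s step) of the RRA
// that should serve each standard graph range.
switch ($start) {
case -3600:     $round = 15; break;      // hour:  1 x 15
case -86400:    $round = 360; break;     // day:   24 x 15
case -604800:   $round = 2520; break;    // week:  168 x 15
case -2419200:  $round = 10080; break;   // month: 672 x 15
case -31449600: $round = 86400; break;   // year:  5760 x 15
default:        $round = 0;
}
// End on the most recent multiple of that row length; for any other
// range keep the old behaviour of ending at "N" (now).
if ($round > 0) { $end = floor(time() / $round) * $round; }
else            { $end = "N"; }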

$command = RRDTOOL . " graph - --start $start --end $end ".
[etc. etc.]
==============================================================
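For anyone who wants to check the rounding outside graph.php, here is a rough
Python port of the switch statement above. It is a sketch only: the names
ROUND_FOR_START and graph_end are hypothetical, and the optional 'now'
argument exists just so the result can be checked against a fixed timestamp.

```python
import time

# Same mapping as the PHP switch: graph start offset -> row length in seconds.
ROUND_FOR_START = {
    -3600: 15,
    -86400: 360,
    -604800: 2520,
    -2419200: 10080,
    -31449600: 86400,
}


def graph_end(start, now=None):
    """Return the end time to use for a graph with the given start offset."""
    if now is None:
        now = int(time.time())
    rnd = ROUND_FOR_START.get(start, 0)
    if rnd > 0:
        # Most recent exact multiple of the row length at or before 'now'.
        return (now // rnd) * rnd
    # Unknown range: fall back to "N", as the PHP code does.
    return "N"


end = graph_end(-2419200)
# The rounded end always sits on a 'month' row boundary.
print(end, end % 10080)
```

Because the end lands on a row boundary of the 'month' RRA, there is no
uncommitted time left in that RRA at the requested end point.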

WARNING: our 'month' rra is only 28 days long; it was tempting to apply
the same backwards correction to the start time as the end time, but if
you do this, you'll fall off the _beginning_ of the rra, and once again
rrdtool will give you the year plot, instead of the month one!

Phil



_______________________________________________
Ganglia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-developers

Federico

Rocks Cluster Group, SDSC, San Diego, CA

