On Monday 05 May 2003 19:31, Federico Sacerdoti wrote:
> On Monday, May 5, 2003, at 05:07 AM, Martin Knoblauch wrote:
> > Hi,
> >
> >  today I upgraded one of our clusters from 2.5.1 to 2.5.3 (gmond,
> > gmetad and the web-frontend). Since then the log-files on the gmetad
> > node get filled with stuff like:
> >
> > headnode /usr/sbin/gmetad[22664]: RRD_update: illegal attempt to update
> > using time 1052131942 when last update time is 1052132440 (minimum
> > one second step)
>
> I think I know what this is. We recently changed the RRD update logic
> to use CLUSTER LOCALTIME
> as the rrd timestamp. This was done in 2.5.3 I believe.
>
> Now if your gmetad has a data source which is another gmetad (port
> 8651), it will try to update its rrds multiple times with the same
> CLUSTER LOCALTIME. Why? Because gmetad only updates its XML every
> 20-30s.
>

 Hmm. There is only one gmetad in our setup. All sources come from various 
gmonds (we have two clusters on ports 8650 and 8652).

> So it is possible for your gmetad to attempt to update its rrds twice
> with the same LOCALTIME timestamp, causing the errors you see in your
> logs.
>

 Actually, I managed to get rid of the messages, and the graphs are OK again. 
It turned out that in the course of a total rebuild of our cluster we forgot 
to synchronize the system clocks on (most of) the nodes. The clocks were 
pretty far apart. Since I restarted ntpd, all looks OK.
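
 For the archives, here is a minimal Python sketch of the rule that the log 
message seems to describe: an update is refused unless its timestamp is at 
least one second newer than the last one. The function name accepts_update 
and the min_step parameter are made up for illustration; this is not the 
rrdtool or gmetad code. The two timestamps are the ones from the log line 
above.

```python
def accepts_update(last_update: int, new_time: int, min_step: int = 1) -> bool:
    """Return True only if new_time is at least min_step seconds after
    last_update (hypothetical helper mirroring the 'minimum one second
    step' wording of the log message)."""
    return new_time - last_update >= min_step


# Timestamps taken from the log line quoted earlier in this mail.
last_update = 1052132440
new_time = 1052131942

# The rejected update is older than the last accepted one.
print(accepts_update(last_update, new_time))      # False
print(last_update - new_time)                     # 498 (seconds behind)

# Updating twice with the same timestamp is refused as well.
print(accepts_update(last_update, last_update))   # False
```

 So the refused update was stamped 498 seconds before the last accepted one, 
which is the kind of gap I would expect from unsynchronized clocks.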

> This is one of those hard-to-anticipate bugs that arise from
> unintended side effects in the system. To fix it, I believe we need to
> use the true localtime when updating rrds for which we are not the
> "authority". (The authority mode is off whenever we get our data
> from another gmetad).
>

 Seems the time-keeping is a bit touchy :-)

Martin
-- 
----------------------------------
Martin Knoblauch
[EMAIL PROTECTED]
http://www.knobisoft.de
