Re: [mrtg-developers] Fwd: Re: MRTG check scheduling while in daemon mode

Steve Shipway Tue, 05 Oct 2010 16:20:44 -0700

> > 2.       After the initial read of the CFG files, MRTG knows how many
> > Targets there are. Divide the Interval by this to get the interleave.
> 
>       I'd be inclined to use a bigger number than the Target count
>       (maybe twice the count or so), and so create some slack for
>       delays caused by equipment or network incidents.


That makes sense. So rather than taking (Interval / #targets) as the interleave 
period, maybe use (Interval*0.9 / #targets) to give 10% headroom?
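To make the arithmetic concrete, here is a minimal sketch of that calculation. The function name and the headroom parameter are made up for illustration; nothing like this exists in MRTG as far as this thread says.

```python
def interleave_seconds(interval_s, n_targets, headroom=0.10):
    """Gap between successive poll launches.

    `headroom` is the fraction of the interval deliberately left unused
    (0.10 reproduces the Interval*0.9 / #targets suggestion).
    """
    if n_targets <= 0:
        raise ValueError("need at least one target")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    return interval_s * (1.0 - headroom) / n_targets


if __name__ == "__main__":
    # 5 minute interval, 100 targets: roughly 2.7 s between launches,
    # leaving the last 30 s of each interval as slack.
    print(interleave_seconds(300, 100))
```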

>       Adding deliberate jitter to the probe cycle might disturb
>       the interpolation of values for the nominal (interval-aligned)
>       probe instants, or at least give rise to "interesting" aliasing.

This is true; however, if you keep the same schedule order for the targets from 
one interval to the next (i.e. don't recalculate every cycle) then you'll still 
have the 5min gap between successive polls of each target, so the jitter will 
be minimal, provided you don't hit your forks limit.
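A small sketch of why a fixed order keeps the per-target gap constant: each target gets a slot once, and its start time in any cycle is just the cycle start plus a fixed offset. The target names and the helper below are hypothetical, and this ignores the case where the forks limit delays a launch.

```python
def poll_time(cycle, slot, interval_s, gap_s):
    """Nominal start time (seconds since daemon start) for `slot` in `cycle`."""
    return cycle * interval_s + slot * gap_s


interval_s = 300.0
targets = ["router1_1", "router1_2", "switch7_1"]  # hypothetical target names
gap_s = interval_s * 0.9 / len(targets)            # interleave with 10% headroom

# Slots are assigned once and reused every cycle, never recalculated.
slots = {name: i for i, name in enumerate(targets)}

if __name__ == "__main__":
    for name, slot in slots.items():
        t0 = poll_time(0, slot, interval_s, gap_s)
        t1 = poll_time(1, slot, interval_s, gap_s)
        print(f"{name}: {t0:.0f}s then {t1:.0f}s (gap {t1 - t0:.0f}s)")
```

Because the offset is the same in every cycle, the difference between successive polls of one target is always the interval itself, whatever the slot.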

> > it is hard to tell when you're reaching capacity,
> 
>       Depends on what is available by way of fork management.
>       Wouldn't it make sense to (try to) count elapsed, CPU, and
>       wait (disk and network) times for each fork, and derive some
>       estimate of remaining headroom?

I'd say you always know your max forks (as defined by the Forks: setting) and 
how many forks you're currently using (because running checks have not 
completed before the next interleave period starts), so you can identify 
capacity this way.
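As a rough sketch of that idea, assuming the daemon can count its busy forks at each interleave tick (the class and method names here are invented for illustration):

```python
class ForkGauge:
    """Tracks how many of the configured forks are busy at each tick."""

    def __init__(self, max_forks):
        if max_forks <= 0:
            raise ValueError("max_forks must be positive")
        self.max_forks = max_forks  # value of the Forks: setting
        self.peak_busy = 0          # highest busy count seen so far

    def sample(self, busy):
        """Record the current busy-fork count; return utilisation in [0, 1]."""
        self.peak_busy = max(self.peak_busy, busy)
        return busy / self.max_forks

    def headroom(self):
        """Forks never needed so far; 0 means the limit has been reached."""
        return self.max_forks - self.peak_busy


if __name__ == "__main__":
    gauge = ForkGauge(max_forks=4)
    for busy in (1, 3, 2):  # made-up readings, one per interleave tick
        print(f"utilisation {gauge.sample(busy):.2f}")
    print("spare forks at peak:", gauge.headroom())
```

A headroom that sits at 0 for several cycles would be the sign that polls are queuing behind the forks limit.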

> > and (more importantly) it might be hard to do the optimisation
> > that MRTG does where a single device is queried once for all
> interfaces.
> 
>       Probably less a problem than it looks at first sight.
>       The grouping of Targets MRTG already does could surely
>       be exploited as input to the interleaving calculation.

Maybe; I know MRTG will bypass subsequent checks to a device if previous SNMP 
requests failed. This might be harder to do in this new method because by the 
time the SNMP timeout hits, you've already kicked off new threads for the other 
Targets. However, since an SNMP thread in timeout uses minimal resources this 
might not be such an issue, though it would eat up threads...

> > We coded up basically this system here, however it didn't use MRTG in
> daemon
> > mode which negates a lot of the benefits you can gain from daemon
> mode
> 
>       Not only that, but retaining state from run to run may allow
>       Target 'reputation' (based on delays and retries) to be used
>       to tune the interleaving strategy for the actual environment.
>       Without daemon mode, this opportunity would either have to be
>       systematically foregone, or would require cacheing to disk.

Nice idea; if you have an array (preserved between cycles) that holds target 
processing order (with each item separated by the interleave time) then this 
could be re-ordered to optimise?  Of course, if you re-order it too much or too 
often, then you hit the jitter problem you mentioned earlier.

Maybe have this array hold targetname/failcount/skipnextcount; then a failed 
SNMP poll can cancel the /next/ poll for this device, and a subsequent fail can 
cancel the /next two/ polls, and so on...  If you get a fail for a specific 
target, then increment failcount for that target, and set 
skipnextcount=failcount for all targets on the same device.  Then at next 
cycle, if skipnextcount>0 you decrement skipnextcount and skip the poll.  If 
the poll succeeds, you set failcount=0 for all targetnames on this device.

Something like this pseudocode (note that, since the poll is done in a separate 
thread, the actual processing is a little more complex):

Foreach targetname in targetqueue {
  # Still backing off after an earlier failure: use up one skip and move on
  If targetqueue[targetname].skipnextcount > 0 {
    targetqueue[targetname].skipnextcount--;
    next;
  }
  Poll targetname
  If success {
    # Device is answering again: clear the backoff for all of its targets
    Foreach t in targetqueue on same device as targetname {
      targetqueue[t].skipnextcount = 0;
      targetqueue[t].failcount = 0;
    }
  } else {
    # Each further failure lengthens the skip for every target on the device
    # (the loop includes targetname itself)
    targetqueue[targetname].failcount++;
    Foreach t in targetqueue on same device as targetname {
      targetqueue[t].skipnextcount = targetqueue[targetname].failcount;
    }
  }
}
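
For anyone who wants to experiment with the backoff idea, the pseudocode above can be transliterated into a runnable sketch. This is single-threaded Python rather than anything from MRTG itself; the TargetState class, the run_cycle function and the poll callback are all assumptions made for illustration, and the device and target names are made up.

```python
from dataclasses import dataclass


@dataclass
class TargetState:
    device: str             # device this target lives on
    failcount: int = 0      # consecutive failed polls seen via this target
    skipnextcount: int = 0  # polls still to be skipped


def run_cycle(targetqueue, poll):
    """Run one polling cycle over `targetqueue` (dict: name -> TargetState).

    `poll(name)` must return True on success. Returns the names polled.
    """
    polled = []
    for name, state in targetqueue.items():
        if state.skipnextcount > 0:
            state.skipnextcount -= 1  # use up one skip
            continue
        polled.append(name)
        siblings = [s for s in targetqueue.values() if s.device == state.device]
        if poll(name):
            for s in siblings:        # device answers: clear the backoff
                s.skipnextcount = 0
                s.failcount = 0
        else:
            state.failcount += 1
            for s in siblings:        # includes this target itself
                s.skipnextcount = state.failcount
    return polled


if __name__ == "__main__":
    queue = {
        "a1": TargetState("devA"),
        "a2": TargetState("devA"),
        "b1": TargetState("devB"),
    }
    # Pretend devA is down and devB is up.
    print(run_cycle(queue, lambda name: not name.startswith("a")))
```

Note that, as in the pseudocode, a failure on one target also causes later targets on the same device to be skipped within the same cycle, since their skipnextcount is set before the loop reaches them.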

Steve

Steve Shipway
ITS Unix Services Design Lead
University of Auckland, New Zealand
Floor 1, 58 Symonds Street, Auckland
Phone: +64 (0)9 3737599 ext 86487
DDI: +64 (0)9 924 6487
Mobile: +64 (0)21 753 189
Email: [email protected]


_______________________________________________
mrtg-developers mailing list
[email protected]
https://lists.oetiker.ch/cgi-bin/listinfo/mrtg-developers