I have MRTG configured to write to rrd files and run as a daemon, polling over a dozen devices, and this has been working for quite some time. Yesterday, however, I pulled up my routers2.cgi front-end to look at the data, and found *most* of the graphs blank, with just a few showing completely unbelievable data (steadily climbing usage, for example). In trying to debug it, it looks like the problem is that the RRD files are not being written correctly, or else that MRTG is no longer getting the correct values to put in them:

- When running a command like "rrdtool fetch  AVERAGE", I get output like this:

1478885940: nan nan
1478886000: nan nan
1478886060: nan nan
1478886120: nan nan
1478886180: nan nan
1478886240: nan nan
1478886300: nan nan
1478886360: nan nan
1478886420: nan nan
1478886480: nan nan
1478886540: nan nan
1478886600: nan nan
1478886660: nan nan
1478886720: nan nan
1478886780: nan nan
1478886840: nan nan
1478886900: nan nan

Or, in a *few* isolated cases, I'll see something like this:

1478886180: nan nan
1478886240: nan nan
1478886300: nan nan
1478886360: nan nan
1478886420: 3.4949820633e+07 nan
1478886480: 3.4950327200e+07 nan
1478886540: 3.4950794517e+07 nan
1478886600: 3.4951350633e+07 nan
1478886660: 3.4951841283e+07 nan
1478886720: 3.4952208400e+07 nan
1478886780: 3.4952782750e+07 nan
1478886840: 3.4953294733e+07 nan
1478886900: 3.4953738400e+07 nan
1478886960: 3.4954303633e+07 nan
1478887020: 3.4954859750e+07 nan
1478887080: 3.4955419567e+07 nan
1478887140: 3.4956047750e+07 nan
1478887200: 3.4956552467e+07 nan
1478887260: 3.4957036567e+07 nan
1478887320: 3.4957632633e+07 nan
1478887380: 3.4958179633e+07 nan
1478887440: 3.4958717517e+07 nan
1478887500: nan nan
1478887560: nan nan

Here only the second value is nan (I can't claim to know what the numbers mean).
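To compare files without reading the fetch output by eye, the nan entries can be tallied per column. The following is only a minimal Python sketch: the function names parse_fetch and nan_counts are made up for illustration, and it assumes the output has the "timestamp: value value" shape shown above.

```python
import math


def parse_fetch(text):
    """Parse `rrdtool fetch`-style output into (timestamp, [values]) rows.

    Lines whose prefix before ':' is not a plain integer (for example the
    header line naming the data sources, or blank lines) are skipped.
    """
    rows = []
    for line in text.splitlines():
        stamp, sep, rest = line.partition(":")
        stamp = stamp.strip()
        if not sep or not stamp.isdigit():
            continue
        # float() accepts "nan" as well as ordinary scientific notation.
        values = [float(v) for v in rest.split()]
        rows.append((int(stamp), values))
    return rows


def nan_counts(rows):
    """Return, per column, how many rows hold nan in that column."""
    if not rows:
        return []
    width = max(len(values) for _, values in rows)
    counts = [0] * width
    for _, values in rows:
        for i, value in enumerate(values):
            if math.isnan(value):
                counts[i] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the output above.
    sample = """\
1478886360: nan nan
1478886420: 3.4949820633e+07 nan
1478886480: 3.4950327200e+07 nan
1478887500: nan nan
"""
    rows = parse_fetch(sample)
    print(len(rows), "rows; nan per column:", nan_counts(rows))
```

Feeding it the output for each RRD file in turn would show at a glance which files have one usable column and which have none.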

When running MRTG with debug flags, the snpo flag seems to indicate proper polling operation on all targets, giving output like this:

2016-11-11 09:28:54 -- --snpo: (0) Name crosscheck OK
2016-11-11 09:28:54 -- --snpo: (1) Name crosscheck OK
2016-11-11 09:28:54 -- --snpo: (0) Confcache Match Gi0/2 -> .10102
2016-11-11 09:28:54 -- --snpo: (1) Confcache Match Gi0/2 -> .10102
2016-11-11 09:28:54 -- --snpo: SNMPGet from public@ -- ifName.10102,ifHCInOctets.10102,ifName.10102,ifHCOutOctets.10102
2016-11-11 09:28:54 -- --snpo: SNMPfound -- 'Gi0/7', '12796956213', 'Gi0/7', '93342632190'
2016-11-11 09:28:54 -- --snpo: (0) Name crosscheck OK
2016-11-11 09:28:54 -- --snpo: (1) Name crosscheck OK

...but turning on the "log" flag shows entries like the following:

2016-11-11 08:47:50 -- --log: RRDs::update(/var/www/mrtg/, '1478886468:34950240:805618082')
2016-11-11 08:47:50 -- --log:  got: 34949820.6333333/???
2016-11-11 08:47:50 -- --log: RRDs::update(/var/www/mrtg/, '1478886468:3971345709:119346021279')
2016-11-11 08:47:50 -- --log:  got: ???/???

Presumably the first one is where we get the single-number entries from, while the second is where we get the pure nan entries from. In either case, though, it doesn't appear to be working correctly.
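To see how many targets fall into each case, each RRDs::update line can be paired with the "got:" line that follows it. This is a rough Python sketch, assuming the log lines always have the shape shown above. The name pair_updates and the dictionary keys are invented for illustration and are not part of MRTG.

```python
import re

# Matches: RRDs::update(<path>, '<timestamp>:<first>:<second>')
UPDATE_RE = re.compile(
    r"RRDs::update\((?P<path>[^,]*),\s*"
    r"'(?P<ts>\d+):(?P<first>[^:']+):(?P<second>[^:']+)'\)"
)
# Matches: got: <first>/<second>, where either side may be "???"
GOT_RE = re.compile(r"got:\s*(?P<first>[^/\s]+)/(?P<second>\S+)")


def pair_updates(log_text):
    """Pair each update line with the 'got:' line that follows it."""
    results = []
    pending = None
    for line in log_text.splitlines():
        update = UPDATE_RE.search(line)
        if update:
            pending = {
                "path": update["path"],
                "timestamp": int(update["ts"]),
                "sent": (update["first"], update["second"]),
            }
            continue
        got = GOT_RE.search(line)
        if got and pending is not None:
            pending["got"] = (got["first"], got["second"])
            results.append(pending)
            pending = None
    return results


if __name__ == "__main__":
    # The four log lines quoted above, verbatim.
    log = """\
2016-11-11 08:47:50 -- --log: RRDs::update(/var/www/mrtg/, '1478886468:34950240:805618082')
2016-11-11 08:47:50 -- --log:  got: 34949820.6333333/???
2016-11-11 08:47:50 -- --log: RRDs::update(/var/www/mrtg/, '1478886468:3971345709:119346021279')
2016-11-11 08:47:50 -- --log:  got: ???/???
"""
    for entry in pair_updates(log):
        unknown = sum(1 for value in entry["got"] if value == "???")
        print(entry["timestamp"], entry["sent"], "->", entry["got"],
              f"({unknown} unknown)")
```

Counting the entries by number of "???" values would separate the half-working targets from the ones that store nothing at all.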

I did try removing all the rrd files and letting MRTG re-create them, but it didn't seem to make a difference. Also, please keep in mind that this is happening across dozens of different devices - it's not just a single device.

All configuration files have been generated using cfgmaker, with commands like "/usr/local/bin/cfgmaker --ifref=name --noreversedns --ifdesc=alias --output=/etc/mrtg/main/ravnanc-sw13.cfg"

What might be causing this? How might I debug it? Thanks.

Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293

