Since the 4.9.0 upgrade, I see this popping up on all of my boxes:

  Jan 13 20:35:39 server collectd[8501]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/server/processes-httpd/ps_disk_octets.rrd) failed: not a simple integer: '-1719325917'

It's not happening every collectd interval, but roughly once every 3-7 intervals, and it always seems to be the ps_disk_octets metric. Here's a grab of the non-"nan" rows for "processes-httpd/ps_disk_octets.rrd MAX":

  1263426420: 8.9397714667e+06 8.8867349333e+06
  1263426480: 8.9397714667e+06 8.8867349333e+06
  1263426540: 2.8722361756e+06 2.8611358122e+06
  1263426600: 2.8722361756e+06 2.8611358122e+06
  1263429840: 6.9840988933e+06 6.9629671933e+06
  1263429900: 6.9840988933e+06 6.9629671933e+06
  1263429960: 4.1813118933e+06 4.1651611100e+06
  1263430020: 2.4956140633e+06 2.4853952000e+06
  1263432180: 1.5469198067e+07 1.5460964333e+07
  1263432240: 3.5285845300e+06 3.5101065467e+06

There are big holes there, and the 'nan' rows make up about 60% of the file. The biggest recorded value is 28858203.767. The lowest number reported in the error message is -2147483522 (the values range all the way up to -5). Presumably something's overflowing :)

Other background: These are all Debian Etch, running collectd 4.9.0. They're all 32-bit boxes, all running fairly new Linux kernels, all with "CONFIG_TASK_IO_ACCOUNTING=y". The example above is from a box running 2.6.32.3, but I see this happening on other boxes regardless of the kernel (even down to 2.6.29.x and beyond).

The above example is a pretty heavily loaded web server. Though it's serving *only* read-only web traffic, it does write a good deal of logs out, so it's not impossible for it to have very high IO numbers.

This isn't a big deal, just a minor annoyance, but I figured I'd mention it.
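
To illustrate the overflow guess, here is a minimal sketch. It assumes (this is not confirmed) that the I/O byte counter gets squeezed through a signed 32-bit integer somewhere before it reaches rrdtool. The helper names `wrap_int32` and `unwrap_candidate` are made up for this example and are not part of collectd. The sketch only shows that the logged negative values are what a counter above 2^31 would look like after such a truncation:

```python
def wrap_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    low = value & 0xFFFFFFFF
    return low - 0x100000000 if low >= 0x80000000 else low


def unwrap_candidate(logged: int) -> int:
    """Smallest non-negative counter value that would wrap to the logged value.

    Hypothetical helper: the real counter could be larger by any multiple of 2**32.
    """
    return logged & 0xFFFFFFFF


# Negative values quoted from the error messages in this report.
for logged in (-1719325917, -2147483522, -5):
    counter = unwrap_candidate(logged)
    # Round trip: truncating the candidate counter reproduces the logged value.
    assert wrap_int32(counter) == logged
    print(f"{logged:>12} could be a counter of {counter} bytes")
```

For the value in the log line above, the smallest matching counter would be 2575641379 bytes (about 2.4 GiB), which a busy web server writing logs could plausibly reach. That fits the overflow theory but does not prove where the truncation happens.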