I figured I would take the next step and see whether converting my RRD files
to a 300-second step with zenstep would change anything. It fails with:
Traceback (most recent call last):
  File "/usr/local/zenoss/Products/ZenRRD/zenstep.py", line 132, in ?
    us.run()
  File "/usr/local/zenoss/Products/ZenRRD/zenstep.py", line 111, in run
    self.process(fullpath)
  File "/usr/local/zenoss/Products/ZenRRD/zenstep.py", line 91, in process
    rrdtool.update(newpath, *[('[EMAIL PROTECTED]' % (t, v)) for t, v in updates])
_rrdtool.error: illegal attempt to update using time 1162098000 when last update time is 1162100160 (minimum one second step)
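For reference, here is a minimal sketch of the rule rrdtool appears to be enforcing: each update must carry a timestamp newer than the last one already stored in the file. This is my own illustration, not code from zenstep.py. The function name `drop_stale_updates` is hypothetical and the sample values are made up; only the two timestamps come from the error message.

```python
def drop_stale_updates(updates, last_update):
    """Keep only (timestamp, value) pairs newer than the last stored update.

    Timestamps are whole seconds since the epoch, so "newer" here means
    strictly greater than the most recent timestamp accepted so far.
    """
    accepted = []
    newest = last_update
    for timestamp, value in sorted(updates):
        if timestamp > newest:
            accepted.append((timestamp, value))
            newest = timestamp
    return accepted


# Timestamps 1162098000 and 1162100160 are from the error message;
# the third timestamp and all values are invented for illustration.
updates = [(1162098000, 1.0), (1162100160, 2.0), (1162100460, 3.0)]
print(drop_stale_updates(updates, last_update=1162100160))
# prints [(1162100460, 3.0)]
```

Filtering like this would only hide the symptom. It does not explain why an update 2160 seconds older than the last one is being replayed in the first place.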
Any ideas?
>>> On 11/4/2006 at 9:25 AM, in message <[EMAIL PROTECTED]>,
"DAVE CUSHING" <[EMAIL PROTECTED]> wrote:
> I have updated the server to request NTP updates every 5 minutes, and that
> seems to have helped somewhat, but I am still getting lots of these error
> messages in the zenperfsnmp log.
>
>>>> On 11/4/2006 at 6:10 AM, in message <[EMAIL PROTECTED]>,
> "DAVE CUSHING" <[EMAIL PROTECTED]> wrote:
>> There are errors in the zenperfsnmp.log such as:
>>
>> 2006-11-04 05:56:52 ERROR zen.RRDUtil: rrd error illegal attempt to update
>> using time 1162637812 when last update time is 1162637895 (minimum one
>> second step)
>> Devices/3116-3548XL-1/os/interfaces/FastEthernet0_39/ifOutErrors_ifOutErrors
>> 2006-11-04 05:56:52 ERROR zen.RRDUtil: rrd error illegal attempt to update
>> using time 1162637812 when last update time is 1162637895 (minimum one
>> second step)
>> Devices/3116-3548XL-1/os/interfaces/FastEthernet0_40/ifOutErrors_ifOutErrors
>>
>> I am going to look at the clock timing on this particular VMware machine and
>> see if that could be causing the problem. If I am having clock sync problems
>> (coincidental with the upgrade), could that be the issue?
>>
>>
>>
>>>>> On 11/3/2006 at 11:41 AM, in message <[EMAIL PROTECTED]>, Eric
>> Newton <[EMAIL PROTECTED]> wrote:
>>> Ok, so we need to figure out why:
>>>
>>> 1) performance information is not getting into the RRD files reliably
>>> 2) there are no warnings/errors in $ZENHOME/log
>>> 3) there are no missing heartbeats
>>>
>>> My guess is that the process is getting some error and that's not being
>>> captured to the log file.
>>>
>>> Try restarting zenperfsnmp, with stderr captured to a file:
>>>
>>> $ zenperfsnmp stop
>>> $ zenperfsnmp run -v 10 --cycle >/tmp/zenperfsnmp.log 2>&1
>>>
>>> But really, I'm still stumped.
>>>
>>> -Eric
>>>
>>>
>>> DAVE CUSHING wrote:
>>>> Yes I do get some gaps:
>>>>
>>>> <!-- 2006-10-31 04:36:00 EST / 1162287360 -->
>>>> <row><v> 1.5885392906e+01 </v></row>
>>>> <!-- 2006-10-31 09:24:00 EST / 1162304640 -->
>>>> <row><v> NaN </v></row>
>>>> <!-- 2006-10-31 14:12:00 EST / 1162321920 -->
>>>> <row><v> NaN </v></row>
>>>> <!-- 2006-10-31 19:00:00 EST / 1162339200 -->
>>>> <row><v> 4.9858200898e+00 </v></row>
>>>> <!-- 2006-10-31 23:48:00 EST / 1162356480 -->
>>>> <row><v> NaN </v></row>
>>>> <!-- 2006-11-01 04:36:00 EST / 1162373760 -->
>>>> <row><v> 1.9944078383e+01 </v></row>
>>>> <!-- 2006-11-01 09:24:00 EST / 1162391040 -->
>>>> <row><v> 1.4984374714e+01 </v></row>
>>>> <!-- 2006-11-01 14:12:00 EST / 1162408320 -->
>>>> <row><v> NaN </v></row>
>>>> <!-- 2006-11-01 19:00:00 EST / 1162425600 -->
>>>> <row><v> NaN </v></row>
>>>>
>>>>
>>>> I am ready to buy a vowel now :)
>>>>
>>>>
>>>>
>>>>
>>>>>>> On 11/1/2006 at 4:18 PM, in message <[EMAIL PROTECTED]>, Eric
>>>> Newton <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> This is very strange.
>>>>>
>>>>> For any other people following along: you didn't need to run zenstep to
>>>>> stay at the 60-second cycle, but you do need to change your cycle time to
>>>>> feed the RRD files at the rate to which they were accustomed, and to set
>>>>> the rate for future RRD file creation.
>>>>>
>>>>> Check the data... do you have numbers in the RRD file? Try:
>>>>>
>>>>> $ rrdtool dump foo_foo.rrd
>>>>>
>>>>> You are looking for gaps of NaN between numeric values in the first list
>>>>> of numbers. You will see what I mean when you look at it. If you
>>>>> like, zip up the RRD file and send it to me (off list) and I'll take a look.
>>>>>
>>>>> You aren't seeing any heartbeat failures, are you?
>>>>>
>>>>> -Eric
>>>>>
>>>>> DAVE CUSHING wrote:
>>>>>
>>>>>> No, there doesn't seem to be any real regularity to it. Here is a
>>>>>> screen shot from an eth0 interface on one of the Linux boxes - all the
>>>>>> items that I monitor have the same gaps in the same places. There is no
>>>>>> unusual network activity that would account for the gaps, so I am at a
>>>>>> loss as to what would have changed between 0.22 and 0.23.
>>>>>>
>>>>>> I did do a zenstep --step=60 --commit when I did the upgrade to 0.23.
>>>>>>
>>>>>> In Monitors -> Performance -> localhost my SNMP cycle interval is set
>>>>>> to 30 seconds with the Config Cycle interval set to 30 minutes.
>>>>>>
>>>>>> It may be worthwhile to know that even though the graphing isn't
>>>>>> occurring, I am not getting flooded with alarms from these servers, so
>>>>>> the monitoring must be occurring on some level.
>>>>>>
>>>>>> Thanks for the help.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>> On 11/1/2006 at 11:35 AM, in message <[EMAIL PROTECTED]>, Eric
>>>>>> Newton <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>
>>>>>>> Hey Dave,
>>>>>>>
>>>>>>> Using a shorter polling time shouldn't be a problem. I run mine every
>>>>>>> 10 seconds.
>>>>>>>
>>>>>>> Are your polling gaps 30 minutes apart by any chance? Do you see any
>>>>>>> regularity in those gaps?
>>>>>>>
>>>>>>> -Eric
>>>>>>>
>>>>>>>
>>>>>>> DAVE CUSHING wrote:
>>>>>>>
>>>>>>>
>>>>>>>> I have noticed that disk utilization on my *nix based servers is no
>>>>>>>> longer being reported. ZenOSS can see the total size of the volume, but
>>>>>>>> does not report utilization. Windows based servers seem to be fine.
>>>>>>>> CPU utilization reporting has changed, and it is acting strangely on my
>>>>>>>> *nix servers as well. CPU utilization seems to be based on idle time
>>>>>>>> now, rather than utilization, and I have many, many alerts (and
>>>>>>>> subsequent clears) each day about servers that are reporting 0 in idle
>>>>>>>> time. These appear to be false positives.
>>>>>>>>
>>>>>>>> Perhaps related to this utilization problem is the fact that I kept
>>>>>>>> the 60 second cycle for polling my devices. It wasn't a problem in the
>>>>>>>> past, but I seem to have gaps in my performance graphs that I did not
>>>>>>>> have before. I think that it is because it is missing some polls,
>>>>>>>> although I see no indication of that in the log files.
>>>>>>>>
>>>>>>>> Any ideas? I would prefer not to increase the polling time, but if it
>>>>>>>> will solve the problems, I will.
>>>>>>>>
>>>>>>>> Thanks in advance for any advice.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>
>>> _______________________________________________
>>> zenoss-users mailing list
>>> [email protected]
>>> http://lists.zenoss.org/mailman/listinfo/zenoss-users