Re: [zenoss-users] 0.23 oddities?

DAVE CUSHING Sat, 04 Nov 2006 11:31:40 -0800

I figured I would take the next step and see if changing my RRD files to 300 
seconds with zenstep would change anything as well:


Traceback (most recent call last):
  File "/usr/local/zenoss/Products/ZenRRD/zenstep.py", line 132, in ?
    us.run()
  File "/usr/local/zenoss/Products/ZenRRD/zenstep.py", line 111, in run
    self.process(fullpath)
  File "/usr/local/zenoss/Products/ZenRRD/zenstep.py", line 91, in process
    rrdtool.update(newpath, *[('[EMAIL PROTECTED]' % (t, v)) for t, v in updates])
_rrdtool.error: illegal attempt to update using time 1162098000 when last update time is 1162100160 (minimum one second step)

Any ideas?
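
The error itself says that rrdtool refuses any update whose timestamp is not later than the file's last update time. A minimal Python sketch of filtering a replayed update list so that it satisfies that rule, assuming the updates are (timestamp, value) pairs as in the list comprehension in the traceback. The helper name strictly_increasing and the sample values are hypothetical; this is not zenstep's actual code:

```python
def strictly_increasing(updates, last_update):
    """Return the (timestamp, value) pairs rrdtool would accept, in order.

    Pairs are sorted by timestamp. A pair is dropped when its timestamp is
    not strictly greater than the previously accepted one, starting from
    last_update.
    """
    accepted = []
    previous = last_update
    for t, v in sorted(updates, key=lambda pair: pair[0]):
        if t > previous:
            accepted.append((t, v))
            previous = t
    return accepted


# The first timestamp and last_update come from the traceback above.
# The other timestamps and all values are made up for illustration.
last_update = 1162100160
updates = [
    (1162098000, 1.0),  # older than last_update: rejected
    (1162100460, 2.0),
    (1162100160, 3.0),  # equal to last_update: rejected
    (1162100760, 4.0),
]
kept = strictly_increasing(updates, last_update)
print(kept)  # [(1162100460, 2.0), (1162100760, 4.0)]
```

In the traceback the rejected timestamp is 2160 seconds (36 minutes) older than the last update, so dropping stale points this way would hide the symptom rather than explain where the older data came from.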


>>> On 11/4/2006 at 9:25 AM, in message <[EMAIL PROTECTED]>,
"DAVE CUSHING" <[EMAIL PROTECTED]> wrote:
> I have updated the server to request NTP updates every 5 minutes, and that 
> seems to have helped somewhat, but I am still getting lots of these error 
> messages in the zenperfsnmp log.
> 
>>>> On 11/4/2006 at 6:10 AM, in message <[EMAIL PROTECTED]>,
> "DAVE CUSHING" <[EMAIL PROTECTED]> wrote:
>> There are errors in the zenperfsnmp.log such as:
>> 
>> 2006-11-04 05:56:52 ERROR zen.RRDUtil: rrd error illegal attempt to update using time 1162637812 when last update time is 1162637895 (minimum one second step) Devices/3116-3548XL-1/os/interfaces/FastEthernet0_39/ifOutErrors_ifOutErrors
>> 2006-11-04 05:56:52 ERROR zen.RRDUtil: rrd error illegal attempt to update using time 1162637812 when last update time is 1162637895 (minimum one second step) Devices/3116-3548XL-1/os/interfaces/FastEthernet0_40/ifOutErrors_ifOutErrors
>> 
>> I am going to look at the clock timing on this particular VMWare machine and
>> see if that could be causing the problem.  If I am having clock sync problems
>> (coincidental with the upgrade), that could be the issue?
>> 
>> 
>> 
>>>>> On 11/3/2006 at 11:41 AM, in message <[EMAIL PROTECTED]>, Eric
>> Newton <[EMAIL PROTECTED]> wrote:
>>> Ok, so we need to figure out why:
>>> 
>>>     1) performance information is not getting into the RRD files reliably
>>>     2) there are no warnings/errors in $ZENHOME/log
>>>     3) there are no missing heartbeats
>>> 
>>> My guess is that the process is getting some error and that's not being 
>>> captured to the log file.
>>> 
>>> Try restarting zenperfsnmp, with stderr captured to a file:
>>> 
>>>     $ zenperfsnmp stop
>>>     $ zenperfsnmp run -v 10 --cycle >/tmp/zenperfsnmp.log 2>&1
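
The intent of the commands above is to get both stdout and stderr of the daemon into one log file. In a shell the order of redirections matters: the file redirect has to come before 2>&1 for stderr to follow stdout into the file. The same capture can be sketched in Python, where the order question does not arise. The helper name run_and_capture is hypothetical, and the zenperfsnmp arguments in the comment are simply the ones quoted above:

```python
import subprocess
import sys


def run_and_capture(cmd, log_path):
    """Run cmd with stdout and stderr both written to log_path.

    Returns the exit code of the child process.
    """
    with open(log_path, "wb") as log:
        # stderr=subprocess.STDOUT sends stderr to the same file as stdout.
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)


# Intended use, assuming zenperfsnmp is on the PATH:
#   run_and_capture(["zenperfsnmp", "run", "-v", "10", "--cycle"],
#                   "/tmp/zenperfsnmp.log")
```

The call blocks until the child exits, so for a daemon run with --cycle it would normally be started from its own terminal or session.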
>>> 
>>> But really, I'm still stumped.
>>> 
>>> -Eric
>>> 
>>> 
>>> DAVE CUSHING wrote:
>>>> Yes I do get some gaps:
>>>>
>>>>                         <!-- 2006-10-31 04:36:00 EST / 1162287360 -->
>>>> <row><v> 1.5885392906e+01 </v></row>
>>>>                         <!-- 2006-10-31 09:24:00 EST / 1162304640 -->
>>>> <row><v> NaN </v></row>
>>>>                         <!-- 2006-10-31 14:12:00 EST / 1162321920 -->
>>>> <row><v> NaN </v></row>
>>>>                         <!-- 2006-10-31 19:00:00 EST / 1162339200 -->
>>>> <row><v> 4.9858200898e+00 </v></row>
>>>>                         <!-- 2006-10-31 23:48:00 EST / 1162356480 -->
>>>> <row><v> NaN </v></row>
>>>>                         <!-- 2006-11-01 04:36:00 EST / 1162373760 -->
>>>> <row><v> 1.9944078383e+01 </v></row>
>>>>                         <!-- 2006-11-01 09:24:00 EST / 1162391040 -->
>>>> <row><v> 1.4984374714e+01 </v></row>
>>>>                         <!-- 2006-11-01 14:12:00 EST / 1162408320 -->
>>>> <row><v> NaN </v></row>
>>>>                         <!-- 2006-11-01 19:00:00 EST / 1162425600 -->
>>>> <row><v> NaN </v></row>
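
Spotting the NaN rows in a long dump by eye is tedious, so here is a minimal Python sketch that lists the timestamps of rows in which every value is NaN. It assumes each row is preceded by a comment ending in "/ <epoch seconds>", as in the excerpt above; the exact layout of rrdtool dump output may differ between versions, and the function name nan_rows is hypothetical:

```python
import math
import re

# A comment that ends in "/ <epoch>" followed by one <row>...</row> element.
ROW_RE = re.compile(r"<!--[^>]*?/\s*(\d+)\s*-->\s*<row>(.*?)</row>", re.DOTALL)
VALUE_RE = re.compile(r"<v>\s*([^<\s]+)\s*</v>")


def nan_rows(dump_text):
    """Return the epoch timestamps of rows whose values are all NaN."""
    gaps = []
    for epoch, row in ROW_RE.findall(dump_text):
        values = [float(v) for v in VALUE_RE.findall(row)]
        if values and all(math.isnan(v) for v in values):
            gaps.append(int(epoch))
    return gaps


# The first four rows from the excerpt above, with the quote markers removed.
SAMPLE = """
<!-- 2006-10-31 04:36:00 EST / 1162287360 --> <row><v> 1.5885392906e+01 </v></row>
<!-- 2006-10-31 09:24:00 EST / 1162304640 --> <row><v> NaN </v></row>
<!-- 2006-10-31 14:12:00 EST / 1162321920 --> <row><v> NaN </v></row>
<!-- 2006-10-31 19:00:00 EST / 1162339200 --> <row><v> 4.9858200898e+00 </v></row>
"""

print(nan_rows(SAMPLE))  # [1162304640, 1162321920]
```

The rows in this excerpt are 17280 seconds (288 minutes) apart, so each NaN row stands for several hours of missing consolidated data, not a single missed poll.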
>>>>
>>>>
>>>> I am ready to buy a vowel now :)
>>>>
>>>>
>>>>
>>>>>>> On 11/1/2006 at 4:18 PM, in message <[EMAIL PROTECTED]>, Eric
>>>> Newton <[EMAIL PROTECTED]> wrote:
>>>>> This is very strange.
>>>>>
>>>>> For any other people following along: you didn't need to run zenstep to
>>>>> stay at the 60-second cycle, but you do need to change your cycle time to
>>>>> feed the RRD files at the rate to which they were accustomed, and to set
>>>>> the rate for future RRD file creation.
>>>>>
>>>>> Check the data... do you have numbers in the rrdfile?  Try:
>>>>>
>>>>>     $ rrdtool dump foo_foo.rrd
>>>>>
>>>>> You are looking for gaps of NaN between numeric values in the first list
>>>>> of numbers.  You will see what I mean when you look at it.  If you
>>>>> like, zip up the rrd file and send it to me (off list) and I'll take a look.
>>>>>
>>>>> You aren't seeing any heartbeat failures, are you?
>>>>>
>>>>> -Eric
>>>>>
>>>>> DAVE CUSHING wrote:
>>>>>> No, there doesn't seem to be any real regularity to it.  Here is a
>>>>>> screen shot from an eth0 interface on one of the Linux boxes - all the
>>>>>> items that I monitor have the same gaps in the same places.  There is no
>>>>>> unusual network activity that would account for the gaps, so I am at a
>>>>>> loss as to what would have changed between 0.22 and 0.23.
>>>>>>
>>>>>> I did do a zenstep --step=60 --commit when I did the upgrade to 0.23.
>>>>>>
>>>>>> In Monitors -> Performance -> localhost my SNMP cycle interval is set
>>>>>> to 30 seconds with the Config Cycle interval set to 30 minutes.
>>>>>>
>>>>>> It may be worthwhile to know that, even though the graphing isn't
>>>>>> occurring, I am not getting flooded with alarms from these servers, so
>>>>>> the monitoring must be occurring on some level.
>>>>>>
>>>>>> Thanks for the help.
>>>>>>
>>>>>>>>> On 11/1/2006 at 11:35 AM, in message <[EMAIL PROTECTED]>, Eric
>>>>>> Newton <[EMAIL PROTECTED]> wrote:
>>>>>>> Hey Dave,
>>>>>>>
>>>>>>> Using a shorter polling time shouldn't be a problem.  I run mine every
>>>>>>> 10 seconds.
>>>>>>>
>>>>>>> Are your polling gaps 30 minutes apart by any chance?  Do you see any
>>>>>>> regularity in those gaps?
>>>>>>>
>>>>>>> -Eric
>>>>>>>
>>>>>>>
>>>>>>> DAVE CUSHING wrote:
>>>>>>>> I have noticed that my disk utilization on *nix based servers is no
>>>>>>>> longer being reported.  ZenOSS can see the total size of the volume, but
>>>>>>>> does not report utilization.  Windows based servers seem to be fine.
>>>>>>>> CPU utilization reporting has changed, and it is acting strangely on my
>>>>>>>> *nix servers as well.  CPU utilization seems to be based on idle time
>>>>>>>> now, rather than utilization, and I have many, many alerts (and
>>>>>>>> subsequent clears) each day about servers that are reporting 0 in idle
>>>>>>>> time.  These appear to be false positives.
>>>>>>>>
>>>>>>>> Perhaps related to this utilization problem is the fact that I kept
>>>>>>>> the 60 second cycle for polling my devices.  It wasn't a problem in the
>>>>>>>> past, but I seem to have gaps in my performance graphs that I did not
>>>>>>>> have before.  I think that is because it is missing some polls, although
>>>>>>>> I see no indication of that in the log files.
>>>>>>>>
>>>>>>>> Any ideas?  I would prefer not to increase the polling time, but if it
>>>>>>>> will solve the problems, I will.
>>>>>>>>
>>>>>>>> Thanks in advance for any advice.
>>>>>>>>
>>>>>>>>   
>>>>>>>>       
>>> 
>>> _______________________________________________
>>> zenoss-users mailing list
>>> [email protected] 
>>> http://lists.zenoss.org/mailman/listinfo/zenoss-users 
