At the beginning of the month, I upgraded our system from
zenoss-2.1.2-0.el5.x86_64 to zenoss-2.3.2-178.el5.x86_64.  I had to run
the migrate step a few times by hand, but in the end everything seemed
to be working fine.  The only major difference I noticed is that some
events either changed enough that my event mappings no longer match
them, or had their priorities changed so that they now show up when
they'd have disappeared before.

However, on Tuesday when I got to work I discovered the machine was
having serious trouble.  I realized one partition was full, and
narrowed it down to a few multi-GB zendisc.log files.  After deleting
some of them (I wanted one around so I might see what was wrong) I got
the system running happily again, but at 6pm, when a discovery run was
scheduled, it started flailing again.  Here's a logfile excerpt:

2009-01-21 06:20:16 INFO zen.ZenDisc: Connected to ZenHub
2009-01-21 06:20:18 INFO zen.ZenDisc: Connected to ZenHub
2009-01-21 06:20:18 WARNING zen.ZenDisc: No networks configured
2009-01-21 06:20:18 INFO zen.ZenDisc: Result: []
2009-01-21 06:20:18 INFO zen.ZenDisc: Scan time: 0.00 seconds
2009-01-21 06:20:18 INFO zen.ZenDisc: Scan time: 0.00 seconds
(repeated ad infinitum)
2009-01-21 09:37:08 INFO zen.ZenDisc: Scan time: 0.00 seconds
2009-01-21 09:37:08 INFO zen.ZenDisc: Scan time: 0.00 seconds
2009-01-21 09:37:08 INFO zen.ZenDisc: delete pidfile /opt/zenoss/var/zendisc-localhost.pid
2009-01-21 09:37:08 INFO zen.ZenDisc: Daemon ZenDisc shutting down
2009-01-21 09:37:08 INFO zen.ZenDisc: Scan time: 0.00 seconds
2009-01-21 09:37:08 INFO zen.ZenDisc: zendisc shutting down

Obviously this isn't last night's run :>  After seeing it do this loop
again, I killed it and ran a discovery manually through the web
interface.  This worked fine, and in fact discovered some recently
added hosts that I hadn't thought about (they should have been
discovered shortly after the upgrade, so it's probably been broken since
then).  Thinking that the manual run might have solved things, I deleted
the logfiles and waited for morning.  When I checked this morning, the
same problem had happened again; the log excerpt above shows what it's
doing.  Occasionally the 0.00 is 0.05 or thereabouts.
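
For anyone trying to look inside a multi-GB zendisc.log like this, here
is a minimal sketch of one way to collapse the repeats into counts.  It
assumes every entry has the "date time LEVEL logger: message" layout
seen in the excerpt above; the function names and the command-line
usage are illustrative only and are not part of Zenoss.

```python
import re
import sys
from collections import Counter

# Assumed layout, taken from the excerpt:
#   2009-01-21 06:20:18 INFO zen.ZenDisc: Scan time: 0.00 seconds
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+): (?P<msg>.*)$"
)


def normalize(msg):
    """Replace numbers with N so 'Scan time: 0.00' and '0.05' group together."""
    return re.sub(r"\d+(?:\.\d+)?", "N", msg)


def summarize(lines):
    """Return (count, level, message, first_ts, last_ts) rows, most frequent first.

    Lines that do not match the assumed layout (for example a wrapped
    pidfile path) are skipped.
    """
    counts = Counter()
    first_seen = {}
    last_seen = {}
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None:
            continue
        key = (m.group("level"), normalize(m.group("msg")))
        counts[key] += 1
        first_seen.setdefault(key, m.group("ts"))
        last_seen[key] = m.group("ts")
    return [
        (n, level, msg, first_seen[(level, msg)], last_seen[(level, msg)])
        for (level, msg), n in counts.most_common()
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Reads the file line by line, so a multi-GB log is never held in memory.
    with open(sys.argv[1], errors="replace") as fh:
        for n, level, msg, first, last in summarize(fh):
            print(f"{n:>10}  {level:<8} {msg}  [{first} .. {last}]")
```

Run against a saved copy of the log, this prints one row per distinct
message with its count and its first and last timestamps, which makes
it easier to see when the "Scan time" loop started and how fast it was
spinning.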

A Google search for similar problems only turned up people who hadn't
yet added a network to their list.  I have three, one of which is
discoverable (the other two are private networks reported by some of the
machines, which zenoss wouldn't be able to scan even if I wanted it to).

-- 
Steve Huston - W2SRH - Unix Sysadmin, Dept. of Astrophysical Sciences
  Princeton University  |    ICBM Address: 40.346525   -74.651285
    206 Peyton Hall     |"On my ship, the Rocinante, wheeling through
  Princeton, NJ   08544 | the galaxies; headed for the heart of Cygnus,
    (609) 258-7375      | headlong into mystery."  -Rush, 'Cygnus X-1'
_______________________________________________
zenoss-users mailing list
[email protected]
http://lists.zenoss.org/mailman/listinfo/zenoss-users
