Greetings, all.

Just started monitoring three Windows and one Linux server with Zenoss 2.0.3.  
Overall, very impressed.

My network topology is probably a bit unusual.  My Windows servers are at a 
third-party hosting provider and sitting behind a firewall.  So I installed 
Hamachi VPN services on all my boxes.  If you're not familiar, Hamachi is a 
pretty cool free VPN/Virtual LAN tool (www.hamachi.cc).  It allows me to use 
SNMP and WMI monitoring securely and without poking holes in firewalls or 
dealing with overly complex VPN solutions.

As I've seen one or two other people report, I've been having a recurring issue 
with zenwin and zenwinmodeler.  When polling my Windows boxes, I'll 
occasionally get the following error:

Code:
ERROR_SEM_TIMEOUT: The semaphore timeout period has expired. (121)



This is a Windows error, I believe (code 121, which matches the 0x00000079 in 
the traceback below).  It's probably due to network slowness or some quirk of 
Hamachi; I don't know which.  The main issue is that it seems to cause zenwin 
and zenwinmodeler to die.  I then receive heartbeat failure emails until they 
restart themselves.

Here's an excerpt from my zenwinmodeler.log right when it happens:

Code:
2007-07-13 17:51:28 ERROR zen.zenwinmodeler: ERROR_SEM_TIMEOUT: The semaphore timeout period has expired. (121)
Traceback (most recent call last):
  File "/usr/local/zenoss/Products/ZenWin/zenwinmodeler.py", line 63, in processLoop
    svcs = self.getServices(name, user, passwd)
  File "/usr/local/zenoss/Products/ZenWin/zenwinmodeler.py", line 97, in getServices
    dev.connect()
  File "/usr/local/zenoss/Products/ZenWin/wmiclient.py", line 51, in connect
    self.flags,self.valueset)
  File "usr/local/zenoss/lib/python/win32com/client.py", line 33, in ConnectServer
    services = pywmi.WBEM_ConnectServer(name, namespace, user, passwd, locale, flags, authority, valueset)
com_error: com_error(121): DOS code 0x00000079
2007-07-13 17:51:28 INFO zen.zenwinmodeler: collecting from my1.server.com using user .\Administrator
2007-07-13 17:53:28 ERROR zen.zenwinmodeler: ERROR_SEM_TIMEOUT: The semaphore timeout period has expired. (121)
Traceback (most recent call last):
  File "/usr/local/zenoss/Products/ZenWin/zenwinmodeler.py", line 63, in processLoop
    svcs = self.getServices(name, user, passwd)
  File "/usr/local/zenoss/Products/ZenWin/zenwinmodeler.py", line 97, in getServices
    dev.connect()
  File "/usr/local/zenoss/Products/ZenWin/wmiclient.py", line 51, in connect
    self.flags,self.valueset)
  File "usr/local/zenoss/lib/python/win32com/client.py", line 33, in ConnectServer
    services = pywmi.WBEM_ConnectServer(name, namespace, user, passwd, locale, flags, authority, valueset)
com_error: com_error(121): DOS code 0x00000079



Also sprinkled throughout the logs are:

Code:
2007-07-13 17:54:28 WARNING zen.zenwinmodeler: skipping my1.server.com has bad wmi state
2007-07-13 17:55:28 WARNING zen.zenwinmodeler: skipping my1.server.com has bad wmi state
2007-07-13 17:56:28 WARNING zen.zenwinmodeler: skipping my1.server.com has bad wmi state



I do seem to be tracking performance metrics on these machines, so I'm not sure 
what the "bad wmi state" thing is about.
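
To get a feel for how often each box is being skipped, something like the 
following could tally the warnings per host.  This is only a rough sketch: it 
assumes each log entry sits on a single line in the format shown above, and 
count_bad_state is a name I made up, not anything that ships with Zenoss.

```python
import re
from collections import Counter

# Pattern for the "bad wmi state" warnings, assuming one entry per line
# in the same format as the excerpt from zenwinmodeler.log.
BAD_STATE = re.compile(
    r"^(\S+ \S+) WARNING zen\.zenwinmodeler: skipping (\S+) has bad wmi state"
)


def count_bad_state(lines):
    """Return a Counter mapping host name to the number of skipped polls."""
    counts = Counter()
    for line in lines:
        match = BAD_STATE.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

Feeding it the lines of the log file (for example, from an open file object) 
would show whether one server accounts for most of the skips.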

Anyway, I don't know what the solution is.  Maybe someone does.  Perhaps the 
Win32 Python libraries could be more forgiving with respect to their semaphore 
timeouts?  Is that configurable somewhere?
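
To illustrate what I mean by "more forgiving", here is a minimal sketch of a 
retry wrapper around the connect call.  It is not how zenwinmodeler works 
today, as far as I know.  FakeComError and connect_with_retry are hypothetical 
names standing in for the real com_error and whatever wraps dev.connect(), and 
the retry count and delay are guesses on my part.

```python
import time

ERROR_SEM_TIMEOUT = 121  # the Windows error code shown in the log


class FakeComError(Exception):
    """Hypothetical stand-in for the com_error seen in the traceback."""

    def __init__(self, code):
        Exception.__init__(self, "com_error(%d)" % code)
        self.code = code


def connect_with_retry(connect, retries=3, delay=5.0, sleep=time.sleep):
    """Call connect(); on a semaphore timeout, wait and try again.

    Any other error, or a timeout after the last retry, is re-raised
    so the caller still sees the failure.
    """
    attempt = 0
    while True:
        try:
            return connect()
        except FakeComError as exc:
            attempt += 1
            if exc.code != ERROR_SEM_TIMEOUT or attempt > retries:
                raise
            sleep(delay * attempt)  # wait a little longer each time
```

The idea is that a single slow response over the VPN would cost a few extra 
seconds instead of taking the whole daemon down.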

Also, are there some settings that could possibly be tweaked in the registry on 
the servers to help in this situation?

Thanks

------------------------
Max Edison




Read this topic online here:
http://community.zenoss.com/forums/viewtopic.php?p=8721#8721
