On Mon, Jul 07, 2008 at 10:18:09AM +0200, Ulf wrote:
> Hi,
> 
> on my Linux testbox SLES10 SP2 64Bit (python-2.4.2-18.13), I get the 
> following error.
> gmond 3.1.0.1527
> # gmond -m
> [...]
> dev-rootvg-usr-disk_used        Used disk space (module python_module)
> swap_free       Amount of available swap memory (module mem_module)
> Exception in thread Thread-1:
> Traceback (most recent call last):
>   File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
>     self.run()
>   File "/usr/lib64/ganglia/python_modules/tcpconn.py", line 260, in run
>     self.popenChild.wait()
>   File "/usr/lib64/python2.4/popen2.py", line 94, in wait
>     pid, sts = os.waitpid(self.pid, 0)
> OSError: [Errno 10] No child processes

so the netstat command that was started by that module got killed somehow.
what version of python 2.4 do you have installed? and could it be that you
have a lot of connections open on that server?
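
for reference, "[Errno 10] No child processes" is ECHILD: os.waitpid() was
asked to wait for a pid that is no longer a child it can wait on. a minimal
sketch of tolerating that case follows; wait_for_child is a hypothetical
helper, not something in tcpconn.py, and it is written for a current python
rather than 2.4.

```python
import errno
import os


def wait_for_child(pid):
    """Wait for child `pid` (hypothetical helper, not part of tcpconn.py).

    Returns the raw wait status, or None if the child is already gone.
    """
    try:
        _, status = os.waitpid(pid, 0)
    except OSError as err:
        # ECHILD (errno 10 on Linux) means there is no such child left to
        # wait for; any other error is unexpected and is re-raised.
        if err.errno != errno.ECHILD:
            raise
        return None
    return status
```

this only hides the symptom; it does not explain why the child disappeared.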

> The problem seems to be a timing problem, as the error occurs only every 
> second or third call of gmond -m. When the error is not shown, gmond -m 
> waits some time after the last line swap_free.

that is a surprise (at least to me): gmond -m shouldn't need to wait, as it is
just collecting the available metrics. in this case it is probably blocked
waiting for the thread that started that "netstat" call to finish, so both
issues are related.

as a quick workaround to speed up that call (and therefore reduce the latency
and the probability of timeouts), try the attached patch. it adds -n, so that
netstat prints numeric addresses instead of resolving host names.
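
to check whether -n makes a difference on that box, something like the
following could be used to time both variants. this is only a sketch:
time_command is a hypothetical helper, it uses subprocess rather than the
popen2 call in the module, and it is written for a current python.

```python
import subprocess
import time


def time_command(argv):
    """Run argv, returning (elapsed seconds, number of output lines).

    Hypothetical helper for comparing two command lines; not part of gmond.
    """
    start = time.time()
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    # communicate() reads all of stdout and waits for the child to exit
    output, _ = proc.communicate()
    elapsed = time.time() - start
    return elapsed, len(output.splitlines())
```

calling time_command(["netstat", "-t", "-a"]) and then
time_command(["netstat", "-t", "-a", "-n"]) should show how much of the delay
comes from name resolution.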

Carlo
---
Index: gmond/python_modules/network/tcpconn.py
===================================================================
--- gmond/python_modules/network/tcpconn.py     (revision 1527)
+++ gmond/python_modules/network/tcpconn.py     (working copy)
@@ -246,7 +246,7 @@
 
             #Call the netstat utility and split the output into separate lines
             fd_poll = select.poll()
-            self.popenChild = popen2.Popen3("netstat -t -a")
+            self.popenChild = popen2.Popen3("netstat -t -a -n")
             fd_poll.register(self.popenChild.fromchild)
 
             poll_events = fd_poll.poll()
_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers