Matt Massie wrote:
do you mean that if you turn on debugging that gmond doesn't work
anymore?
Nope, the opposite:
If I turn *OFF* debugging, gmond doesn't work anymore.
With debugging on, gmond works as expected. But the lack of daemonization is
a bit of a drag.
When built from *this* 2.6.0 tarball, not so much ... works in debug though:
644383 Jun 3 12:55 ganglia-2.6.0.tar.gz
and when you build the previous snapshot that gmond ONLY works in debug
mode?
If you mean "previous 2.6.0 snapshot," that's the only one I have access to.
what version of glibc are you running on your boxes?
% rpm -qi glibc
sh-2.05# rpm -qi glibc
Name        : glibc                        Relocations: (not relocateable)
Version     : 2.2.4                             Vendor: Red Hat, Inc.
Release     : 19.3                          Build Date: Sat Dec 8 06:14:53 2001
Install date: Fri Dec 5 02:47:04 2003       Build Host: stripples.devel.redhat.com
Group       : System Environment/Libraries  Source RPM: glibc-2.2.4-19.3.src.rpm
Size        : 18049874                         License: LGPL
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
i found that the pthread (LinuxThreads) implementation on linux is a
nightmare. sometimes you'll find the thread stuff in glibc other times
you'll find it in the kernel. you can force it to use older pthread
libraries by doing a...
% export LD_ASSUME_KERNEL="2.2.5"
before you start gmond.
To quote Zak McKracken and the Alien Mindbenders:
"That doesn't seem to work." :)
also, you are compiling gmond on the host it is being run on?
Yes, for the purposes of this test I am compiling and running on the
aforementioned uniprocessor P4 Dell workstation. I've tried both this binary
and a rebuilt binary on the Opteron, no dice.
i think the problem is the way that signals are passed around in
threaded programs... older libraries used USR1 and USR2.. the newer
libraries use "real-time" signals.
That certainly could be the case. I'll see if I can get things going here but
threaded-app debugging isn't my favorite thing in the world so if someone
around here needs to move something heavy...
we may have to remove the thread pool code altogether and just have a
thread per channel (or put in a ./configure flag to override the pools
on broken machines).
I'll see if I can get my hands on other configurations on which I can test...
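For reference, the thread-per-channel idea mentioned above could be sketched roughly like this in C. This is only an illustration under assumptions: `channel_t`, `channel_main`, `run_channels`, and `MAX_CHANNELS` are hypothetical names, not anything taken from the gmond source, and the worker body is a stand-in for a real receive loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CHANNELS 16   /* hypothetical upper bound for this sketch */

/* Hypothetical per-channel state; gmond's real structures differ. */
typedef struct {
    int id;        /* channel index */
    int handled;   /* set to 1 by the worker once it has run */
} channel_t;

/* Thread entry point: one dedicated thread services exactly one channel. */
static void *channel_main(void *arg)
{
    channel_t *ch = arg;
    assert(ch != NULL);
    /* A real daemon would block in a receive loop here. */
    ch->handled = 1;
    return NULL;
}

/* Start one thread per channel, wait for all of them, and return how
 * many channels were serviced, or -1 if n is out of range. */
int run_channels(int n)
{
    pthread_t tid[MAX_CHANNELS];
    channel_t ch[MAX_CHANNELS];
    int started = 0, handled = 0, i;

    if (n < 0 || n > MAX_CHANNELS)
        return -1;

    for (i = 0; i < n; i++) {
        ch[i].id = i;
        ch[i].handled = 0;
        if (pthread_create(&tid[i], NULL, channel_main, &ch[i]) != 0)
            break;   /* stop at the first thread that fails to start */
        started++;
    }

    /* Join only the threads that were actually created. */
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        handled += ch[i].handled;
    }
    return handled;
}
```

Build with `-pthread`. Since there is no shared pool, no worker ever has to be woken up to take on new work; each thread lives and dies with its own channel, at the cost of one thread per channel.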
your message is timely. i was going to send an email out today and try
to get feedback on 2.6.0. so .. it looks like we have a trusted_hosts
IPv4 <=> IPv6 problem and a thread pool problem. any others?
Hats. People aren't wearing enough of them.