Matt Massie wrote:
do you mean that if you turn on debugging that gmond doesn't work
anymore?

Nope, the opposite:

If I turn *OFF* debugging, gmond doesn't work anymore.

With debugging on, gmond works as expected. But the lack of daemonization is a bit of a drag.

When built from *this* 2.6.0 tarball, not so much ... works in debug though:

 644383 Jun  3 12:55 ganglia-2.6.0.tar.gz


and when you build the previous snapshot, gmond ONLY works in debug
mode?

If you mean "previous 2.6.0 snapshot," that's the only one I have access to.

what version of glibc are you running on your boxes?
% rpm -qi glibc

sh-2.05# rpm -qi glibc
Name        : glibc                        Relocations: (not relocateable)
Version     : 2.2.4                             Vendor: Red Hat, Inc.
Release     : 19.3                          Build Date: Sat Dec 8 06:14:53 2001
Install date: Fri Dec 5 02:47:04 2003       Build Host: stripples.devel.redhat.com
Group       : System Environment/Libraries   Source RPM: glibc-2.2.4-19.3.src.rpm
Size        : 18049874                         License: LGPL
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>



i found that the pthread (LinuxThreads) implementation on linux is a
nightmare.  sometimes you'll find the thread stuff in glibc other times
you'll find it in the kernel.  you can force it to use older pthread
libraries by doing a...

% set LD_ASSUME_KERNEL="2.2.5"

before you start gmond.

To quote Zak McKracken and the Alien Mindbenders:

"That doesn't seem to work."  :)

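For the record, one thing worth ruling out with the LD_ASSUME_KERNEL suggestion: in csh/tcsh, `set` only defines a shell variable, and in sh/bash it sets positional parameters, so neither form puts anything into the environment that gmond inherits. Whether that explains the failure here is not known. A minimal sketch of exporting the variable for a single command follows; the helper name `with_assumed_kernel` is made up for illustration and is not part of ganglia.

```shell
# Hypothetical helper (not part of ganglia): run one command with
# LD_ASSUME_KERNEL in its environment, without changing the calling shell.
with_assumed_kernel() {
    (
        # The subshell keeps the setting from leaking into the caller.
        LD_ASSUME_KERNEL="2.2.5"
        export LD_ASSUME_KERNEL
        "$@"
    )
}

# Usage (assumes gmond is on the PATH):
#   with_assumed_kernel gmond
#
# csh/tcsh equivalent of the export step:
#   setenv LD_ASSUME_KERNEL 2.2.5
```

Wrapping the export in a subshell is deliberate: with the variable exported globally, every dynamically linked program started from that shell is affected, not just gmond.
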
also, you are compiling gmond on the host it is being run on?

Yes, for the purposes of this test I am compiling and running on the aforementioned uniprocessor P4 Dell workstation. I've tried both this binary and a rebuilt binary on the Opteron, no dice.

i think the problem is the way that signals are passed around in
threaded programs... older libraries used USR1 and USR2.. the newer
libraries use "real-time" signals.

That certainly could be the case. I'll see if I can get things going here but threaded-app debugging isn't my favorite thing in the world so if someone around here needs to move something heavy...

we may have to remove the thread pool code altogether and just have a
thread per channel (or put in a ./configure flag to override the pools
on broken machines).

I'll see if I can get my hands on other configurations on which I can test...

your message is timely.  i was going to send an email out today and try
to get feedback on 2.6.0.  so .. it looks like we have a trusted_hosts
IPv4 <=> IPv6 problem and a thread pool problem. any others?

Hats.  People aren't wearing enough of them.


