Re: [Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-21 Thread Sam Barham
The debug build of 3.6.0 finally crashed over the weekend. The backtrace is:
#0 0x7f042e4ba38c in hash_insert (key=0x7f0425bcc440, val=0x7f0425bcc430, hash=0x7239d0) at hash.c:233
#1 0x00408551 in startElement_METRIC (data=0x7f0425bcc770, el=0x733930 METRIC, attr=0x709270) at

Re: [Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-15 Thread Devon H. O'Dell
If you can install the dbg or dbgsym package for this, you can get more information. If you cannot do this, run:
objdump -d `which gmond` | less
Then, in less, search with:
/40547c
Paste a little context of the disassembly before and after that address, then scroll up and paste which function it's in. (That

Re: [Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-15 Thread Devon H. O'Dell
This is the prologue of some function and the second argument is NULL when it shouldn't be. Unfortunately, the binary does appear to be stripped, so it will be slightly hard to figure out which function it is. Your previous email with the backtrace shows that it is walking the hash tree
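To illustrate the failure mode described above (a function called while walking a hash structure receives a NULL argument it then dereferences), here is a minimal sketch in C. It is not Ganglia's code: `entry_t`, `walk_entries`, and `count_valid` are hypothetical names invented for this example, and a plain linked list stands in for the real hash table. It only shows how a callback can reject a NULL argument and stop the walk instead of crashing.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hash bucket chain; NOT Ganglia's hash API. */
typedef struct entry {
    const char *key;
    void *val;
    struct entry *next;
} entry_t;

/* Callback type: returns 0 to continue the walk, non-zero to stop it. */
typedef int (*visit_fn)(const char *key, void *val, void *arg);

/* Visit every entry in order; stop at the first non-zero return value. */
static int walk_entries(entry_t *head, visit_fn fn, void *arg)
{
    for (entry_t *e = head; e != NULL; e = e->next) {
        int rc = fn(e->key, e->val, arg);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Example callback: counts entries, but refuses NULL arguments
 * instead of dereferencing them. */
static int count_valid(const char *key, void *val, void *arg)
{
    int *count = arg;
    if (key == NULL || val == NULL || count == NULL)
        return -1;
    (*count)++;
    return 0;
}
```

A guard like this only turns a crash into an error return; it does not explain how the NULL got there, which is the question the backtrace is meant to answer.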

Re: [Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-14 Thread Sam Barham
I've finally managed to generate a core dump (the VM wasn't set up to do it yet), but it's 214 MB and doesn't seem to contain anything helpful - especially as I don't have debug symbols. The backtrace shows:
#0 0x0040547c in ?? ()
#1 0x7f600a49a245 in hash_foreach () from
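The message does not say what was missing on the VM. One common cause, assumed here purely for illustration, is a soft core-file size limit of 0. The sketch below shows how a process can raise its own soft `RLIMIT_CORE` to the hard limit using the POSIX `getrlimit`/`setrlimit` calls; `enable_core_dumps` is a hypothetical helper name, not something from gmetad.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit for this process.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

The shell equivalent is `ulimit -c unlimited` in the environment that starts the daemon. If the hard limit itself is 0, neither approach will produce a core file without changing the hard limit as a privileged user.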

[Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-11 Thread Sam Barham
We are using Ganglia to monitor our cloud infrastructure on Amazon AWS. Everything is working correctly (metrics are flowing etc), except that occasionally the gmetad process will segfault out of the blue. The gmetad process is running on an m3.medium EC2 instance, and is monitoring about 50 servers.

Re: [Ganglia-general] gmetad segfaults after running for a while (on AWS EC2)

2014-09-11 Thread Devon H. O'Dell
Are you able to share a core file?
2014-09-11 14:32 GMT-07:00 Sam Barham s.bar...@adinstruments.com:
> We are using Ganglia to monitor our cloud infrastructure on Amazon AWS. Everything is working correctly (metrics are flowing etc), except that occasionally the gmetad process will segfault