Hi,

I am working on a cluster with 752 nodes at the University of Freiburg
(Germany).

In this setup, gmetad crashes with a segfault. From /var/log/messages:
Aug 16 20:07:47 monitor kernel: gmetad[38792]: segfault at 0 ip 00007f9b5122d82c sp 00007f9b38a31af0 error 4 in libganglia.so.0.0.0[7f9b51222000+14000]
Aug 16 20:07:47 monitor systemd: gmetad.service: main process exited, code=killed, status=11/SEGV

System: CentOS Linux release 7.2.1511

Ganglia versions:
ganglia-web-3.7.1-2.el7.x86_64
ganglia-3.7.2-2.el7.x86_64
ganglia-gmond-3.7.2-2.el7.x86_64
ganglia-debuginfo-3.7.2-2.el7.x86_64
ganglia-gmetad-3.7.2-2.el7.x86_64

The Ganglia configuration files are attached to this email.

The crash always happens when gmetad's cleanup thread removes hosts that have disappeared:
$ journalctl -u gmetad.service
...
Aug 23 16:58:30 monitor gmetad: Updating host n4262.nemo.privat, metric disk_total
Aug 23 16:58:30 monitor gmetad: Updating host n4262.nemo.privat, metric mem_shared
Aug 23 16:59:00 monitor journal: Suppressed 48289 messages from /system.slice/gmetad.service
Aug 23 16:59:00 monitor gmetad: Cleanup thread running...
Aug 23 16:59:00 monitor gmetad: Cleanup deleting host "n4385.nemo.privat"
Aug 23 16:59:00 monitor gmetad: Cleanup deleting host "n4385.nemo.privat"
Aug 23 16:59:00 monitor kernel: gmetad[68347]: segfault at 160 ip 00007ffb0e9de9a6 sp 00007ffaf6b3da60 error 4 in libganglia.so.0.0.0[7ffb0e9d2000+16000]
Aug 23 16:59:00 monitor systemd: gmetad.service: main process exited, code=killed, status=11/SEGV
Aug 23 16:59:00 monitor systemd: Unit gmetad.service entered failed state.
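
Right before the segfault the same host (n4385.nemo.privat) is deleted twice in a row. My guess, and it is only a guess, is that the second teardown of the host's metric hash walks entries whose key storage has already been freed, which would explain the invalid key pointer that gdb shows inside hash_key() further down. Purely as an illustration of the pattern I have in mind (this is not gmetad code; the struct and function names are invented):

/* Illustrative sketch only -- not gmetad source. Shows why destroying
 * the same hash twice ends up hashing key pointers that were already
 * freed, and the kind of reset that makes a second destroy harmless. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct entry {
    char         *key;        /* metric name, heap-allocated */
    size_t        keylen;
    struct entry *next;
};

struct metric_hash {
    struct entry *head;
};

static void metric_hash_destroy(struct metric_hash *h)
{
    struct entry *e = h->head;
    while (e) {
        struct entry *next = e->next;
        free(e->key);          /* key storage is released here */
        free(e);
        e = next;
    }
    h->head = NULL;            /* without this reset, a second destroy
                                * would walk freed entries and read freed
                                * key pointers -- the kind of invalid
                                * pointer gdb shows inside hash_key() */
}

int main(void)
{
    struct metric_hash h = { .head = NULL };
    struct entry *e = malloc(sizeof *e);
    const char *name = "disk_total";

    e->keylen = strlen(name);
    e->key = malloc(e->keylen + 1);
    memcpy(e->key, name, e->keylen + 1);
    e->next = NULL;
    h.head = e;

    metric_hash_destroy(&h);
    metric_hash_destroy(&h);   /* second cleanup pass: only safe because
                                * head was reset to NULL above */
    puts("double destroy survived");
    return 0;
}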

I started gmetad with gdb to get more information:
gdb /usr/sbin/gmetad
(gdb) run -d 10 -c /etc/ganglia/gmetad.conf

The crash looks like this:
...
Writing Root Summary data for metric mem_shared
Writing Root Summary data for metric proc_run
Cleanup thread running...
Cleanup deleting host "n4527.nemo.privat" 
Cleanup deleting host "n4527.nemo.privat" 

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffdfb00700 (LWP 175149)]
0x00007ffff799f82c in hash_key (seed=0, len=<optimized out>, key=<optimized out>)
    at hash.c:182
182                     seed ^= (uint64_t)*bp++;

Additional gdb information:
(gdb) where
#0  0x00007ffff799f82c in hash_key (seed=0, len=<optimized out>, key=<optimized out>) at hash.c:182
#1  hashval (hash=0x7fffd9a5f290, key=<optimized out>) at hash.c:195
#2  hash_delete (key=<optimized out>, hash=hash@entry=0x7fffd9a5f290) at hash.c:335
#3  0x00007ffff799f927 in hash_destroy (hash=0x7fffd9a5f290) at hash.c:145
#4  0x000000000040f35c in cleanup_source (key=0x7fffd92140d0, val=0x7fffd92143d0, arg=0x7fffdfaffc10) at cleanup.c:170
#5  0x00007ffff799f9d9 in hash_walkfrom (hash=0x62ffc0, from=<optimized out>, func=0x40f219 <cleanup_source>, arg=0x7fffdfaffc10) at hash.c:402
#6  0x000000000040f50b in cleanup_thread (arg=0x0) at cleanup.c:206
#7  0x00007ffff635ddc5 in start_thread (arg=0x7fffdfb00700) at pthread_create.c:308
#8  0x00007ffff608aced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

(gdb) list
177             unsigned char *be = bp + len;           /* beyond end of buffer */
178
179             /* FNV-1a hash; assume we have stdint.h available */
180             while (bp < be) {
181                     /* xor the bottom with the current octet */
182                     seed ^= (uint64_t)*bp++;
183                     /* multiply by the 64 bit FNV magic prime mod 2^64 */
184                     seed *= FNV_64_PRIME;
185             }
186

(gdb) print bp
$1 = (unsigned char *) 0x1 <Address 0x1 out of bounds>
(gdb) print be
$2 = (unsigned char *) 0x161 <Address 0x161 out of bounds>
(gdb) print seed
$3 = 0
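
If I read these values correctly, be - bp = 0x161 - 0x1 = 0x160, so len is 352, and bp (the key pointer) is 0x1, which is clearly not a valid address. So by the time hash_key() runs, the key field of that hash node is already garbage, which again points at freed or overwritten memory. For reference, here is a standalone sketch of the same FNV-1a loop shape (my own code, using the standard FNV-1a constants, not necessarily the exact ones in hash.c); it only faults the way shown above if the key pointer handed to it is bogus:

/* Standalone FNV-1a sketch, same shape as the loop in hash.c.
 * The very first dereference of the key is where an invalid
 * pointer like 0x1 would fault. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FNV1A_64_INIT UINT64_C(0xcbf29ce484222325) /* standard FNV-1a offset basis */
#define FNV_64_PRIME  UINT64_C(0x100000001b3)      /* standard 64-bit FNV prime    */

static uint64_t fnv1a_64(const void *key, size_t len)
{
    const unsigned char *bp = key;        /* start of buffer            */
    const unsigned char *be = bp + len;   /* beyond end of buffer       */
    uint64_t seed = FNV1A_64_INIT;

    while (bp < be) {
        seed ^= (uint64_t)*bp++;          /* faults here if bp is bogus */
        seed *= FNV_64_PRIME;
    }
    return seed;
}

int main(void)
{
    const char *name = "n4527.nemo.privat";
    printf("hash(%s) = %016llx\n", name,
           (unsigned long long)fnv1a_64(name, strlen(name)));
    return 0;
}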

Best regards,
Konrad Meier

Attachment: ganglia-conf.tar.gz (application/gzip)
