It seems that the problem stems from aixdisk.conf and its C code.  After I 
renamed aixdisk.conf and restarted gmond on all my hosts, gmond has stayed 
up for over 12 hours.
If anyone needs the core file, let me know!  Thx!
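
For anyone who wants to try the same workaround, here is a minimal sketch of
the rename step. It is only an illustration: the directory passed in, the
".disabled" suffix, and the function name disable_conf are assumptions, not
taken from this setup, and it assumes gmond only picks up files ending in
.conf. gmond still has to be restarted afterwards by whatever method the host
normally uses.

```python
import os


def disable_conf(conf_dir, name="aixdisk.conf", suffix=".disabled"):
    """Rename conf_dir/name so it no longer ends in .conf.

    Returns the new path, or None if the file is not present.
    The suffix is an arbitrary choice for this sketch.
    """
    src = os.path.join(conf_dir, name)
    if not os.path.exists(src):
        return None
    dst = src + suffix
    os.rename(src, dst)
    return dst
```

Renaming instead of deleting keeps the file around so it can be restored once
the module is fixed.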

From: Derek Smith
Sent: Tuesday, September 17, 2013 02:07 PM
To: ganglia-general@lists.sourceforge.net
Subject: gmond core dumping, again on head node.

Ever since my upgrade to 3.6, gmond has been very shaky, to say the least: it 
keeps segfaulting.  I have the core file if needed!  Any help is much 
appreciated!
Thank you!



My ENV is:

AIX
6100-08-03-1339
gmond 3.6.0
gmetad 3.6.0
web front-end "3.5.10";
Server version: Apache/2.4.3 (Unix)
RRDtool 1.4.8  Copyright 1997-2013 by Tobias Oetiker <t...@oetiker.ch>
gmond rrdcache: "/var/lib/ganglia/rrdcached/rrdcached.socket";
gmetad rrdcache: RRDCACHED_ADDRESS=/var/lib/ganglia/rrdcached/rrdcached.socket


Error report details

# cat php-errors.log
[05-Sep-2013 13:59:26 America/Detroit] PHP Notice:  Undefined index: hreg in 
/var/www/htdocs/ganglia3510/ganglia-web-3.5.10/graph_all_periods.php on line 84
[05-Sep-2013 14:05:06 America/Detroit] PHP Notice:  Undefined index: hreg in 
/var/www/htdocs/ganglia3510/ganglia-web-3.5.10/graph_all_periods.php on line 843


CORE FILE NAME
/var/adm/ras/corefiles/core.9371670.17154125
PROGRAM NAME
gmond
STACK EXECUTION DISABLED
           0
COME FROM ADDRESS REGISTER
rmgr_disa FFFFF9B4

PROCESSOR ID
  hw_fru_id: 0
  hw_cpu_id: 4

ADDITIONAL INFORMATION
extend_br 238
extend_br 1E8

Symptom Data
REPORTABLE
1
INTERNAL ERROR
0
SYMPTOM CODE
PCSS/SPI2 FLDS/gmond SIG/11 FLDS/extend_br VALU/238 FLDS/rmgr_disa


Syslog details, core dump 1215-ish ESDT

Sep 17 12:14:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:14:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:14:45 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:14:45 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:14:57 ganglia01ap daemon:info xntpd[4063412]: synchronized to 10.1.1.200, stratum=1
Sep 17 12:15:00 ganglia01ap daemon:notice ConfigRM[7340166]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:  :::Template ID: de84c4db:::Details File:  :::Location: RSCT,IBM.ConfigRMd.C,1.57,347                 :::CONFIGRM_STARTED_ST IBM.ConfigRM daemon has started.
Sep 17 12:15:00 ganglia01ap daemon:err|error ConfigRM[7340166]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:  :::Template ID: 6895a4e3:::Details File:  :::Location: RSCT,IBM.ConfigRMd.C,1.57,506                 :::CONFIGRM_ERROR_ER An internal error was encountered in the configuration manager daemon (IBM.ConfigRMd). Error Code 00018001 Message Catalog Name ct_rmf.cat Message Set 1 Message Identifier 7 Message Inserts 00000005
Sep 17 12:15:00 ganglia01ap daemon:notice ConfigRM[7340168]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:  :::Template ID: de84c4db:::Details File:  :::Location: RSCT,IBM.ConfigRMd.C,1.57,347                 :::CONFIGRM_STARTED_ST IBM.ConfigRM daemon has started.
Sep 17 12:15:00 ganglia01ap daemon:err|error ConfigRM[7340168]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:  :::Template ID: 6895a4e3:::Details File:  :::Location: RSCT,IBM.ConfigRMd.C,1.57,506                 :::CONFIGRM_ERROR_ER An internal error was encountered in the configuration manager daemon (IBM.ConfigRMd). Error Code 00018001 Message Catalog Name ct_rmf.cat Message Set 1 Message Identifier 7 Message Inserts 00000005
Sep 17 12:15:01 ganglia01ap daemon:notice ConfigRM[7340170]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:  :::Template ID: de84c4db:::Details File:  :::Location: RSCT,IBM.ConfigRMd.C,1.57,347                 :::CONFIGRM_STARTED_ST IBM.ConfigRM daemon has started.
Sep 17 12:15:01 ganglia01ap daemon:err|error ConfigRM[7340170]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:  :::Template ID: 6895a4e3:::Details File:  :::Location: RSCT,IBM.ConfigRMd.C,1.57,506                 :::CONFIGRM_ERROR_ER An internal error was encountered in the configuration manager daemon (IBM.ConfigRMd). Error Code 00018001 Message Catalog Name ct_rmf.cat Message Set 1 Message Identifier 7 Message Inserts 00000005
Sep 17 12:15:01 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:15:01 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:15:16 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:15:16 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:15:29 ganglia01ap daemon:info xntpd[4063412]: synchronized to 10.1.1.201, stratum=1
Sep 17 12:15:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:15:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:15:46 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:15:46 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:16:01 ganglia01ap daemon:info xntpd[4063412]: synchronized to 10.1.1.200, stratum=1
Sep 17 12:16:01 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:16:01 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:16:08 ganglia01ap aso:notice aso[15073350]: [HIB] Used entitlement per unfolded vCPU is below threshold (13% of a core).
Sep 17 12:16:08 ganglia01ap aso:notice aso[15073350]: [HIB] Cache optimizations will hibernate until used entitlement is at least 30% of a core per unfolded vCPU
Sep 17 12:16:16 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:16:16 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:16:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
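
The gmetad "failed to contact node" messages in the syslog excerpt are the
part most directly tied to gmond being unreachable. As a rough sketch (not
part of the original report), the snippet below counts those entries per node
and returns the first and last timestamps, which can help bracket the outage
window. The names ENTRY, failed_contacts and summarize are made up for this
example, and the regular expression assumes the syslog layout shown above. It
removes line breaks before matching, since pasted syslog output is often
hard-wrapped, sometimes mid-word.

```python
import re
from collections import Counter

# One gmetad "failed to contact node" entry, matched on text with the
# line breaks removed. Layout assumed from the syslog excerpt above.
ENTRY = re.compile(
    r"(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ \S+ \S*gmetad\[\d+\]:\s*"
    r"data_thread\(\) for \[(\w+)\] failed to contact node "
    r"(\d+\.\d+\.\d+\.\d+)"
)


def failed_contacts(log_text):
    """Return (timestamp, datasource, node) tuples in log order."""
    flat = log_text.replace("\r", "").replace("\n", "")
    return [m.groups() for m in ENTRY.finditer(flat)]


def summarize(log_text):
    """Return (failures per node, first timestamp, last timestamp)."""
    hits = failed_contacts(log_text)
    per_node = Counter(node for _, _, node in hits)
    first = hits[0][0] if hits else None
    last = hits[-1][0] if hits else None
    return per_node, first, last
```

Calling summarize() on the pasted syslog text gives a quick view of which
node gmetad could not reach and over what time span.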
_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
