The only thing in /usr/local/etc/conf.d/ is modpython.conf. Given your guidance I think I've figured things out. It does appear that the python modules get loaded twice (actually 3 times in my case). The first time is in gmond.conf, where I have it in the modules section:

modules {
  module {
    name = "core_metrics"
  }
  module {
    name = "python_module"
    path = "/usr/local/lib64/ganglia/modpython.so"
    params = "/usr/local/lib64/ganglia/python_modules/"
  }
  ...
}

At the end of /etc/ganglia/gmond.conf I have two include lines:

include ("/usr/local/etc/conf.d/*.conf")
include('/etc/ganglia/conf.d/*.pyconf')

The first line includes the file /usr/local/etc/conf.d/modpython.conf. This file has the following lines:

[root@home4 ganglia]# more /usr/local/etc/conf.d/modpython.conf
/*
  params - path to the directory where mod_python
           should look for python metric modules

  the "pyconf" files in the include directory below
  will be scanned for configurations for those modules
*/
modules {
  module {
    name = "python_module"
    path = "modpython.so"
    params = "/usr/local/lib64/ganglia/python_modules"
  }
}

include ("/etc/ganglia/conf.d/*.pyconf")

So it looks like the python modules get loaded 3 times (once for the first include, a second time for the include line in the file /usr/local/etc/conf.d/modpython.conf, and then a third time for the second include line in gmond.conf).

Therefore, I erased the module lines in gmond.conf so that I don't load them. I also erased the include line at the end of gmond.conf pointing to /etc/ganglia/conf.d/*.pyconf. The only include line in gmond.conf is the following:

include ("/usr/local/etc/conf.d/*.conf")

You can find my current gmond.conf file here: http://pastebin.com/FJ2WAC4D

In the file /usr/local/etc/conf.d/modpython.conf, I commented out the last line, which is an include line pointing to /etc/ganglia/conf.d/*.pyconf.
The file now simply reads:

/*
  params - path to the directory where mod_python
           should look for python metric modules

  the "pyconf" files in the include directory below
  will be scanned for configurations for those modules
*/
modules {
  module {
    name = "python_module"
    path = "modpython.so"
    params = "/usr/local/lib64/ganglia/python_modules"
  }
}

I think all of this means that python modules only get loaded once, when gmond.conf does the include that points to /usr/local/etc/conf.d/*.conf.

Note - this file looks like:

/*
  params - path to the directory where mod_python
           should look for python metric modules

  the "pyconf" files in the include directory below
  will be scanned for configurations for those modules
*/
modules {
  module {
    name = "python_module"
    path = "/usr/local/lib64/ganglia/modpython.so"
    params = "/usr/local/lib64/ganglia/python_modules"
  }
}

I think this should fix the problem so I tried running gmond interactively:

/usr/local/sbin/gmond -d 5 -c /etc/ganglia/gmond.conf

I still get a segfault.

As an aside, this is just an experiment so I can learn about writing python modules in Ganglia. Therefore I'm not too concerned about the location of configuration files since it's temporary. But I followed all of the defaults in ganglia about installing the code to /usr/local. I did create the directory /etc/ganglia since I wanted all ganglia related files to be in one location rather than spread across all of /etc (it may not be "FHS compliant" but it's a practice I have developed over the years).

In general I followed this blog:

http://sachinsharm.wordpress.com/tag/installing-ganglia/

for building and installing ganglia. Everything worked just fine until I followed this blog:

http://sachinsharm.wordpress.com/2013/08/19/setup-and-configure-ganglia-python-modules-on-centosrhel-6-3/

for configuring Python modules. But I backed out all of the changes in that blog so that I was starting in a clean configuration.

Thanks for the help!
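
For anyone chasing the same kind of double load, the include chain described above can be walked mechanically instead of by eye. What follows is only a rough sketch, not anything that ships with Ganglia: the names scan and strip_comments are made up for illustration, the regular expressions are a loose approximation of the gmond.conf syntax rather than a real parser, and it assumes relative include patterns are resolved against the including file's directory (every path in this thread is absolute, so that assumption is not exercised here).

```python
import glob
import os
import re
import sys
from collections import Counter

# Quoted strings are matched first so that a glob such as "conf.d/*.conf"
# is not mistaken for the start of a /* ... */ comment.
_STRING_OR_COMMENT = re.compile(
    r'("(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\')'  # group 1: quoted string (kept)
    r'|/\*.*?\*/'                                   # block comment (dropped)
    r'|(?:#|//)[^\n]*',                             # line comment (dropped)
    re.DOTALL,
)
_INCLUDE = re.compile(r'\binclude\s*\(\s*(["\'])(.+?)\1\s*\)')
_MODULE_NAME = re.compile(r'\bmodule\s*\{[^{}]*?\bname\s*=\s*"([^"]+)"')


def strip_comments(text):
    """Remove comments while leaving quoted strings untouched."""
    return _STRING_OR_COMMENT.sub(lambda m: m.group(1) or "", text)


def scan(path, files=None, modules=None, stack=()):
    """Follow include() directives from `path` and count what gets pulled in.

    Returns two Counters: how often each file is read, and how often each
    module name is declared across all of those reads.
    """
    files = Counter() if files is None else files
    modules = Counter() if modules is None else modules
    path = os.path.abspath(path)
    if path in stack:  # include cycle: stop instead of recursing forever
        return files, modules
    files[path] += 1
    with open(path) as fh:
        text = strip_comments(fh.read())
    modules.update(_MODULE_NAME.findall(text))
    for _quote, pattern in _INCLUDE.findall(text):
        # Assumption: relative patterns are relative to the including file.
        pattern = os.path.join(os.path.dirname(path), pattern)
        for child in sorted(glob.glob(pattern)):
            scan(child, files, modules, stack + (path,))
    return files, modules


if __name__ == "__main__" and len(sys.argv) > 1:
    seen_files, seen_modules = scan(sys.argv[1])
    for label, counts in (("file", seen_files), ("module", seen_modules)):
        for name, n in sorted(counts.items()):
            if n > 1:
                print(f"{label} {name} is pulled in {n} times")
```

Saved under any name and run with /etc/ganglia/gmond.conf as its only argument, it prints one line per file or module name that is reached more than once and stays silent when there are no duplicates.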
You have been very patient and I really appreciate it.

Jeff

Maciej Lasyk wrote:
> Ok so from that I can see that you're including:
>
> include ("/usr/local/etc/conf.d/*.conf")
>
> include('/etc/ganglia/conf.d/*.pyconf')
>
> Could you recheck what conf files you have in /usr/local/etc/conf.d/ ?
>
> Next thing - why are you building those packages without setting any
> proper (FHS like) directories
> (http://pl.wikipedia.org/wiki/Filesystem_Hierarchy_Standard)?
>
> I'm almost sure that there is some configuration issue there
>
> On Sun, Feb 09, 2014 at 06:17:04PM -0500, Jeff Layton wrote:
>> Sure thing - I appreciate the help.
>>
>> Build options:
>> ./configure --with-gmetad
>>
>> gmond.conf: http://pastebin.com/ExiMgqv0
>>
>> strace output:
>> I ran the strace using the following command:
>>
>> strace -s 1024 -ff -o strace.log gmond -d 5 -c /etc/ganglia/gmond.conf
>>
>> The output of the thread that has the segfault in it was uploaded
>> to pastebin: http://pastebin.com/xScMVU6P
>>
>> I had to erase the top 200 lines of the strace (too big and I'm not
>> a pro user - yet :) ).
>>
>> But... just to be sure, I'm attaching the compressed tarball. Apologies
>> to all but I just wanted to be sure.
>>
>> Once again - thanks a million!
>>
>> Jeff
>>
>>
>>
>>> Could you post here your build options (that ones you entered while
>>> ./configure) and also could you paste gmond.conf into pastebin?
>>>
>>> Also plz strace one more time, but now with strace -s 1024 -e trace=file
>>> and paste the output to pastebin
>>>
>>>
>>> On Sun, Feb 09, 2014 at 04:55:14PM -0500, Jeff Layton wrote:
>>>> I hope this isn't too much output (I've heard about pastebin.com
>>>> but never really used it).
>>>>
>>>>
>>>> [root@home4 ganglia-3.6.0]# ldd /usr/local/sbin/gmond
>>>> linux-vdso.so.1 => (0x00007fff667f6000)
>>>> libapr-1.so.0 => /usr/lib64/libapr-1.so.0 (0x00007f6a24049000)
>>>> libresolv.so.2 => /lib64/libresolv.so.2 (0x000000337dc00000)
>>>> libganglia-3.6.0.so.0 => /usr/local/lib64/libganglia-3.6.0.so.0 (0x00007f6a23e0d000)
>>>> libdl.so.2 => /lib64/libdl.so.2 (0x000000337c400000)
>>>> libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003390c00000)
>>>> libz.so.1 => /lib64/libz.so.1 (0x000000337d000000)
>>>> libpcre.so.0 => /lib64/libpcre.so.0 (0x0000003f73600000)
>>>> libexpat.so.1 => /lib64/libexpat.so.1 (0x000000337f800000)
>>>> libconfuse.so.0 => /usr/lib64/libconfuse.so.0 (0x00007f6a23bff000)
>>>> libpthread.so.0 => /lib64/libpthread.so.0 (0x000000337c800000)
>>>> libc.so.6 => /lib64/libc.so.6 (0x000000337c000000)
>>>> libuuid.so.1 => /lib64/libuuid.so.1 (0x0000003383800000)
>>>> libcrypt.so.1 => /lib64/libcrypt.so.1 (0x000000338c600000)
>>>> /lib64/ld-linux-x86-64.so.2 (0x000000337bc00000)
>>>> libfreebl3.so => /usr/lib64/libfreebl3.so (0x000000338ca00000)
>>>>
>>>>
>>>> Below is the tree output:
>>>>
>>>>
>>>> [root@home4 ganglia-3.6.0]# tree /etc/ganglia
>>>> /etc/ganglia
>>>> ├── conf.d
>>>> │   └── procstat.pyconf
>>>> ├── gmetad.conf
>>>> └── gmond.conf
>>>>
>>>> 1 directory, 3 files
>>>>
>>>>
>>>> I looked at the strace file for process 3537 and I did see two places
>>>> where gmond does an access() on the python_modules directory.
>>>> Does gmond automatically look for the python modules so I don't need
>>>> to put them in the modules section of gmond.conf?
>>>>
>>>> Thanks a million!
>>>>
>>>> Jeff
>>>>
>>>>
>>>>> Oh I didn't think about going that lowlevel :) Could you run ldd on
>>>>> gmond also?
>>>>>
>>>>> Could you also run 'tree' command on /etc/ganglia ? It's interesting
>>>>> that you have two times msg: "loaded module: python_module" while
>>>>> starting gmond.
>>>>> Rechecking this with strace log shows that it looks like
>>>>> double loading of those modules? http://pastebin.com/BjdCGgbj
>>>>>
>>>>>
>>>>> On Sun, Feb 09, 2014 at 03:28:10PM -0500, Jeff Layton wrote:
>>>>>> On 02/09/2014 02:48 PM, Jeff Layton wrote:
>>>>>>> On 02/09/2014 02:28 PM, Maciej Lasyk wrote:
>>>>>>>
>>>>>>>> You could also try to catch on which particular check this segfault
>>>>>>>> happens..?
>>>>>>> Not sure how to check this. When I run gmond interactively, it
>>>>>>> segfaults just after it says,
>>>>>>>
>>>>>>> [root@home4 yum.repos.d]# /usr/local/sbin/gmond -d 5 -c /etc/ganglia/gmond.conf
>>>>>>> loaded module: core_metrics
>>>>>>> loaded module: python_module
>>>>>>> loaded module: cpu_module
>>>>>>> loaded module: disk_module
>>>>>>> loaded module: load_module
>>>>>>> loaded module: mem_module
>>>>>>> loaded module: net_module
>>>>>>> loaded module: proc_module
>>>>>>> loaded module: sys_module
>>>>>>> loaded module: python_module
>>>>>>> Segmentation fault (core dumped)
>>>>>>>
>>>>>>>
>>>>>>> I'm not sure where to begin checking. I'm a very old-fashioned
>>>>>>> debugger - I tend to use a great deal of print statements to
>>>>>>> track down where things are happening. I can start doing this
>>>>>>> in gmond.
>>>>>> I tried putting fprintf's all over gmond.c (yep - I'm that
>>>>>> poor of a debugger). I'm not sure but it looks like it segfaults
>>>>>> in the function setup_metric_callbacks on the statement,
>>>>>>
>>>>>> if (modp->init && modp->init(global_context)) {
>>>>>>
>>>>>> or on the function,
>>>>>>
>>>>>> apr_pool_cleanup_register(global_context, modp,
>>>>>>                           modular_metric_cleanup,
>>>>>>                           apr_pool_cleanup_null);
>>>>>>
>>>>>> I'm not too sure.
>>>>>>
>>>>>> I apologize if I'm wasting your time with my poor debugging
>>>>>> skills.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Jeff

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers