Michael,

 Could you please post a "diff -u" style patch so that I can apply it
automatically? Alternatively, just send me your version of solaris.c.

 Robert: could you have a look at the change for sanity/correctness?
Apparently your previous fix did not account for the case where
CPU #0 is missing. Unfortunately, I have no box to test those cases
on.
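
 For the review, here is how I read the intended fallback, as a rough
sketch only and not the patch itself. The names cpu_lookup_fn,
first_cpu_info and fake_lookup are made up for illustration. The callback
stands in for kstat_lookup() so that the snippet builds without libkstat,
and the upper bound on the instance number is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for kstat_lookup(kc, module, instance, name).
 * Returns NULL when the requested instance does not exist. */
typedef const void *(*cpu_lookup_fn)(const char *module, int instance,
                                     const char *name);

/* Try instances 0 .. max_instance-1 and return the first hit.
 * On success *found holds the instance number; on failure the
 * function returns NULL and leaves *found untouched. */
static const void *first_cpu_info(cpu_lookup_fn lookup, int max_instance,
                                  int *found)
{
    char name[32];  /* stack buffer, big enough for "cpu_info" plus any int */

    assert(lookup != NULL && found != NULL);
    for (int i = 0; i < max_instance; i++) {
        snprintf(name, sizeof name, "cpu_info%d", i);
        const void *ks = lookup("cpu_info", i, name);
        if (ks != NULL) {
            *found = i;
            return ks;
        }
    }
    return NULL;
}

/* Fake lookup simulating the reported box: only instances 1 and 3 exist. */
static const void *fake_lookup(const char *module, int instance,
                               const char *name)
{
    static const int token = 1;
    (void)module;
    (void)name;
    return (instance == 1 || instance == 3) ? &token : NULL;
}
```

 Two things I would check in the real code: comparing km_name against
"cpu_info" with == compares pointers, so strcmp() is probably what is
wanted, and bounding the loop by the CPU count may miss CPUs whose
instance numbers are higher than that count.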

Cheers
Martin


--- Michael Hom <[EMAIL PROTECTED]> wrote:
> OS: Solaris  5.8, 5.9
> File: "tarball/gmond/machines/solaris.c"
> Function:  get_kstat_val( )
> 
> I've been testing the latest ganglia 2.6.0 posted on March 22nd.
> I noticed that gmond segfaults & core dumps when the first online cpu
> is NOT in slot #0.
> When I do a "kstat cpu_stat", my cpus are instance #'s 1 & 3.
> The crash occurs during the ks = kstat_lookup( ).  It's hardcoded to
> look for cpu instance #0.
> 
> Anyways, I wrote the following to check the next cpu instance if the
> default (zero) returns NULL.
> So since most machines will likely have a cpu at slot #0, I left the
> initial check as is.
> However, if it fails, then start checking instance #1, #2,... etc
> until one is found.
> It's not pretty, but it appears to work.
> Here's a diff of my changes for solaris.c:
> 
> 4c4
> < #include "debug_msg.h"
> ---
> > #include "lib/debug_msg.h"
> 
> 229a230,250
> >
> >    /* CPU_INFO module & instance check
> >     * If the first online CPU is not in slot #0, gmond will segfault and core dump.
> >     * If ks == NULL, then traverse the cpu instance #'s until the first online cpu is found.
> >     * After the traversal, if ks == NULL, then something is very wrong with the cpu configuration.
> >     */
> >    if ((km_name == "cpu_info") && (ks == NULL))  {
> >          int instance_num = 1;
> >          int num_tries = 0;
> >          while ((num_tries < (int)metriclist.cpu_num.uint32) && (ks == NULL)) {
> >              char *new_ks_name = (char *)malloc(20*sizeof(char));
> >              sprintf(new_ks_name, "cpu_info%d", instance_num);
> >              debug_msg( "Lookup up kstat:  km (unix?)='%s', ks (system_misc?)='%s',kn (resulting metric?)='%s'", km_name, new_ks_name, name);
> >              ks = kstat_lookup(kc, km_name, instance_num, new_ks_name);
> >              debug_msg("%s: Just did kstat_lookup().\n",new_ks_name);
> >              ++instance_num;
> >              ++num_tries;
> >              free(new_ks_name);
> >          }
> >    }
> >
> 
> _______________________________________________
> Ganglia-developers mailing list
> Ganglia-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/ganglia-developers
> 
> 


=====
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de
