Michael, could you please post a "diff -u" style patch, so that I can apply it automatically (alternatively just send me your version of solaris.c).

Robert: could you have a look at the change for sanity/correctness? Apparently your previous fix did not take into account the case where CPU #0 is missing. Unfortunately I have no box to test those cases on.

Cheers
Martin

--- Michael Hom <[EMAIL PROTECTED]> wrote:
> OS: Solaris 5.8, 5.9
> File: "tarball/gmond/machines/solaris.c"
> Function: get_kstat_val( )
>
> I've been testing the latest ganglia 2.6.0 posted on March 22nd.
> I noticed that gmond segfaults & core dumps when the first online cpu
> is NOT in slot #0.
> When I do a "kstat cpu_stat", my cpus are instance #'s 1 & 3.
> The crash occurs during the ks = kstat_lookup( ). It's hardcoded to
> look for cpu instance #0.
>
> Anyways, I wrote the following to check the next cpu instance if the
> default (zero) returns NULL.
> So since most machines will likely have a cpu at slot #0, I left the
> initial check as is.
> However, if it fails, then start checking instance #1, #2,... etc
> until one is found.
> It's not pretty, but it appears to work.
> Here's a diff of my changes for solaris.c:
>
> 4c4
> < #include "debug_msg.h"
> ---
> > #include "lib/debug_msg.h"
>
> 229a230,250
> >
> >    /* CPU_INFO module & instance check
> >     * If the first online CPU is not in slot #0, gmond will segfault and core dump.
> >     * If ks == NULL, then traverse the cpu instance #'s until the first online cpu is found.
> >     * After the traversal, if ks == NULL, then something is very wrong with the cpu configuration.
> >     */
> >    if ((km_name == "cpu_info") && (ks == NULL)) {
> >       int instance_num = 1;
> >       int num_tries = 0;
> >       while ((num_tries < (int)metriclist.cpu_num.uint32) && (ks == NULL)) {
> >          char *new_ks_name = (char *)malloc(20*sizeof(char));
> >          sprintf(new_ks_name, "cpu_info%d", instance_num);
> >          debug_msg( "Lookup up kstat: km (unix?)='%s', ks (system_misc?)='%s',kn (resulting metric?)='%s'", km_name, new_ks_name, name);
> >          ks = kstat_lookup(kc, km_name, instance_num, new_ks_name);
> >          debug_msg("%s: Just did kstat_lookup().\n",new_ks_name);
> >          ++instance_num;
> >          ++num_tries;
> >          free(new_ks_name);
> >       }
> >    }
> >

=====
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de
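
For anyone reviewing the quoted change, here is a minimal sketch of the "try the next cpu instance" idea in isolation. It is not the code from solaris.c. The names fake_kstat_t, fake_lookup() and first_cpu_info() are hypothetical, and fake_lookup() is only a stand-in for kstat_lookup() so the logic can be compiled on a box without libkstat. It pretends that instances 1 and 3 exist, as in Michael's report. Two things differ from the quoted patch on purpose: the module name is compared with strcmp() (in C, km_name == "cpu_info" compares pointers, not string contents), and the name buffer lives on the stack and is filled with snprintf() instead of malloc()/sprintf().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a kstat entry; only the instance number matters here. */
typedef struct {
    int instance;
} fake_kstat_t;

/* Assumption for this sketch: cpu_info instances 1 and 3 exist, instance 0 does not. */
static fake_kstat_t present[] = { {1}, {3} };

/* Hypothetical replacement for kstat_lookup(): returns NULL unless the
 * module is "cpu_info", the name is "cpu_info<instance>", and that
 * instance is in the table above. */
static fake_kstat_t *fake_lookup(const char *module, int instance, const char *name)
{
    char expected[32];
    size_t i;

    /* Compare string contents, not pointers. */
    if (strcmp(module, "cpu_info") != 0)
        return NULL;

    snprintf(expected, sizeof expected, "cpu_info%d", instance);
    if (strcmp(name, expected) != 0)
        return NULL;

    for (i = 0; i < sizeof present / sizeof present[0]; i++) {
        if (present[i].instance == instance)
            return &present[i];
    }
    return NULL;
}

/* Return the first cpu_info entry whose instance number is in
 * [0, max_instance), or NULL if none of them exists. */
static fake_kstat_t *first_cpu_info(int max_instance)
{
    int instance;

    for (instance = 0; instance < max_instance; instance++) {
        char ks_name[32];   /* stack buffer, nothing to free */
        fake_kstat_t *ks;

        snprintf(ks_name, sizeof ks_name, "cpu_info%d", instance);
        ks = fake_lookup("cpu_info", instance, ks_name);
        if (ks != NULL)
            return ks;
    }
    return NULL;
}
```

With the table above, first_cpu_info(4) returns the entry for instance 1 and first_cpu_info(1) returns NULL. The search bound is a parameter here because the right limit is an open question: the quoted patch uses the cpu count as the number of tries, which would presumably miss a box whose lowest instance number is larger than its cpu count.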