All,

To make progress on this problem, I would first like to know what the status of the "scalable" mode in gmetad will be in the next release of Ganglia. The comment in /etc/gmetad.conf mentions "backwards compatibility", so perhaps this mode is not intended to be maintained in the future?

Can anyone answer this?

Best regards,
Christian.

----- Original Message -----
From: "Brad Anderson" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, March 10, 2008 11:28 PM
Subject: Re: [Ganglia-general] dual gmetad setup

> Christian,
>
> Thanks! That looks like it worked. I rolled my gmetad_node2 back
> to gmetad version 3.0.1-1 and it can link properly with gmetad_node1
> with the scalable option turned off on both nodes.
>
> Regards,
> Brad Anderson
>
> Christian Gouret wrote:
>> Hi Brad,
>>
>> I faced much the same problem some months ago. To work around it,
>> I use a gmetad_node2 running version 3.0.1.
>>
>> Here is the gmetad stack at failure time (3.0.4) in my environment:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 131081 (LWP 12739)]
>> *__GI___pthread_mutex_unlock (mutex=0x0) at mutex.c:178
>> 178     mutex.c: No such file or directory.
>>         in mutex.c
>> (gdb) where
>> #0  *__GI___pthread_mutex_unlock (mutex=0x0) at mutex.c:178
>> #1  0x0804e1e0 in endElement_CLUSTER ()
>> #2  0x0804e2ee in end ()
>> #3  0x0805a26e in doContent ()
>> #4  0x08059319 in contentProcessor ()
>> #5  0x0805c6ba in doProlog ()
>> #6  0x0805c063 in prologProcessor ()
>> #7  0x0805bfe9 in prologInitProcessor ()
>> #8  0x08058d4d in XML_ParseBuffer ()
>> #9  0x08058cb5 in XML_Parse ()
>> #10 0x0804e3d0 in process_xml ()
>> #11 0x0804b341 in data_thread ()
>> #12 0x40085c80 in pthread_start_thread (arg=0x4121dbe0) at manager.c:301
>> #13 0x40085d82 in pthread_start_thread_event (arg=0x4121dbe0) at manager.c:324
>> #14 0x401b9f87 in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:100
>> (gdb)
>> (gdb) print *xmldata
>> $12 = {rval = 134700213, old = 2, sourcename = 0x8075c7f "",
>>   hostname = 0x0, ds = 0x8075cbd, grid_depth = 6,
>>   host_alive = 134700224, source = {id = 29,
>>   report_start = 0x8075cc4 <_IO_stdin_used+8768>, report_end = 0x4,
>>   authority = 0x8075cc9, authority_ptr = 20,
>>   metric_summary = 0x8075c7f, sum_finished = 0x0, ds = 0x8075c7f,
>>   hosts_up = 0, hosts_down = 134700236, localtime = 21,
>>   owner = 23679, latlong = 2055, url = 0, stringslen = 0,
>>
>> The failing call is in this part of the source:
>>
>> source = &xmldata->source;
>> summary = xmldata->source.metric_summary;
>>
>> /* Release the partial sum mutex */
>> pthread_mutex_unlock(source->sum_finished);
>> /*err_msg("%s releasing lock", xmldata->sourcename);*/
>>
>> Best Regards.
>> Christian.
>>
>> ----- Original Message -----
>> From: "Brad Anderson" <[EMAIL PROTECTED]>
>> To: <[email protected]>
>> Sent: Thursday, March 06, 2008 8:03 PM
>> Subject: [Ganglia-general] dual gmetad setup
>>
>> All,
>>
>> I am having issues getting a dual gmetad env up and running. Here
>> is the problem. I have one gmetad node (gmetad_node1) checking a
>> single cluster of 1 machine. This node works fine: rrds are being
>> created, and when I place a UI on top of it all is well. The trouble
>> I am having is with my second gmetad node (gmetad_node2). I want
>> this node to pull all its data from gmetad_node1 and store a copy
>> of all rrds on its file system as well. I have turned off the
>> "scalable" option in gmetad.conf, and it starts to collect the
>> first round of data but dies shortly after writing rrds. I have
>> included a log of gmetad_node2 startup with debug at 10.
>>
>> Any help on this issue would be appreciated.
>>
>> Regards,
>> Brad Anderson
>>
>> gmetad_node1:
>> - CentOS 4.4
>> - ganglia-gmetad-3.0.6-1
>> - ganglia-web-3.0.6-1
>> - monitoring a single cluster of 1 machine
>> - writes rrds locally to disk
>>
>> gmetad_node2:
>> - CentOS 4.4
>> - ganglia-gmetad-3.0.6-1
>> - ganglia-web-3.0.6-1
>> - scalable off
>> - single data_source of gmetad_node1
>>
>> gmetad_node2 startup debug log:
>>
>> /etc/init.d/gmetad restart
>> Shutting down GANGLIA gmetad: [FAILED]
>> Starting GANGLIA gmetad: Going to run as user nobody
>> Sources are ...
>> Source: [grid1, step 30] has 1 sources
>> 10.0.0.1
>> xml listening on port 8651
>> interactive xml listening on port 8652
>> Data thread -1271247952 is monitoring [grid1] data source
>> 10.0.0.1
>> cleanup thread has been started
>> [grid1] is a 2.5 or later data stream
>> hash_create size = 1024
>> hash->size is 1031
>> hash_create size = 50
>> hash->size is 53
>> hash_create size = 50
>> hash->size is 53
>> Updating host host1.domain.com, metric disk_free
>> Updating host host1.domain.com, metric bytes_out
>> Updating host host1.domain.com, metric proc_total
>> Updating host host1.domain.com, metric pkts_in
>> Updating host host1.domain.com, metric cpu_nice
>> Updating host host1.domain.com, metric cpu_speed
>> Updating host host1.domain.com, metric boottime
>> Updating host host1.domain.com, metric qmail_msgs_to_be_preprocessed
>> Updating host host1.domain.com, metric cpu_wio
>> Updating host host1.domain.com, metric qmail_msgs_in_queue
>> Updating host host1.domain.com, metric load_one
>> Updating host host1.domain.com, metric disk_total
>> Updating host host1.domain.com, metric cpu_idle
>> Updating host host1.domain.com, metric cpu_user
>> Updating host host1.domain.com, metric swap_free
>> Updating host host1.domain.com, metric mem_cached
>> Updating host host1.domain.com, metric pkts_out
>> Updating host host1.domain.com, metric load_five
>> Updating host host1.domain.com, metric cpu_num
>> Updating host host1.domain.com, metric load_fifteen
>> Updating host host1.domain.com, metric mem_free
>> Updating host host1.domain.com, metric cpu_system
>> Updating host host1.domain.com, metric proc_run
>> Updating host host1.domain.com, metric mem_total
>> Updating host host1.domain.com, metric cpu_aidle
>> Updating host host1.domain.com, metric bytes_in
>> Updating host host1.domain.com, metric mem_buffers
>> Updating host host1.domain.com, metric mem_shared
>> Updating host host1.domain.com, metric swap_total
>> Updating host host1.domain.com, metric part_max_used
>> Writing Summary data for source Servers, metric disk_free
>> Writing Summary data for source Servers, metric bytes_out
>> Writing Summary data for source Servers, metric proc_total
>> Writing Summary data for source Servers, metric cpu_nice
>> Writing Summary data for source Servers, metric pkts_in
>> Writing Summary data for source Servers, metric cpu_speed
>> Writing Summary data for source Servers, metric boottime
>> Writing Summary data for source Servers, metric qmail_msgs_to_be_preprocessed
>> Writing Summary data for source Servers, metric cpu_wio
>> Writing Summary data for source Servers, metric qmail_msgs_in_queue
>> Writing Summary data for source Servers, metric load_one
>> Writing Summary data for source Servers, metric disk_total
>> Writing Summary data for source Servers, metric cpu_user
>> Writing Summary data for source Servers, metric cpu_idle
>> Writing Summary data for source Servers, metric swap_free
>> Writing Summary data for source Servers, metric pkts_out
>> Writing Summary data for source Servers, metric mem_cached
>> Writing Summary data for source Servers, metric load_five
>> Writing Summary data for source Servers, metric cpu_num
>> Writing Summary data for source Servers, metric load_fifteen
>> Writing Summary data for source Servers, metric mem_free
>> Writing Summary data for source Servers, metric cpu_system
>> Writing Summary data for source Servers, metric proc_run
>> Writing Summary data for source Servers, metric mem_total
>> Writing Summary data for source Servers, metric cpu_aidle
>> Writing Summary data for source Servers, metric bytes_in
>> Writing Summary data for source Servers, metric mem_buffers
>> Writing Summary data for source Servers, metric mem_shared
>> Writing Summary data for source Servers, metric swap_total
>> Writing Summary data for source Servers, metric part_max_used
>> [FAILED]

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general
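
A note on the quoted backtrace: frame #0 shows pthread_mutex_unlock being called with mutex=0x0, and the gdb dump shows sum_finished = 0x0, so the immediate crash looks like an unlock on a mutex pointer that was never set. The following is only a minimal sketch of a defensive guard under that assumption; it is not the actual gmetad code or a confirmed fix. The type source_sketch_t and the function release_sum_lock are hypothetical names invented for illustration, and only the sum_finished field is taken from the dump.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the source record in the gdb dump.
   Only the field involved in the crash is modelled here. */
typedef struct {
    pthread_mutex_t *sum_finished;
} source_sketch_t;

/* Release the partial-sum mutex only if the pointer was ever set.
   Returns 0 on success, EINVAL if there is nothing to unlock,
   or whatever error pthread_mutex_unlock itself reports. */
static int release_sum_lock(source_sketch_t *source)
{
    if (source == NULL || source->sum_finished == NULL)
        return EINVAL;
    return pthread_mutex_unlock(source->sum_finished);
}
```

A guard like this would only avoid the segfault. It would not explain why the pointer is NULL for a grid source when "scalable" is off, which is the question that still needs an answer from the developers.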

