Your log says you are using multicast mode. Could you please share your gmetad 
(the Ganglia meta daemon) configuration? I would also suggest checking network 
connectivity: are you able to telnet to port 8649 from "insight02" to 
"insight01"?


Thanks,
Mozammil 


________________________________
 From: Nicholas Pilkington <nicholas.pilking...@gmail.com>
To: Mohd Mozammil khan <moz_r...@yahoo.com> 
Cc: "ganglia-general@lists.sourceforge.net" 
<ganglia-general@lists.sourceforge.net> 
Sent: Monday, 25 June 2012 2:55 PM
Subject: Re: [Ganglia-general] What is the easiest way to debug ganglia?
 

I'm not too sure; I think multicast. Is there a way to tell from the debug 
output? Here is a snippet of the debug spew; I can't see any errors. (Output is 
attached; "insight01" is the localhost, "insight02" is running the web frontend.)

loaded module: cpu_module
loaded module: disk_module
loaded module: load_module
loaded module: mem_module
loaded module: net_module
loaded module: proc_module
loaded module: sys_module
udp_recv_channel mcast_join=239.2.11.71 mcast_if=NULL port=8649 bind=239.2.11.71
tcp_accept_channel bind=NULL port=8649
udp_send_channel mcast_join=239.2.11.71 mcast_if=NULL host=NULL port=8649
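For what it's worth, those three channel lines correspond to a gmond.conf roughly like the following sketch (the values mirror what the log prints; the ttl is an assumption, as the log doesn't show it):

```
# Sketch of the gmond.conf channels implied by the debug output above
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1          # assumed; not shown in the log
}
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
tcp_accept_channel {
  port = 8649
}
```

All six hosts would need matching mcast_join/port values for the web frontend to see them.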

metric 'cpu_user' being collected now
metric 'cpu_user' has value_threshold 1.000000
metric 'cpu_system' being collected now
metric 'cpu_system' has value_threshold 1.000000
metric 'cpu_idle' being collected now
metric 'cpu_idle' has value_threshold 5.000000
metric 'cpu_nice' being collected now
metric 'cpu_nice' has value_threshold 1.000000
metric 'cpu_aidle' being collected now
metric 'cpu_aidle' has value_threshold 5.000000
metric 'cpu_wio' being collected now
metric 'cpu_wio' has value_threshold 1.000000
metric 'load_one' being collected now
metric 'load_one' has value_threshold 1.000000
metric 'load_five' being collected now
metric 'load_five' has value_threshold 1.000000
metric 'load_fifteen' being collected now
metric 'load_fifteen' has value_threshold 1.000000
metric 'proc_run' being collected now
metric 'proc_run' has value_threshold 1.000000
metric 'proc_total' being collected now
metric 'proc_total' has value_threshold 1.000000
metric 'mem_free' being collected now
metric 'mem_free' has value_threshold 1024.000000
metric 'mem_shared' being collected now
metric 'mem_shared' has value_threshold 1024.000000
metric 'mem_buffers' being collected now
metric 'mem_buffers' has value_threshold 1024.000000
metric 'mem_cached' being collected now
metric 'mem_cached' has value_threshold 1024.000000
metric 'swap_free' being collected now
metric 'swap_free' has value_threshold 1024.000000
metric 'bytes_out' being collected now
 ********** bytes_out:  10004.246094
metric 'bytes_out' has value_threshold 4096.000000
metric 'bytes_in' being collected now
 ********** bytes_in:  9135.960938
metric 'bytes_in' has value_threshold 4096.000000
metric 'pkts_in' being collected now
 ********** pkts_in:  7.841944
metric 'pkts_in' has value_threshold 256.000000
metric 'pkts_out' being collected now
 ********** pkts_out:  8.314949
metric 'pkts_out' has value_threshold 256.000000
metric 'disk_total' being collected now
Counting device /dev/mapper/insight01-root (81.00 %)
Counting device /dev/mapper/insight01-data (0.01 %)
Counting device /dev/sdc1 (30.05 %)
Counting device /dev/sdd1 (29.57 %)
Counting device /dev/sde1 (29.78 %)
Counting device /dev/sdf1 (29.51 %)
For all disks: 8337.525 GB total, 5927.558 GB free for users.
metric 'disk_total' has value_threshold 1.000000
metric 'disk_free' being collected now
Counting device /dev/mapper/insight01-root (81.00 %)
Counting device /dev/mapper/insight01-data (0.01 %)
Counting device /dev/sdc1 (30.05 %)
Counting device /dev/sdd1 (29.57 %)
Counting device /dev/sde1 (29.78 %)
Counting device /dev/sdf1 (29.51 %)
For all disks: 8337.525 GB total, 5927.558 GB free for users.
metric 'disk_free' has value_threshold 1.000000
metric 'part_max_used' being collected now
Counting device /dev/mapper/insight01-root (81.00 %)
Counting device /dev/mapper/insight01-data (0.01 %)
Counting device /dev/sdc1 (30.05 %)
Counting device /dev/sdd1 (29.57 %)
Counting device /dev/sde1 (29.78 %)
Counting device /dev/sdf1 (29.51 %)
For all disks: 8337.525 GB total, 5927.558 GB free for users.
metric 'part_max_used' has value_threshold 1.000000
sending metadata for metric: heartbeat
sent message 'heartbeat' of length 52 with 0 errors
sending metadata for metric: cpu_num
sent message 'cpu_num' of length 48 with 0 errors
sending metadata for metric: cpu_speed
sent message 'cpu_speed' of length 52 with 0 errors
sending metadata for metric: mem_total
sent message 'mem_total' of length 52 with 0 errors
sending metadata for metric: swap_total
sent message 'swap_total' of length 52 with 0 errors
sending metadata for metric: boottime
sent message 'boottime' of length 48 with 0 errors
sending metadata for metric: machine_type
sent message 'machine_type' of length 60 with 0 errors
sending metadata for metric: os_name
sent message 'os_name' of length 56 with 0 errors
sending metadata for metric: os_release
sent message 'os_release' of length 68 with 0 errors
sending metadata for metric: location
sent message 'location' of length 60 with 0 errors
sending metadata for metric: gexec
sent message 'gexec' of length 52 with 0 errors
sending metadata for metric: cpu_user
sent message 'cpu_user' of length 48 with 0 errors
sending metadata for metric: cpu_system
sent message 'cpu_system' of length 52 with 0 errors
sending metadata for metric: cpu_idle
sent message 'cpu_idle' of length 48 with 0 errors
sending metadata for metric: cpu_nice 

>On Mon, Jun 25, 2012 at 10:14 AM, Mohd Mozammil khan <moz_r...@yahoo.com> 
>wrote:
>
>Are you using multicast or unicast? The best way to debug Ganglia (gmond) is 
>to run it in debug mode: set debug_level = 10 in the gmond.conf file and 
>restart the daemon after making the change. If you don't see any errors while 
>running in debug mode, then the problem lies with your cluster configuration. 
>That kind of cluster issue only arises in unicast mode; to deal with it, 
>restart the unicast master node and then restart all the other slave nodes.
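For anyone following along, that setting lives in the globals section of gmond.conf (a sketch from memory; a value greater than 0 also keeps gmond in the foreground so the output goes to your terminal):

```
globals {
  debug_level = 10   # 0 = daemonize quietly; >0 = foreground, verbose logging
}
```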
>>
>>Cheers,
>>Mozammil 
>>
>>________________________________
>> From: Nicholas Pilkington <nicholas.pilking...@gmail.com>
>>To: ganglia-general@lists.sourceforge.net 
>>Sent: Monday, 25 June 2012 2:20 PM
>>Subject: [Ganglia-general] What is the easiest way to debug ganglia?
>> 
>>
>>
>>Bit of a newbie question, sorry:
>>
>>
>>I have 6 machines running Ganglia. 5 are sending information to the 
>>web-frontend perfectly, and one is not. What is the easiest way to find out 
>>where the problem lies?
>>
>>
>>Nick
>>
>>------------------------------------------------------------------------------
>>Live Security Virtual Conference
>>Exclusive live event will cover all the ways today's security and 
>>threat landscape has changed and how IT managers can respond. Discussions 
>>will include endpoint security, mobile security and the latest in malware 
>>threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>>_______________________________________________
>>Ganglia-general mailing list
>>Ganglia-general@lists.sourceforge.net
>>https://lists.sourceforge.net/lists/listinfo/ganglia-general
>>
>>
>>


-- 
Nicholas C.V. Pilkington
University of Cambridge