Martin Knoblauch wrote:
We're trying to use Ganglia to monitor some HP DL580-G5 machines.
We're using a 64-bit linux-2.6.16.

    

Which version of Ganglia?
  
ganglia-3.1.2

The network traffic information is polluted with phantom 20 PB traffic 
spikes.

    
I tried lowering the silliness threshold from 1e13 and 1e8 to 4.0e9 and 3.0e6,
and I cranked the collect_every on that group from 40 (seconds?) to 5.
Now I get exabyte peaks instead of petabyte peaks.
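For reference, here is a minimal sketch of the kind of sanity check a spike filter could apply to a byte counter. It is not the code in libmetrics/linux/metrics.c; the function name guarded_rate and the constant MAX_PLAUSIBLE_BYTES_PER_SEC are hypothetical, and the ceiling assumes a 1 Gbps link (125,000,000 bytes per second) with some headroom.

```c
#include <stdint.h>

/* Hypothetical ceiling: 1 Gbps is 125,000,000 bytes/s; allow some headroom. */
#define MAX_PLAUSIBLE_BYTES_PER_SEC 150000000.0

/*
 * Return the byte rate between two counter samples, or -1.0 if the sample
 * should be discarded: no elapsed time, the counter went backwards (reset
 * or wrap), or the rate exceeds what the link could physically carry.
 */
double guarded_rate(uint64_t prev, uint64_t curr, double elapsed_sec)
{
    if (elapsed_sec <= 0.0)
        return -1.0;
    if (curr < prev)            /* counter reset or wrap: do not guess */
        return -1.0;

    double rate = (double)(curr - prev) / elapsed_sec;
    if (rate > MAX_PLAUSIBLE_BYTES_PER_SEC)
        return -1.0;            /* implausible for this link: drop it */
    return rate;
}
```

With a check like this, a sample that implies petabytes per second is dropped instead of being reported, whatever the collection interval is.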


  

 What kind of NIC do you have (1 Gb, 10 Gb)? Which hardware and driver are loaded? What is the average network throughput you see?

  
It's the 1 Gbps NIC on the server motherboard, BCM5708 Rev 12.
dmesg says, Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.5.5b (January 31, 2007).

  
I found an ifdef for REMOVE_BOGUS_SPIKES in libmetrics/linux/metrics.c
Defining it has no effect. 
 Maybe you can add some debugging output and check whether that stuff is triggered at all. Maybe the thresholds are not good anymore.
  
Some hints about how to do that would help.  I've tried adding err_msg() calls and
I can't find where the messages go.  They're not in any of the syslog channels.
I don't understand the structure of libmetrics/linux/metrics.c well enough to guess
where it would make sense to open a new log file.
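One way to sidestep the question of where err_msg() output ends up is a small helper that appends to a file of your choosing. This is only a sketch: debug_log is a hypothetical name, not part of Ganglia, and the path passed in must be writable by whatever user the daemon runs as.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: append one printf-style line to the file at `path`.
 * Opens and closes the file on every call, which is slow but keeps the
 * helper free of global state. Returns 0 on success, -1 on failure.
 */
int debug_log(const char *path, const char *fmt, ...)
{
    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    va_list ap;
    va_start(ap, fmt);
    vfprintf(fp, fmt, ap);
    va_end(ap);
    fputc('\n', fp);

    return fclose(fp) == 0 ? 0 : -1;
}
```

A call such as debug_log("/tmp/metrics-debug.log", "l_bin=%f", l_bin) placed next to the threshold test would show whether that branch is ever reached; the file name and the variable being printed are illustrative.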

 And by the way, that code does not *remove* bogus spikes from the RRD database. It is only supposed to prevent their generation.
  
I realize that.  With each hack to libmetrics/linux/metrics.c, I've been stopping gmetad and removing all the
corrupted rrd files.  I don't know how to edit an rrd file.

  
Can anyone tell me the unit of measure which applies to l_bin and l_bout 
in that file?
Is it bytes per second, bytes per collect_every, bytes per time_threshold?
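This does not answer the units question, but the raw kernel counters can be cross-checked independently of Ganglia. The sketch below parses one interface line of /proc/net/dev, where the first field after the colon is the cumulative count of received bytes and the ninth is the cumulative count of transmitted bytes. The function name parse_netdev_line is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one interface line from /proc/net/dev.
 * Layout: "iface: rx_bytes <7 more rx fields> tx_bytes <7 more tx fields>".
 * Returns 0 on success, -1 if the line is not an interface line
 * (for example, one of the two header lines, which contain no colon).
 */
int parse_netdev_line(const char *line, char *iface, size_t iface_len,
                      unsigned long long *rx_bytes,
                      unsigned long long *tx_bytes)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL)
        return -1;

    while (*line == ' ')        /* interface names are right-aligned */
        line++;

    size_t n = (size_t)(colon - line);
    if (n == 0 || n >= iface_len)
        return -1;
    memcpy(iface, line, n);
    iface[n] = '\0';

    /* Keep field 1 (rx bytes) and field 9 (tx bytes); skip the rest. */
    if (sscanf(colon + 1,
               "%llu %*llu %*llu %*llu %*llu %*llu %*llu %*llu %llu",
               rx_bytes, tx_bytes) != 2)
        return -1;
    return 0;
}
```

Sampling these counters twice, a known number of seconds apart, and dividing the difference by the elapsed time gives bytes per second, which can then be compared with what gmond reports.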

    

 Not completely sure.
  
It would be really great if the authors of libmetrics/linux/metrics.c would document it.

-Cameron

