forgot the list ...

----- Forwarded Message ----

> From: Martin Knoblauch <kn...@knobisoft.de>
> To: Cameron Spitzer <cspit...@nvidia.com>
> Sent: Wed, February 3, 2010 11:48:10 AM
> Subject: Re: [Ganglia-general] any workaround for the bogus spikes problem?
> 
> 
> >
> >From: Cameron Spitzer 
> >To: kn...@knobisoft.de
> >Cc: "ganglia-general@lists.sourceforge.net" 
> 
> >Sent: Tue, February 2, 2010 6:49:52 PM
> >Subject: Re: [Ganglia-general] any workaround for the bogus spikes problem?
> >
> >>
> >
> >  
> >
> >Martin Knoblauch wrote:
> >
> >We're trying to use Ganglia to monitor some HP DL580-G5 machines.
> >>>We're using a 64-bit linux-2.6.16.
> >>>
> >>>
> >>which version of Ganglia?
> >>
> >ganglia-3.1.2
> >
> >
> >The network traffic information is polluted with phantom 20 PB traffic 
> >>>spikes.
> >>>
> >>>
> >I tried lowering the silliness thresholds from 1e13 and 1e8 to 4.0e9 and 3.0e6,
> >and I cranked the collect_every on that group from 40 (seconds?) to 5.
> >Now I get exabyte peaks instead of petabyte peaks.
> >
> >
> > What kind of NIC do you have (1 Gb/s, 10 Gb/s)? Which hardware and driver are
> > loaded? What is the average network throughput you see?
> >>
> >>
> >It's the 1 Gbps NIC on the server motherboard, BCM5708 Rev 12.
> >>dmesg says, Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.5.5b
> >(January 31, 2007).
> >
> 
> BCM sounds familiar. Which distro are you using, which kernel?
> 
> >
> >
> >I found an ifdef for REMOVE_BOGUS_SPIKES in libmetrics/linux/metrics.c
> >>>Defining it has no effect. 
> > Maybe you can add some debugging output and check whether that code is
> > triggered at all. Maybe the thresholds are no longer appropriate.
> >>
> >Some hints about how to do that would help.  I've tried adding err_msg() calls and
> >I can't find where the messages go.  They're not in any of the syslog channels.
> >I don't understand the structure of libmetrics/linux/metrics.c well enough to guess
> >where it would make sense to open a new log file.
> >
> 
> If daemonized, messages go to syslog. If run in foreground, they go to stderr.
> 
> Just try running gmond in the foreground with "-d 1". You should already get
> some output in the overflow case.
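
The routing described above (syslog when daemonized, stderr in the foreground) can be pictured with a minimal sketch. This is not Ganglia's err_msg() implementation; the names log_sink, daemonized, current_sink and log_msg are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

/* Hypothetical names for illustration; none of these come from Ganglia. */
enum log_sink { SINK_STDERR, SINK_SYSLOG };

/* Assumed flag: set to 1 once the process has detached from the terminal. */
static int daemonized = 0;

/* Decide where a message should go based on the run mode. */
static enum log_sink current_sink(void)
{
    return daemonized ? SINK_SYSLOG : SINK_STDERR;
}

/* Format a message once, then hand it to syslog or stderr. */
static void log_msg(const char *fmt, ...)
{
    char buf[512];
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (current_sink() == SINK_SYSLOG)
        syslog(LOG_ERR, "%s", buf);     /* daemon: no terminal attached */
    else
        fprintf(stderr, "%s\n", buf);   /* foreground: visible directly */
}
```

With a scheme like this, a call such as log_msg("Dubious delta-t: %f", t) shows up on the terminal when the process runs in the foreground, which is why a foreground debug run is the quickest way to see the messages.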
> 
> >
> > And by the way, that code does not *remove* bogus spikes from the RRD database. It
> > is only supposed to prevent their generation.
> >>
> >I realize that.  With each hack to libmetrics/linux/metrics.c, I've been stopping
> >gmetad and removing all the corrupted rrd files.  I don't know how to edit an rrd file.
> >
> >
> 
> The contrib directory in "trunk" has the current "removespikes.pl" script from the
> RRD source repository. It is useful for cleaning up databases that you do not want to
> throw away.
> 
> >  
> >>Can anyone tell me the unit of measure which applies to l_bin and l_bout 
> >>>in that file?
> >>>Is it bytes per second, bytes per collect_every, bytes per time_threshold?
> >>>
> >>>
> >> Not completely sure.
> >>
> >It would be really great if the authors of libmetrics/linux/metrics.c
> >would document it.
> >
> 
> Looking at the code, it is per second, because the counter deltas are divided by
> the elapsed time t:
> 
>          /*
>          ** Compute timediff. Check for bogus delta-t
>          */
>          float t = timediff(&proc_net_dev.last_read,&stamp);
>          if ( t <  proc_net_dev.thresh) {
>            err_msg("update_ifdata(%s) - Dubious delta-t: %f",caller,t);
>            return;
>          }
>          stamp = proc_net_dev.last_read;
> 
>          /*
>          ** Compute rates in local variables
>          */
>          l_bin = l_bytes_in / t;
>          l_bout = l_bytes_out / t;
>          l_pin = l_pkts_in / t;
>          l_pout = l_pkts_out / t;
> 
> Cheers
> Martin


_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
