------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de



----- Original Message ----
> From: "Escobio, Roger " <[EMAIL PROTECTED]>
> To: ganglia-general@lists.sourceforge.net
> Sent: Wednesday, September 10, 2008 2:40:27 PM
> Subject: Re: [Ganglia-general] Anyone experience petabyte peaks in network 
> metric in ganglia 3.x.y ?
> 
> 
> > -----Original Message-----
> > From: Martin Knoblauch [mailto:[EMAIL PROTECTED] 
> > Sent: September 10, 2008 6:55 AM
> > To: Witham, Timothy D; Escobio, Roger [CMB-IT]; 
> > ganglia-general@lists.sourceforge.net
> > Subject: Re: [Ganglia-general] Anyone experience petabyte 
> > peaks in network metric in ganglia 3.x.y ?
> > 
> > ----- Original Message ----
> > 
> > > From: "Witham, Timothy D" 
> > > To: "Escobio, Roger " ; 
> > "ganglia-general@lists.sourceforge.net" 
> > 
> > > Sent: Tuesday, September 9, 2008 9:42:34 PM
> > > Subject: Re: [Ganglia-general] Anyone experience petabyte 
> > peaks in network metric in ganglia 3.x.y ?
> > > 
> > > >I am testing Ganglia on a cluster of Linux machines, but we are getting
> > > >these confusing peaks in the bytes/s and packets/s metrics (image attached).
> > > 
> > > I have been able to minimize this significantly by using code from
> > > svn trunk and building with
> > > 
> > >         make CPPFLAGS=-DREMOVE_BOGUS_SPIKES
> > > 
> > > IMHO, that should be the default.
> > > 
> > Hi Tim,
> > 
> >  the problem is that with NICs faster than 1000 Mbit, the 
> > naturally occurring wrap-arounds will come too frequently 
> > (especially for the byte counters) and will trigger the 
> > remove mechanism and really mess up the data. The better 
> > solution would be to bring the networking counters in the 
> > Linux kernel to 64-bit (they are 32-bit right now). Then we 
> > would not have to care about natural wrap-around for a few 
> > years. I once proposed this change, but it was not greeted 
> > with much enthusiasm :-(
> > 
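
A rough way to see how often such natural wrap-arounds occur is to divide the counter range by the byte rate of the link. The sketch below assumes traffic at full line rate and ignores framing overhead; the function name wrap_seconds is made up for illustration and is not part of Ganglia or the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: seconds until an unsigned byte counter that is
 * counter_bits wide wraps around, assuming sustained traffic at the
 * given link speed (in Mbit/s, line rate, no framing overhead).
 */
double wrap_seconds(unsigned counter_bits, double link_mbit_per_s)
{
    double range = 1.0;

    /* range = 2^counter_bits, computed without needing libm */
    for (unsigned i = 0; i < counter_bits; i++)
        range *= 2.0;

    /* 1 Mbit/s = 1e6 bits/s = 125000 bytes/s */
    double bytes_per_s = link_mbit_per_s * 1e6 / 8.0;

    return range / bytes_per_s;
}
```

Under these assumptions, wrap_seconds(32, 1000.0) is about 34.4 seconds and wrap_seconds(32, 10000.0) is about 3.4 seconds. Once more than one wrap can happen between two samples, the true delta can no longer be recovered from a 32-bit counter at all.
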
> >  Therefore I #ifdef-ed my check, especially as the effect 
> > seems to be a very NIC-specific bug.
> > 
> >  Escobio -> what NICs are in the systems in question (are they all the 
> > same)? As I understand it, you are using some 2.6.9 kernel?
> > 
> You're right, we have been seeing these random peaks on HP servers with:
> 
> Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet
> Broadcom Corporation NetXtreme II BCM5706 Gigabit Ethernet
>
Hi Escobio,

 I observed the problem on:

 2.6.9-42.ELsmp

 and "BCM5708 Gigabit Ethernet (rev 11)" NICs with the "bnx2" driver. The 
problem is some weird bug in DMAing the counters. It was fixed in the 2.6.17 
timeframe, IIRC. The fix might even have been backported to RHEL4Ux, where x > 4.

> Running 2.6.9 (Red Hat kernel :-) )
> Kernel 2.4.9 is not affected, right?
>

 Not sure whether those NICs were supported in the stone age :-)
 
> How useful would it be to have a max value for bytes/s in the definition of
> the metrics? Then, if the counter diff gives more than that, just discard
> that reading.
> 
> I know that this will not solve the packets/s peaks, but it could be a
> safety check before adding the values to the stats.
> 
> I created a patch against linux/metrics.c (version 3.1.1) that adds the
> counterdiff function found in *bsd/metrics.c.
> Are you interested in it? Just let me know and I'll send it to the list.
> 

 Yes please. I would definitely like to have a look at your patch.
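
 For discussion, the max-value idea could look roughly like the sketch below. This is NOT the counterdiff function from *bsd/metrics.c and not the patch mentioned above; the name counter_delta and its parameters are invented for illustration, and it assumes a 32-bit counter that wraps at most once between two samples.

```c
#include <stdint.h>

/*
 * Hypothetical helper: difference between two samples of a 32-bit
 * counter. Unsigned subtraction is modulo 2^32, so a single natural
 * wrap-around still yields the correct delta. A delta larger than
 * max_delta is treated as bogus: *valid is set to 0 and 0 is returned,
 * so the caller can discard the reading.
 */
uint32_t counter_delta(uint32_t prev, uint32_t curr,
                       uint32_t max_delta, int *valid)
{
    uint32_t delta = curr - prev;   /* wraps modulo 2^32 */

    if (delta > max_delta) {
        *valid = 0;                 /* implausible jump: discard */
        return 0;
    }

    *valid = 1;
    return delta;
}
```

 A caller would pick max_delta from what the link can plausibly carry in one sampling interval, for example link speed in bytes/s times the interval in seconds. That bound only makes sense while it stays below 2^32, which is the same limit that makes the check unreliable on faster NICs.
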

Cheers
Martin

_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
