Hi Cameron, [adding the developers list]

 OK:

1) We write the unmodified data in line 233 to capture the "raw" counters. That 
is what we use in line 227 for the comparison.
2) "ns" is created and returned by "hash_lookup".
3) The ULONG_MAX logic in line 231 is there to ensure that the result is always 
positive, which is needed because the variables are unsigned.
4) "update_ifdata" is called once by "metric_init" and then every time one of 
the byte/pkts_in/out collectors fires.
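
To make points 1) and 3) concrete, here is a minimal sketch of that logic. It is not the code from metrics.c: "struct raw_sample" and "accumulate_bytes_in" are hypothetical names for illustration, and only the comparison, the ULONG_MAX expression and the unconditional store mirror the lines quoted in this thread.

```c
#include <limits.h>

/* Hypothetical stand-in for one counter kept in the "ns" entry. */
struct raw_sample {
    unsigned long rbi;   /* raw bytes-in value seen on the previous read */
};

/* Add the growth since the previous read to *total, then keep the raw value. */
static void accumulate_bytes_in(struct raw_sample *ns, unsigned long rbi,
                                unsigned long *total)
{
    if (rbi >= ns->rbi)
        *total += rbi - ns->rbi;              /* normal case: counter grew */
    else
        *total += ULONG_MAX - ns->rbi + rbi;  /* assumes a wrap at ULONG_MAX */

    ns->rbi = rbi;   /* always store the raw counter for the next comparison */
}
```

Storing the raw value even after a suspected wrap is what keeps the next comparison meaningful: the following delta is measured from the post-wrap counter. Strictly speaking, a counter that wraps from ULONG_MAX to 0 advances by ULONG_MAX - old + new + 1, so the expression as quoted undercounts by one per wrap, which is negligible next to the spikes discussed here.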

 Now this does not solve your problem ... Question: do you see any of the debug 
messages that "update_ifdata" should emit when it encounters something unusual? 
That should help to get an idea of what the interface counters on your 
machine(s) look like. Look in "/var/log/messages", or just start "gmond" 
non-interactively.

 Hmm. Another question: do you compile "gmond" in 64-bit or 32-bit mode? The 
ULONG_MAX logic may (or will) fail in 32-bit mode if the kernel is 64-bit. It 
could even be that the interface counters on 32-bit kernels are written as 
64-bit values.
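
To illustrate why the width matters, here is a small sketch of the general mismatch. It assumes, purely for illustration, a counter that wraps at 2^32 while the rollover correction assumes a 64-bit wrap point; the function names are hypothetical and this is not a diagnosis of any particular machine.

```c
#include <stdint.h>

/* Delta when the wrap point is assumed to be 2^64 - 1
 * (the value of ULONG_MAX on a typical 64-bit Linux build). */
static uint64_t delta_assuming_64bit(uint64_t prev, uint64_t cur)
{
    return (cur >= prev) ? cur - prev : UINT64_MAX - prev + cur;
}

/* Delta when the counter is known to be 32 bits wide:
 * unsigned subtraction wraps modulo 2^32 by itself. */
static uint32_t delta_assuming_32bit(uint32_t prev, uint32_t cur)
{
    return (uint32_t)(cur - prev);
}
```

With a previous sample of 4294967000 and a current sample of 500, the 32-bit version yields 796 bytes, while the 64-bit assumption yields a value close to 2^64. A result of that size would show up in a graph as exactly the kind of out-of-scale spike described in the original mail.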

Hope this helps

Martin 
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de


>
>From: Cameron L. Spitzer <cspit...@nvidia.com>
>To: "ganglia-gene...@lists.sourceforge.net" 
><ganglia-gene...@lists.sourceforge.net>
>Sent: Thu, April 28, 2011 3:21:04 AM
>Subject: [Ganglia-general] revisiting bogus spikes
>
>
>Once again I've been asked to make Ganglia usable on Linux hosts with the 
>Broadcom NIC with the 32-bit byte counters.
>E.g., HP "Proliant" 580 G5, a rather popular machine where Ganglia doesn't 
>work out of the box.
>
>So I'm trying to understand ganglia-3.1.7/libmetrics/linux/metrics.c again.
>
>In update_ifdata(), we parse /proc/net/dev for the current bytes and packets 
>in and out.
>There's a structure "ns" (declared where?) of type net_dev_stats, representing 
>the previous sample?
>I'm not sure exactly what "ns" represents.
>
>There's a sanity check at line 227, "if ( rbi >= ns->rbi )", for whether the 
>counter went up or down.  If it went down, we assume the counter rolled 
>around, and guess the value is negative, and invert it, line 231: 
>"l_bytes_in += ULONG_MAX - ns->rbi + rbi;"
>(I don't understand how that is supposed to work.)
>Then, regardless of whether the sample passed or failed the sanity check, it's 
>saved in the "ns" structure.
>Line 233, "ns->rpi = rpi;"
>
>After the parsing is all done, and the crazy value is in "ns", an optional 
>reasonableness test (REMOVE_BOGUS_SPIKES)
>returns early if any of the numbers are extremely large.  Otherwise it updates 
>the static running counts and then returns.
>On our HP 580G5s, defining REMOVE_BOGUS_SPIKES had no effect.  The network 
>traffic graphs become useless within a minute of starting gmond.
>
>The part I don't understand is when the line 227 check fails, we put the 
>known-bad data in "ns" anyway.
>
>I'd appreciate it if someone familiar with update_ifdata() could explain its 
>logic.  When is this routine called?
>(I can see modules/network/mod_net.c calls it via bytes_in_func(), but I 
>haven't figured out when net_metric_handler() is called.  Maybe that would 
>explain how bogus data in "ns" doesn't matter.)
>Is there any way to keep way out-of-scale data out of these graphs?
>Thanks for any help.
>
>-Cameron in Los Gatos