Re: [Ganglia-developers] Linux/cygwin gmond metric poll rate.

Richard.Grevis Fri, 28 Jul 2006 07:11:48 -0700

Martin,

thanks for the reply. It would make sense that the code is a legacy from
before collection_groups were implemented (aside: collection groups are
smart and obviously necessary, but did I realise this before seeing
Ganglia? Nope...)


Setting the intervals to 0 seems to work fine, e.g. from an
strace -e trace=open:
14:41:52 open("/proc/stat", O_RDONLY)   = 8
14:41:52 open("/proc/loadavg", O_RDONLY) = 8
14:41:52 open("/proc/meminfo", O_RDONLY) = 8
14:41:52 open("/proc/net/dev", O_RDONLY) = 8
14:41:57 open("/proc/stat", O_RDONLY)   = 8
14:41:57 open("/proc/loadavg", O_RDONLY) = 8
14:41:57 open("/proc/meminfo", O_RDONLY) = 8
14:41:57 open("/proc/net/dev", O_RDONLY) = 8

But since the update_file code has this check:
if(now - tf->last_read > tf->thresh)
and "now" is a whole number of seconds from time(0), the strict ">"
means that opens of any given /proc file won't happen more often than
once per second, even though I set the minimum delay to 0.
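
To make the arithmetic concrete, here is a minimal sketch of just the
timing test, not the gmond code itself. The names "timing" and
"should_reread" are made up for illustration, and "now" is passed in as
a parameter (and typed uint32_t rather than int) so the behaviour can be
checked without waiting on the clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for timely_file:
 * only the two timing fields are kept. */
typedef struct {
  uint32_t last_read;  /* time of the last successful read, whole seconds */
  uint32_t thresh;     /* age in seconds that must be exceeded to re-read */
} timing;

/* Returns 1 if a re-read would be triggered at time `now`, else 0.
 * Uses the same strict `>` comparison as the quoted update_file(). */
int should_reread(const timing *t, uint32_t now)
{
  assert(t != NULL);
  return (now - t->last_read) > t->thresh;
}
```

With thresh set to 0 and last_read at 100, should_reread() gives 0 at
now = 100 and 1 at now = 101, so at least one full second must tick over
between reads. By the same test, the stock value of 15 only triggers a
re-read once 16 seconds have passed.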

It all works fine for me.

I was wrong about the poll delay and the above delays adding up. They don't.

As for wanting a 5 second polling rate: sigh. It's what they want. Also,
our job mix in this investment bank is a mixture of risk calculations
that take a while and pricing calculations that sometimes need to be
delivered in seconds. They want to see the load spike for those
calculations.

kind regards,
Richard



-----Original Message-----
From: Martin Knoblauch [mailto:[EMAIL PROTECTED] 
Sent: 28 July 2006 14:19
To: Grevis, Richard: IT (LDN); ganglia-developers@lists.sourceforge.net
Subject: Re: [Ganglia-developers] Linux/cygwin gmond metric poll rate.


Hi Richard,

--- [EMAIL PROTECTED] wrote:

> Guys,
> 
> 
> the code below is in the cygwin and linux metric.c files.
> 
> --------------------------------------------------------
> typedef struct {
>   uint32_t last_read;
>   uint32_t thresh;
>   char *name;
>   char buffer[BUFFSIZE];
> } timely_file;
> 
> timely_file proc_stat    = { 0, 15, "/proc/stat" };
> timely_file proc_loadavg = { 0, 15, "/proc/loadavg" };
> timely_file proc_meminfo = { 0, 30, "/proc/meminfo" };
> timely_file proc_net_dev = { 0, 30, "/proc/net/dev" };
> 
> char *update_file(timely_file *tf)
> {
>   int now,rval;
>   now = time(0);
>   if(now - tf->last_read > tf->thresh) {
>     rval = slurpfile(tf->name, tf->buffer, BUFFSIZE);
>     if(rval == SYNAPSE_FAILURE) {
>       err_msg("update_file() got an error from slurpfile() reading %s",
>               tf->name);
>       return (char *)SYNAPSE_FAILURE;
>     }
>     else tf->last_read = now;
>   }
>   return tf->buffer;
> }
> --------------------------------------------------------
> 
> In my ganglia setup I have a small number of metrics polled often
> (5-10 seconds) on a large number of hosts (4,000!). I had already
> observed that values did not seem to change that fast, but have only
> now found the above code. I obviously can't poll fast with the above
> code.
>

 let me understand: you want to sample (e.g.) cpu_* at 5 sec intervals?

> Given that metrics are measured only at the rate defined by their poll
> time, what is the point of the above code? Is it to ensure that when
> (say) cpu_user, cpu_sys, cpu_wio etc are measured, the /proc file is
> only opened once?
>

 You might be right. Before we had "collection groups" I guess this
would [kind of] ensure that all the cpu_* stats would be taken from the
same sample. Otherwise they would not add up to 100%.

 Also it rate limits the reading of the /proc files, which might be
considered to be "expensive".
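
 A rough sketch of the "same sample" idea, under loud assumptions: this
is not the gmond code. The struct "cpu_cache", the helpers "refresh" and
"update", and the counter values are all invented for illustration, and
the time is passed in as a parameter instead of calling time(0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache: one snapshot shared by several metric getters. */
typedef struct {
  uint32_t last_read;  /* whole seconds */
  uint32_t thresh;     /* seconds */
  unsigned reads;      /* how often the source was actually read */
  unsigned user, sys;  /* stand-ins for parsed cpu counters */
} cpu_cache;

/* Stand-in for reading and parsing /proc/stat; the values are made up. */
static void refresh(cpu_cache *c, uint32_t now)
{
  c->user = 10 * now;
  c->sys = 3 * now;
  c->reads++;
  c->last_read = now;
}

/* Re-read only when the snapshot is older than the threshold. */
static void update(cpu_cache *c, uint32_t now)
{
  assert(c != NULL);
  if (now - c->last_read > c->thresh)
    refresh(c, now);
}

/* Both getters go through update(), so they share one snapshot. */
unsigned cpu_user(cpu_cache *c, uint32_t now) { update(c, now); return c->user; }
unsigned cpu_sys(cpu_cache *c, uint32_t now)  { update(c, now); return c->sys; }
```

 Calling cpu_user() at t = 100 and cpu_sys() at t = 105 with a threshold
of 15 reads the source once, and both values come from the t = 100
snapshot. The flip side is the one Richard ran into: nothing changes
until the threshold has been exceeded.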

 What happens when you compile with smaller values?

> Also it seems the delay between /proc file reads is the delay above,
> plus the poll delay of the metric. Is this true? I measured this, but
> did not trace through the source code to confirm.
>

 Not sure about that. If true, it sounds wrong.
 
> So Matt (or a similar guru), can you let me know the intent of this code?
> 
> kind regards,
> Richard G


------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de
