Nick,

On Feb 14, 2011, at 4:27 PM, <nicholas.satte...@nokia.com> wrote:

> Hi Neil,
> 
> Including the extra metrics (e.g. swap in/out, context switches, interrupts) 
> is a great idea. These metrics would be very welcome to our capacity planners.
> 

Glad to hear it.  Is there anything else you/they would like to see added 
in the future too?
(e.g. how about real-time performance stats from Apache web servers?  See 
http://mod-sflow.googlecode.com)

> Also, I'm very keen on the virtualized server metrics you get from libvirt. 
> This would make dynamic performance monitoring of cloud-like server 
> infrastructure a snap. I assume you can just spoof the hostnames of the 
> virtual server instances as they spin up.

Something like that would be OK to start with.  Lots of choices here.  Peter 
Phaal already posted some thoughts on how to model the hierarchy to the 
ganglia-users list.  Perhaps he should start the same thread on the developers 
list too?
http://www.mail-archive.com/ganglia-general@lists.sourceforge.net/msg06319.html

> 
> I must admit I'm not so keen on submitting zeros for unsupported metrics. This 
> would lead to confusion ("why are the graphs there when there's no data" type 
> of questions) and also the unnecessary creation of useless RRD files. The 
> nature of RRD means that the files for all dummy metrics will be created at 
> the maximum size the instant the first value is received. This adds up to a 
> lot of wasted disk if you have hundreds or thousands of Windows servers. My 
> preferred approach would be to modify the web front-end to intelligently 
> handle Windows servers.

Agreed.  And it's only going to get worse with the VMs because the hypervisors 
only give us a handful of metrics for them.  That's why these "nulls" are 
omitted by default.  Hopefully it's just a temporary workaround.

> 
> Lastly, the ability to define the host sflow port should allow you to 
> configure multiple gmonds to run on the same Linux server with different 
> Windows cluster names assigned to them. Was this the intention? If so, I like 
> it.
> 

Yes.  Exactly.
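
For illustration only, here is a rough sketch of how that could look: two 
gmond instances on one Linux host, each started with its own configuration 
file.  The file names, cluster names and the ports 6344 and 8650 are made up 
for the example (6343 is the standard sFlow port, 8649 the usual gmond port), 
and "udp_port" is the option from the patch quoted below, so its name could 
still change.  Each instance also needs its own tcp_accept_channel port, 
because two processes cannot listen on the same one.

```
# gmond-a.conf (hypothetical file name): first Windows cluster
cluster {
  name = "windows-cluster-a"    # made-up cluster name
}
udp_recv_channel {
  port = 6343                   # the sFlow port must also be a receive channel
}
tcp_accept_channel {
  port = 8649
}
sflow {
  udp_port = 6343               # treat datagrams on this port as sFlow
}

# gmond-b.conf (hypothetical file name): second Windows cluster
cluster {
  name = "windows-cluster-b"    # made-up cluster name
}
udp_recv_channel {
  port = 6344
}
tcp_accept_channel {
  port = 8650
}
sflow {
  udp_port = 6344
}
```

The hsflowd agents in each Windows cluster would then be pointed at the 
port that belongs to their cluster's gmond.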

Neil




> Regards,
> Nick
> 
> -----Original Message-----
> From: ext Neil McKee [mailto:neil.mc...@inmon.com] 
> Sent: Saturday, February 12, 2011 7:56 AM
> To: ganglia-developers@lists.sourceforge.net
> Subject: Re: [Ganglia-developers] hsflowd for Windows + Ganglia webfrontend
> 
> Here is the patch I was referring to.  It allows you to put something like 
> this in your gmond.conf:
> 
> sflow {
>  null_int = 0
>  null_float = 0.0
> }
> 
> and then if a field like cpu_nice is missing (as in the Windows hsflowd) 
> we'll submit 0.0 instead of leaving it out.   This is a work-around for the 
> problem where the RRD does not even appear when cpu_nice is missing.
> 
> You can also add another setting "accept_all_physical = yes",  like this:
> 
> sflow {
>  null_int = 0
>  null_float = 0
>  accept_all_physical = yes
> }
> 
> and now the extra metrics that are defined in host-sflow but not in 
> libmetrics are accepted too.  These include some useful ones like the number 
> of context switches,  the number of pages swapped in/out, network errors and 
> drops, more info on disk reads and writes,  and so on.  The UI seems to do a 
> good job of just adding these RRDs to the page (so perhaps it would be even 
> safe to make "yes" the default here?)
> 
> I'm still skipping over the VM fields,  and don't have the option to ignore 
> the sFlow hostname field yet,  but placeholder boolean options 
> "accept_all_virtual" and "accept_hostname" are defined.  There is also 
> "udp_port" in case you want to designate a non-standard port as the sFlow 
> port (though it still has to appear in a udp_recv_channel section 
> elsewhere).
> 
> I didn't edit gmond/conf.pod yet.  I figured that could happen once there is 
> consensus on these options.
> 
> Thoughts?
> 
> Regards,
> Neil
> 


_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers