Adrian,

Have you experimented with Host sFlow agents?

http://host-sflow.sourceforge.net/

The Host sFlow agents export the following standard metrics (based on the 
Ganglia libmetrics core, with additional metrics for disk I/O, UUIDs, virtual 
machines, etc.):

http://sflow.org/sflow_host.txt

You can set the UUID when you run the Host sFlow daemon if you aren’t happy 
with the UUID it gets from the operating system.
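For example, a sketch of overriding the UUID at startup (the option name is 
from my recollection of hsflowd's usage text, and the UUID value is made up; 
check the hsflowd man page for your version):

```
# sketch: override the auto-detected host UUID when starting the daemon
# (the UUID below is a made-up example value)
hsflowd -u a8098c1a-f86e-11da-bd1a-00112444be1e
```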

Gmond in Ganglia 3.2+ natively understands the sFlow metrics and can be used to 
collect the metrics from multiple machines and pass them on (you can disable 
collection of localhost data in gmond - just don’t load any modules). There is 
a chapter on configuring sFlow in the O’Reilly book.
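As a rough sketch (the ports are the conventional defaults and I'm going from 
memory on the section names, so double-check against the gmond.conf 
documentation for 3.2+), an aggregation-only gmond might look like this:

```
/* sketch of an aggregation-only gmond.conf (gmond 3.2+) */
globals {
  deaf = no
  mute = no
}

/* accept sFlow datagrams from the Host sFlow agents */
sflow {
  udp_port = 6343
}

/* serve the merged metrics to gmetad */
tcp_accept_channel {
  port = 8649
}

/* no collection_group sections and no modules loaded,
   so nothing is collected from localhost */
```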

There are tradeoffs in using sFlow as the metrics transport protocol. The 
Host sFlow agent is portable and lightweight (there are implementations for 
Windows, AIX, Solaris, Linux, hypervisors, etc.), and for large-scale 
deployments sFlow reduces the network traffic associated with sending metrics 
by a factor of 50 or more (sFlow packs all the metrics into a single 
datagram, whereas gmond sends one metric per datagram). However, the Host 
sFlow agent achieves these benefits by exporting only a standard set of 
metrics. The sFlow protocol doesn't support exporting custom name/value 
metrics, so you need to use gmetric / gmetric.py to send additional metrics 
if the standard core set isn't sufficient for your needs (however, there are 
standard sFlow extensions covering common applications like web servers, 
Java, etc. that you may find useful).
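For instance, a hypothetical custom metric sent with the gmetric CLI (the 
metric name, value, and units here are invented for illustration):

```
# send a made-up application metric alongside the standard sFlow set
gmetric --name=app_queue_depth --value=42 --type=uint32 --units=jobs
```

If the CLI isn't convenient, gmetric.py can be used the same way from 
scripts.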

http://host-sflow.sourceforge.net/relatedlinks.php

At the front end there is no difference in how you set up gmetad and the UI; 
gmond handles the conversion of metrics from sFlow format.

Peter

On Dec 6, 2013, at 11:35 PM, Adrian Sevcenco <adrian.sevce...@cern.ch> wrote:

> On 12/06/2013 10:51 PM, Devon H. O'Dell wrote:
>> 2013/12/6 Vladimir Vuksan <vli...@veus.hr>:
>>> Hello everyone,
> Hi!
> 
>>> For few weeks now we have had performance issues due to growth of
>>> our monitoring setup. One of my colleagues Devon O'Dell volunteered
>>> to help and below is an e-mail of his findings.
>> 
>> Hi! I joined the ML, so I'm around to answer questions. Nice to
>> 'meet' you guys!
> Thank you for your work! I also have some questions/ideas, but I am
> still struggling with the internal gmond structures, so it may take a
> while until I can contribute myself (plus I am not a programmer by
> profession).
> 
> 
> So:
> You said that you are using a gmond to collect data from every machine.
> The problems with the current implementation of gmond are that:
> 1. it cannot be used for aggregation only (no metrics from localhost);
> 2. cluster tagging is done at the XML reporting level, not at the host level.
> 
> It would be nice to have the possibility of gmond aggregators that just
> pass along a collection of metrics from multiple machines.
> Also, if cluster tagging were done at the gmond reporting level, it
> would be possible to aggregate metrics from different clusters in one
> gmond, and gmetad would just write each metrics bundle into the
> corresponding cluster space.
> 
> Moreover (it was discussed on the list without a clear conclusion), it
> would be great if a UUID could be introduced in gmond (regardless of the
> method of generation: from hardware or randomly generated) that would be
> the actual key for identifying a machine.
> It would be enough to have in gmond.conf, in the host section, something like:
> uuid = "some_uuid"
> and to move override_hostname from globals to host in the form of a list:
> override_hostname_list = "list_of_names"
> that would be reported to gmetad as a list of aliases (alongside the
> reverse DNS result).
> This would have the effect that the host could also be searched for by
> any of its former or present hostnames (whether resolved by DNS or not).
> 
>> Ganglia performance, but most of the low-hanging fruit is now gone; at
>> some point it will require:
>> 
>> * writing a version of librrd (this probably also means changing the
>> rrd file format),
> We (the ALICE experiment at CERN) use a tool named MonALISA
> (http://monalisa.caltech.edu), written in Java, that can take in many
> hundreds of thousands of metrics and write them into a PostgreSQL database.
> One obvious advantage would be that there is no need to summarize at the
> recording stage, and also that you have access to the precise metrics
> without losing information through averaging.
> 
> Wouldn't it be possible to adapt gmetad to write the data into a PostgreSQL
> database? One side effect would be that gweb could easily be on another
> server (for security and load-separation purposes) and make reports from
> the database (with the averaging mechanism implemented at the reporting
> level).
> 
>> * replacing the hash table in Ganglia with one that performs better,
>> * changing the data serialization format from XML to one that is
>> easier and faster to parse,
> I could just be speaking nonsense, as I don't understand exactly where
> the hash table is used (at the metrics collection step, by gmond or
> gmetad?), but couldn't the same XDR format be used for all communication
> (and maybe the communication could be improved by using ZeroMQ?), along
> with some standalone CLI tool that would read and process the output of
> a gmond? This would remove the need for XML output, and with the CLI
> tool there would also be the possibility of human inspection of the
> metrics as text (eventually with the conversion to XML done by the CLI
> tool).
> 
>> * using a different data structure than a hash table for metrics
>> hierarchies (probably a tree with metrics stored at each level in
>> contiguous memory and an index describing each metric at each level)
> postgres tables?
> 
> 
>> * refactoring gmetad and gmond into a single process that shares memory
> 
> I don't think that is a good idea, as these are processes with different
> functionality in mind. It would make for a very heavy process even if
> you don't start the gmetad part. (Basically, what Ganglia excels at is
> being a simple, lightweight, and robust agent-based monitoring tool.)
> 
> I would like to help if possible, but I would also need some mentoring.
> 
> Thank you!
> Adrian
> 

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
