Thanks for your reply.

1. So my understanding is that Ganglia offers no application instance 
management help per se - it's up to the collection module that we write to 
worry about differentiating between common metrics reported by a set of 
application instances on a node.

2. Since the limit is what will fit into a UDP packet, is there any way to 
disassemble/re-assemble a variable-sized metric that sometimes exceeds the 
UDP packet size?

4. If the metrics to be collected change for any reason (add a new one, 
delete an old one, change the collection interval, etc.) does Ganglia 
automatically detect such changes in flight simply by a configuration 
change, or do all daemons need to be purposefully restarted?

Lou.




"Brad Nicholes" <[EMAIL PROTECTED]> 
01/14/2008 02:06 PM

To: <ganglia-developers@lists.sourceforge.net>, Lou Degenaro/Watson/[EMAIL PROTECTED]
cc:
Subject: [Ganglia-developers] questions re: user-defined instance data collection, reporting, and listening






>>> On 1/14/2008 at 7:41 AM, in message <[EMAIL PROTECTED]>,
Lou Degenaro <[EMAIL PROTECTED]> wrote:
> I'm looking for help in understanding if Ganglia can be used to monitor a 
> cluster relative to our needs. 
> 
> Questions:
> 
> 1. In our situation, each node has one or more application "instances" 
> running, where each application instance has an id and consists of a set 
> of daemons running on the node.  Thus app-1 may be comprised of one set of 
> instances of daemons a,b,c and app-2 may be comprised of another set of 
> instances of daemons a,b,c.  The question is, can Ganglia be used to 
> monitor instance-specific information?  For example, say daemon "a" 
> produces metric "m".  We'd like to collect metric "m" from app-1 and from 
> app-2, where both apps are on the same node, and be able to tell which "m" 
> is which.
> 

Using Ganglia 3.1.0 (which has not been formally released yet) you can 
write modules against either the C or the Python interface that collect 
and report anything you want.  In your case, as long as the names of the 
metrics are different, gathering metrics for app-1 and app-2 is easy.  
A module that is similar in concept already exists: it gathers metrics 
for each CPU that it discovers in the system.  In other words, if the 
system has 4 CPUs, then 4 metrics are gathered and reported.  The only 
difference is that that module gathers data for multiple CPUs, whereas you 
would be gathering data for multiple instances of a daemon running on the 
system.  The bottom line is that if your module can differentiate between 
multiple entities on the system and gather data for each, Ganglia can 
report it.
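
As a rough illustration of the per-instance naming idea, here is a minimal
sketch of a Python metric module.  It assumes the gmond Python module
conventions (a metric_init(params) function returning a list of descriptor
dictionaries, a per-metric callback that receives the metric name, and
metric_cleanup()).  The "instances" parameter, the "<app-id>_a_m" naming
scheme, and read_metric_m() are hypothetical placeholders, not part of
Ganglia.

```python
# Sketch of a gmond Python metric module that reports metric "m" of
# daemon "a" once per application instance, using distinct metric names.

_NAME_TO_APP = {}  # metric name -> application instance id


def read_metric_m(app_id):
    """Hypothetical stand-in for asking daemon "a" of app_id for "m"."""
    return float(len(app_id))  # dummy value so the sketch is runnable


def metric_handler(name):
    """Callback invoked with the metric name; returns its current value."""
    return read_metric_m(_NAME_TO_APP[name])


def metric_init(params):
    """Build one descriptor per application instance.

    "instances" is an assumed, comma-separated module parameter.
    """
    instances = params.get("instances", "app-1,app-2").split(",")
    descriptors = []
    for raw_id in instances:
        app_id = raw_id.strip()
        name = "%s_a_m" % app_id  # unique name per instance
        _NAME_TO_APP[name] = app_id
        descriptors.append({
            "name": name,
            "call_back": metric_handler,
            "time_max": 90,
            "value_type": "float",
            "units": "count",
            "slope": "both",
            "format": "%f",
            "description": "metric m of daemon a for %s" % app_id,
            "groups": "app_instances",
        })
    return descriptors


def metric_cleanup():
    """Nothing to release in this sketch."""
    pass
```

Because each instance gets its own metric name, the two "m" values stay
distinguishable even though both come from the same node.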

> 2. What is the size limitation of Ganglia collected data - limited by what 
> can fit into a UDP packet?
> 

It depends on what you are talking about.  Yes, the limit is the size of a 
UDP packet; however, 90% of the time the data being collected is just a 
numeric value.  You can gather a string, but this is usually only done for 
"collect once" constant-type metrics.
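
For a string value that occasionally outgrows a single packet, one possible
workaround is to split it inside the collection module and report each
piece under its own metric name, then re-join the pieces on the consumer
side.  The sketch below is only an illustration of that idea: nothing here
is a Ganglia feature, and split_value(), join_chunks(), the
".partN.ofM" naming scheme, and the 1000-character limit are all
hypothetical choices.

```python
def split_value(name, value, max_chars=1000):
    """Split a long string into (metric_name, chunk) pairs.

    max_chars is an assumed, conservative size chosen to stay well
    below the UDP payload limit.
    """
    chunks = [value[i:i + max_chars]
              for i in range(0, len(value), max_chars)] or [""]
    total = len(chunks)
    return [("%s.part%d.of%d" % (name, idx + 1, total), chunk)
            for idx, chunk in enumerate(chunks)]


def join_chunks(parts):
    """Re-assemble chunks produced by split_value(), in any order."""
    def part_index(item):
        # "m.part2.of3" -> 2
        return int(item[0].rsplit(".part", 1)[1].split(".of")[0])

    ordered = sorted(parts, key=part_index)
    return "".join(chunk for _, chunk in ordered)
```

The consumer has to wait until all M parts of a value have arrived before
joining them, so this only suits values that change slowly.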

> 3. How does one programmatically get collected data?  Can our "collector" 
> application daemon easily listen for the reports or must polling be used?
> 

Normally this is done through some type of polling mechanism.  You can 
poll the gmetad port, query the data directly from the RRD files, or both.  
Examples of how to do this are in the web frontend PHP code.
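
A polling client along these lines might look like the following sketch.
It assumes that gmetad writes its XML report to any TCP client that
connects to its XML port and then closes the connection, that the port is
8651 (the usual default, check your gmetad.conf), and that the report
nests METRIC elements with NAME and VAL attributes inside HOST elements.
The function names are made up for this example.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host="localhost", port=8651, timeout=5.0):
    """Connect to the (assumed) gmetad XML port and read until EOF."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: report complete
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_metrics(xml_bytes):
    """Return {host_name: {metric_name: value_string}} from a report."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL")
            for metric in host.iter("METRIC")
        }
    return result
```

A collector daemon could call parse_metrics(fetch_xml()) on a timer and
compare successive results to notice changes.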


> 4. Can the set of nodes reporting / collected be (programmatically) 
> changed on-the-fly?
> 

The set of nodes that are collecting data is determined by whether gmond 
is running on the machine or not.  If a new machine enters the grid and 
gmond is started on that box, the report for that machine magically shows 
up.  If a machine shuts down or gmond is stopped, the machine is reported 
as down.  Stopped machines can also be configured to be cleaned up after a 
period of time.

> 5. Do all nodes have to have all information?  Can a hierarchy be 
> established so that only a small set of "authority" nodes are the keepers 
> of all information in order to minimize network traffic?
> 

No and yes.  The default configuration is that all nodes hold information 
about every other node (multicast mode).  However, the nodes can be 
configured to only report to a single node (unicast mode). 
 
Brad


_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
