Re: [Ganglia-developers] g3 design

matt massie Thu, 01 May 2003 15:14:23 -0700

> However, my concern is how do we set the timeout? What might be
> reasonable on one machine would be too short on a slower one, etc.


i think a very generous timeslice (say a minute or two) would be ok.  the
metric collection functions _should_ be very short-lived.  if a function
triggers a timeout, an alert would be generated and the function taken out
of the event queue.  (or after n timeouts or whatever...).  
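
A minimal sketch of that policy, under stated assumptions: `MetricQueue`, its method names, and the default of three strikes are all hypothetical and not part of any g3 code. It only measures how long a function took after it returns, so it detects an overrun but cannot interrupt a function that never comes back; real enforcement would need a watchdog of some kind.

```python
import time


class MetricQueue:
    """Hypothetical event queue that drops metric functions that keep overrunning."""

    def __init__(self, timeslice=60.0, max_timeouts=3, clock=time.monotonic):
        self.timeslice = timeslice        # generous per-call budget, in seconds
        self.max_timeouts = max_timeouts  # the "n timeouts" before removal
        self.clock = clock                # injectable so the policy can be tested
        self.functions = {}               # metric name -> callable
        self.timeouts = {}                # metric name -> overruns seen so far
        self.alerts = []                  # (metric name, elapsed seconds)

    def register(self, name, func):
        self.functions[name] = func
        self.timeouts[name] = 0

    def run_once(self):
        # Iterate over a copy because entries may be removed during the pass.
        for name, func in list(self.functions.items()):
            start = self.clock()
            func()
            elapsed = self.clock() - start
            if elapsed <= self.timeslice:
                continue
            self.timeouts[name] += 1
            self.alerts.append((name, elapsed))
            if self.timeouts[name] >= self.max_timeouts:
                del self.functions[name]  # take it out of the event queue
```

Setting `max_timeouts=1` gives the stricter "first timeout removes the function" behaviour.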

> > g3mond -
> >   acts just like a deaf 2.x gmond.  it just loads all the metric 
> >   modules and sends measurements on the wire.  it will also have
> >   a local unix socket open for other apps/daemons to bootstrap the
> >   registered metric attributes.  this will use the xml-> data functions
> >   and is necessary to allow condensed network messages to be sent when
> >   modules are ubiquitous.
> 
> I like it. Deaf is good for this part. What about DSO metrics?

that's what i meant when i said "it just loads all the metric modules 
...".  it'll send heartbeat messages (regardless of whether any modules 
are loaded) and metric messages generated by the DSO metric modules.

> > g3listend -
> >   acts just like a mute 2.x gmond.  it just listens to all the traffic
> >   from one or more data sources, saves it, and exports it in different
> >   formats (xml, ldif, sexpressions).  we'll start with xml alone (since
> >   it's already written :).
> >
> I assume g3xmld is a g3listend that speaks xml.

oops.  yeh.

> > g3rrdd -
> >   it just listens to one or more data sources and saves the data
> >   to round-robin databases.  (it could also be made to "serve"
> >   historical images but i'm not sure if that is wise or not.
> >   thoughts?)
> 
> I like it. By listening on the multicast channel, keeping RRDs can be 
> more efficient.
> However, why should this be different than g3xmld (assuming we always 
> want to keep rrds).

it won't be any different really.  the reason i think it's wise to
separate the two processes (unlike gmetad is now) is to reduce the
potential for blocking.  both daemons will use the same g3 library calls
but act differently on callback.  g3listend (g3xmld) will place the data
in an in-memory data structure and g3rrdd will write to disk.  originally 
i thought of having a single daemon that does multiple tasks when it 
receives data: writes to rrd, internal tree structure, and mysql 
database.. through the g3 api.  then i realized that is a recipe for 
problems.  say the database blocked.. having separate daemons isolates 
failures better.. i think.
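
A rough illustration of "same library calls, different callback". Everything here is an assumption made for the sketch: `dispatch` stands in for whatever receive loop the g3 library would provide, messages are plain `(host, metric, value)` tuples, and an append-only text file stands in for a round-robin database.

```python
import os
import tempfile


def dispatch(messages, callback):
    """Stand-in for the shared library loop: hand each message to the callback."""
    for message in messages:
        callback(message)


class InMemoryListener:
    """g3listend-like behaviour: keep only the latest value in memory."""

    def __init__(self):
        self.tree = {}

    def on_metric(self, message):
        host, metric, value = message
        self.tree.setdefault(host, {})[metric] = value


class DiskWriter:
    """g3rrdd-like behaviour: append every value to a per-metric file."""

    def __init__(self, directory):
        self.directory = directory

    def on_metric(self, message):
        host, metric, value = message
        path = os.path.join(self.directory, f"{host}.{metric}.log")
        with open(path, "a") as f:
            f.write(f"{value}\n")


if __name__ == "__main__":
    messages = [("node1", "load_one", 0.42), ("node1", "load_one", 0.40)]
    listener = InMemoryListener()
    dispatch(messages, listener.on_metric)
    with tempfile.TemporaryDirectory() as directory:
        writer = DiskWriter(directory)
        dispatch(messages, writer.on_metric)
        print(listener.tree, os.listdir(directory))
```

Run as two separate processes, a stalled disk write in the second callback would not hold up the in-memory listener, which is the isolation argued for above.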

> > if you run g3xmld, with multiple data sources (to other g3xmld for
> > example) and g3rrdd on a machine.. you have a 2.x gmetad.
> 
> The 2.x gmetad does not need access to the multicast channel. Perhaps 
> g3xmld can operate either way (with or without multicast access). Also, 
> who will do the aggregation - g3xmld or g3rrd?

both can.  we need to think of data sources more generically here.  
g3xmld can plug into one or more sources.  it could be that it listens to
a single multicast data source (like gmond 2).. or it plugs into a
multicast channel and one tcp unicast channel to a remote g3xmld.  when
you think of it, gmond now is an aggregator.  it aggregates all multicast
traffic together. 

> > i just made all that XML up as an example (it's nothing solid).  the
> > nice thing is that whatever the format it can be slurped in very
> > easily.  the only assumption the tree library makes is that:
> >
> >   1. all tags have at least one attribute
> >   2. the first attribute is the index attribute
> 
> XML does not enforce the attribute order. The index attribute should 
> have a well-known name.

that is the way i originally wrote the library.  the index attribute was 
always "id".  i wanted something more flexible though (i could drop back 
to the old way).  for example, the root tag is
<ganglia_root version="3.0.0">

it doesn't make sense to be
<ganglia_root id="3.0.0">

there is only one root.  i guess i could write the tree library to assume 
that a tag without an index "id" will only occur once but i'm not sure 
that's the way to go.

i know xml doesn't enforce order but libg3 does.  :)
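
To show that a parser can expose document order even though XML itself attaches no meaning to it, here is a small sketch using Python's expat binding. This is not libg3; `index_keys` is a hypothetical helper, and the `host`, `name`, and `ip` names in the usage note are made up for illustration.

```python
import xml.parsers.expat


def index_keys(xml_text):
    """Return (tag, index attribute name, index value) for each element."""
    keys = []
    parser = xml.parsers.expat.ParserCreate()
    # With ordered_attributes set, attributes arrive as a flat list
    # [name, value, name, value, ...] in the order they were written.
    parser.ordered_attributes = True

    def start(tag, attrs):
        if not attrs:
            # Rule 1: every tag must carry at least one attribute.
            raise ValueError(f"<{tag}> has no attributes")
        # Rule 2: the first attribute is the index attribute.
        keys.append((tag, attrs[0], attrs[1]))

    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return keys
```

For example, `index_keys('<ganglia_root version="3.0.0"><host name="n1" ip="10.0.0.1"/></ganglia_root>')` treats `version` as the index of the root and `name` as the index of `host`. The catch raised in the quoted reply still applies: any tool that rewrites the XML is free to reorder attributes.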

> >   3. the value of the attribute and the value of the tag combined must
> >      be unique for each level of the xml tree.
> This is similar to the First Normal Form (1NF) from relational database 
> theory.

you know.  i just did it because it worked.  you and google just put a new
wrinkle on my brain.  thanks.
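
One way to read rule 3 is that the pair (tag, index value) acts as a key among the children of a node, which is what makes lookups cheap. The sketch below assumes "each level" means "among siblings"; the `Node` class and the `cluster` and `host` tag names in the usage note are hypothetical.

```python
class Node:
    """Tree node whose children are keyed by (tag, index value)."""

    def __init__(self, tag, index_value):
        self.tag = tag
        self.index_value = index_value
        self.children = {}  # (tag, index value) -> Node

    def add_child(self, tag, index_value):
        key = (tag, index_value)
        if key in self.children:
            # Rule 3 violated: the key must be unique among siblings.
            raise KeyError(f"duplicate {key!r} under <{self.tag}>")
        child = Node(tag, index_value)
        self.children[key] = child
        return child

    def find(self, *path):
        """Follow a sequence of (tag, index value) keys down from this node."""
        node = self
        for key in path:
            node = node.children.get(key)
            if node is None:
                return None
        return node
```

With a root built as `Node("ganglia_root", "3.0.0")`, a call such as `root.find(("cluster", "alpha"), ("host", "n1"))` costs one dictionary lookup per level.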

> > what do you guys think?
> >
> > the tree library also allows g3listend to quickly bootstrap from a 
> > data source.
> Good.

actually.  a g3mond can bootstrap itself from itself as well if we wanted 
to.  it would be easy to have gmond write its xml tree to a file every 3 
minutes or so.  if gmond is restarted it'll just recreate its internal 
tree structure from backup.  
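
A sketch of the backup-and-restore idea, assuming the tree can be serialized as ordinary XML. `save_snapshot` and `load_snapshot` are hypothetical names; the periodic timer (every 3 minutes or so) is left out, and the write goes to a temporary file first so that a crash mid-write never leaves a truncated backup behind.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_snapshot(root, path):
    """Write the tree to `path` atomically (temp file, then rename)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            ET.ElementTree(root).write(f, encoding="utf-8", xml_declaration=True)
        os.replace(tmp, path)  # atomic when both paths are on one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_snapshot(path):
    """Return the saved root element, or None if there is no usable backup."""
    try:
        return ET.parse(path).getroot()
    except (FileNotFoundError, ET.ParseError):
        return None
```

On startup the daemon would call `load_snapshot` and fall back to an empty tree when it returns `None`.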

one thing i haven't talked about is querying the tree.  i built the data 
structure especially for filtering, querying, summarizing, etc.  it'll be 
there.  i have the XQuery BNF and i'm tempted...

taking off to have a glass of wine or two.
-matt
