On Thursday, May 1, 2003, at 11:54 AM, matt massie wrote:

i've thought more about what steve said earlier and i think we might want
to use a simple unix timestamp instead of human-readable ISO 8601 timestamps.
there are libraries that handle standard ISO 8601 date strings but i can't
believe parsing them is faster than atoi().  we'll see .. i'll do some
benchmarks later.

Agree.

we could use this approach to put a boundary on the amount of time they
get to run.  if they exceed the time, an alert is sent and gmond keeps
right on chugging.  we have to be smart though since many operating
systems use SIGALRM for sleep(), etc.  we'll see.

setjmp() is a very cool library call that can do amazing things. I once used it when writing a checkpointing library. However, my concern is how we set the timeout: what might be reasonable on one machine would be too short on a slower one, etc.

g3mond -
  acts just like a deaf 2.x gmond.  it just loads all the metric modules
  and sends measurements on the wire.  it will also have
  a local unix socket open for other apps/daemons to bootstrap the
  registered metric attributes.  this will use the xml -> data functions
  and is necessary to allow condensed network messages to be sent when
  modules are ubiquitous.

I like it. Deaf is good for this part. What about DSO metrics?

g3listend -
  acts just like a mute 2.x gmond.  it just listens to all the traffic
  from one or more data sources, saves it, and exports it in different
  formats (xml, ldif, s-expressions).  we'll start with xml alone (since
  it's already written :).

I assume g3xmld is a g3listend that speaks xml.
  output modules will pass a file descriptor and callbacks to a tree
  foreach which recurses the tree structure.

Good. We can have one g3mond per host, and only a few g3xmlds on select monitoring nodes in the cluster (I am thinking from a cluster-monitoring stance of course). This will
enable the minimum memory footprint on compute nodes, etc.

i don't want to lock people into multicast (although i personally think
it's the best way to do business) in g3. support for tcp and unicast udp
will be there (it's already in the library).

We need broadcast too - I have code for this.

g3rrdd -
  it just listens to one or more data sources and saves the data
  to round-robin databases.  (it could also be made to "serve"
  historical images but i'm not sure if that is wise or not. thoughts?)

I like it. By listening on the multicast channel, keeping RRDs can be more efficient. However, why should this be different from g3xmld (assuming we always want to keep RRDs)?


if you run g3xmld with multiple data sources (pointing to other g3xmlds,
for example) and g3rrdd on a machine.. you have a 2.x gmetad.

The 2.x gmetad does not need access to the multicast channel. Perhaps g3xmld can operate either way (with or without multicast access). Also, who will do the aggregation - g3xmld or g3rrdd?

all apps/daemons will get their configuration information from a single
/etc/ganglia.conf configuration file which i would like to have written
in..  you guessed it.. XML.  again it'll just use the tree library.  it
would look something like this:
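a hypothetical sketch (all tag and attribute names here are made up):

```xml
<!-- hypothetical /etc/ganglia.conf sketch; every tag's first
     attribute is its index attribute, per the tree library rules -->
<GANGLIA NAME="config">
  <DAEMON NAME="g3mond">
    <CHANNEL ADDR="239.2.11.71" PORT="8649"/>
  </DAEMON>
  <DAEMON NAME="g3xmld">
    <LISTEN PORT="8651"/>
  </DAEMON>
</GANGLIA>
```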

Sure.
i just made all that XML up as an example (it's nothing solid).  the nice
thing is that whatever the format, it can be slurped in very easily.  the
only assumptions the tree library makes are that:

  1. all tags have at least one attribute
  2. the first attribute is the index attribute
  3. the value of the attribute and the value of the tag combined must be
     unique for each level of the xml tree.

XML does not enforce attribute order, so the index attribute should have a well-known name. Rule 3 is similar to First Normal Form (1NF) from relational database theory.

what do you guys think?

the tree library also allows g3listend to quickly bootstrap from a data source.
Good.

The design sounds good.



_______________________________________________
Ganglia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-developers

Federico

Rocks Cluster Group, SDSC, San Diego, CA

