Hi Matt.

I welcome some of the new features you are planning to introduce for 2.6.0. I have downloaded this snapshot and will try it out, and I'm looking forward to a working version I can test too. From your description I understand that a metric collection_group specifies a set of related metrics that are transmitted together, which should save bandwidth. Will gmond transmit ONLY the metrics listed in these groups, so that you can restrict which metrics are sent from each gmond? I would certainly be in favor of this. I would also like the metric threshold data to be interpreted by gmetad so that data is only written to the round robin database files when necessary.

In the short term, are we close to a 2.5.8 release date? I've been waiting patiently for solaris' gmond to catch up with the linux implementation!!

-------
Yemi

On Dec 12, 2004, at 3:36 AM, Matt Massie wrote:

guys

i just uploaded a new snapshot of 2.6.0 to
http://matt-massie.com/ganglia/ganglia-2.6.0.200412121042.tar.gz

this snapshot has multicast support (setting the TTL isn't supported yet, although the code is mostly written). all the multicast code is in ./lib/apr_net.[c,h].

the configuration file format for UDP recv channels is

# listen for unicast traffic on port 8649
udp_recv_channel {
  port = 8649
}

# listen for multicast traffic on port 8649 channel 239.2.11.71
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
}

there is no limit to the number of unicast and multicast channels you can specify (except for memory and file descriptor limitations).
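for anyone who wants to see what a multicast recv channel boils down to at the socket level, here is a minimal python sketch. it is not the code in ./lib/apr_net.c; the function names (build_mreq, open_recv_channel) are made up for illustration, and it assumes IPv4 and lets the kernel pick the interface.

```python
import socket
import struct
from typing import Optional


def build_mreq(group: str) -> bytes:
    """Pack an IPv4 membership request: group address followed by INADDR_ANY."""
    # INADDR_ANY (0) asks the kernel to choose the interface for the join.
    return socket.inet_aton(group) + struct.pack("=I", socket.INADDR_ANY)


def open_recv_channel(port: int, mcast_join: Optional[str] = None) -> socket.socket:
    """Open a UDP socket on `port`; join `mcast_join` if a group is given.

    Hypothetical helper mirroring one udp_recv_channel section.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several channels to share the same port number.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    if mcast_join is not None:
        sock.setsockopt(
            socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(mcast_join)
        )
    return sock
```

one socket per channel, e.g. open_recv_channel(8649, "239.2.11.71"), and then a select() over all of them would be one way to listen on several channels at once.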

we have about 6 clusters around the department on separate multicast channels. as a test i ran gmond with multiple udp_recv_channels equal to those channels and it heard the traffic from them all.

currently, this gmond will send/recv UDP messages on any unicast/multicast channel. i just have a few small details to test and clean up but it's mostly finished (i think i can even track down the windows code to support multicast on windows..we'll see). this 2.6.0 snapshot compiles and runs on windows now without multicast.

this gmond also has an internal hostname lookup cache to prevent repeated DNS queries (just as the old gmond did).. because some people don't run nscd and having 1000 nodes hitting a DNS server is just not nice.
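the idea of the lookup cache, as a rough python sketch (not the actual gmond code; HostnameCache and default_resolver are hypothetical names, and the cache here never expires entries, which is an assumption):

```python
import socket
from typing import Callable, Dict


def default_resolver(ip: str) -> str:
    """Reverse-resolve an IP; fall back to the IP itself if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ip


class HostnameCache:
    """Remember the hostname for each IP so DNS is asked at most once per host."""

    def __init__(self, resolver: Callable[[str], str] = default_resolver) -> None:
        self._resolver = resolver
        self._names: Dict[str, str] = {}

    def lookup(self, ip: str) -> str:
        name = self._names.get(ip)
        if name is None:
            # First time we hear from this IP: resolve once and keep the answer.
            name = self._resolver(ip)
            self._names[ip] = name
        return name
```

the resolver is passed in so the cache can be exercised without touching a real DNS server.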

if you look at the code in ./gmond/gmond.c you'll see that when a new host is heard from, a new host structure is created and added to the host cache.

the next step (which i will work on monday) is to save incoming messages to the host hash. this will require that i finish tweaking the protocol.x and build some wrapper code around it.

once that is finished, i just need to write code to walk the host hash and output xml.
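roughly, the last two steps (save messages into the host hash, then walk it and emit xml) could look like the python sketch below. the data layout, the function names, and the element/attribute names (CLUSTER, HOST, METRIC, NAME, IP, VAL) are assumptions for illustration only; the real message format depends on protocol.x, which isn't final.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Assumed layout: ip -> {"name": hostname, "metrics": {metric name: value}}
HostHash = Dict[str, dict]


def save_message(hosts: HostHash, ip: str, hostname: str, metric: str, value) -> None:
    """Store the latest value of one metric under the host it came from."""
    host = hosts.setdefault(ip, {"name": hostname, "metrics": {}})
    host["metrics"][metric] = value


def hosts_to_xml(hosts: HostHash) -> str:
    """Walk the host hash and serialize every host and metric as XML."""
    root = ET.Element("CLUSTER")
    for ip in sorted(hosts):
        host = hosts[ip]
        host_el = ET.SubElement(root, "HOST", {"NAME": host["name"], "IP": ip})
        for metric in sorted(host["metrics"]):
            value = str(host["metrics"][metric])
            ET.SubElement(host_el, "METRIC", {"NAME": metric, "VAL": value})
    return ET.tostring(root, encoding="unicode")
```

a later message for the same metric simply overwrites the earlier value, so the xml always reflects the most recent data heard from each host.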

lastly, i will need to write a simple cleanup function which enforces the host_dmax attribute.
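a sketch of what such a cleanup pass might do, in python. the names are hypothetical, and it assumes host_dmax is in seconds and that a value of 0 means "never delete a host":

```python
import time
from typing import Dict, List, Optional


def cleanup_hosts(
    last_heard: Dict[str, float], host_dmax: float, now: Optional[float] = None
) -> List[str]:
    """Drop hosts silent for more than host_dmax seconds; return the removed IPs.

    `last_heard` maps ip -> timestamp of the last message from that host.
    """
    if host_dmax <= 0:
        # Assumed convention: 0 disables expiry entirely.
        return []
    if now is None:
        now = time.time()
    expired = [ip for ip, seen in last_heard.items() if now - seen > host_dmax]
    for ip in expired:
        del last_heard[ip]
    return expired
```

running this periodically keeps dead nodes from lingering in the host hash (and in the xml output) forever.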

i'm hoping to have a working/testable version of 2.6.0 this week. i will be leaving for christmas break this weekend and returning the first week of january. i probably won't be working too hard on ganglia at that time... but i might have some time.

hope you guys are having (had) a great weekend.

btw, i'm working on setting up an opendarwin box for testing ganglia on darwin. i'm aware there are a few problems compiling it on darwin. i tried to test it on the sourceforge darwin boxes in the compile farm but it was painfully slow (i don't know if it would have finished in 24 hours).
the duplicate key_metric symbols problem is easy to fix.

-matt

--
PGP fingerprint 'A7C2 3C2F 8445 AD3C 135E F40B 242A 5984 ACBC 91D3'

   They that can give up essential liberty to obtain a little
      temporary safety deserve neither liberty nor safety.
  --Benjamin Franklin, Historical Review of Pennsylvania, 1759

