On Fri, Jan 7, 2011 at 15:25, Bernard Li <bern...@vanhpc.org> wrote:
> Hi all:
>
> Since the release of Ganglia 3.1, we have introduced the new
> configuration option send_metadata_interval in gmond.conf.  This is
> set to 0 by default, and the user must set it to a sane number if
> using unicast; otherwise, if gmonds are restarted, hosts may appear to
> be offline (this is documented in the release notes).  A bug has
> already been filed:
>
> http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=242
>
> We have recently had a lot of users hit this issue, and Vladimir
> recommends that we just set a sane number as the default and be done
> with it, since we end up spending a lot of time on IRC/mailing-list
> solving the same problem over and over again.
>
> Since there have been some commits to the 3.1 branch since tagging
> 3.1.7, I propose we just copy the 3.1.7 tag, update
> send_metadata_interval in the configuration file and release that as
> 3.1.8.
>
> This is not the normal procedure for making a release, so I'd like to
> get some feedback from other developers.
>
> BTW I am thinking of setting send_metadata_interval to 30 seconds.
> Also, does anybody know if this setting affects multicast setups in
> any way?

I think that it's fine to set this to a non-zero value, but I wonder
if 30 seconds is too frequent.  I did a quick check of the packets
that are actually sent, specifically the metadata packets.  I haven't
been able to really delve into the code to figure out exactly what's
going on (this part of the code isn't terribly transparent to me), but
I *think* that they are really large--on the order of several KB when
fully assembled, compared to roughly 100-120 bytes or less for a
typical metric packet.  I think that size will increase with the
number of metrics stored, since each one must be described in full XML
each time.

The reason for the large size is that an entire XML description of the
metrics appears to be sent each time.  Metadata packets also appear to
go over TCP, not UDP.

My testing was pretty simple:
1) set up a gmond (from SVN, well after 3.1 came out) in unicast mode.
2) set 'send_metadata_interval' to 1
3) disable all modules, except for 'mod_core'
4) remove all collection groups.
5) start gmond, and run tcpdump.

On a large cluster, with lots of metrics per host, I can see problems
if the metadata packets are sent too frequently.  I have hosts that
send well over 300 metrics (lots of CPU cores makes for lots of
metrics...).  Each of these needs to be described in the metadata
packets.
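
To put rough numbers on that concern, here is a back-of-the-envelope
sketch in Python.  Everything in it is an assumption for illustration,
not a measurement: the function name metadata_rate is made up, and the
1000 hosts and 4 KB of metadata per host are hypothetical figures
chosen only to match the "several KB" order of magnitude above.

```python
def metadata_rate(hosts, metadata_bytes, interval_s):
    """Aggregate metadata traffic (bytes/second) arriving at one collector.

    Assumes every host resends its full metadata once per interval.
    """
    if interval_s <= 0:
        # An interval of 0 means no periodic resend, so there is no rate.
        raise ValueError("interval must be a positive number of seconds")
    return hosts * metadata_bytes / interval_s


# Hypothetical cluster: 1000 hosts, 4 KB of metadata per host.
for interval in (30, 300, 600):
    rate = metadata_rate(1000, 4096, interval)
    print(f"{interval:>4} s interval -> {rate / 1024:.1f} KB/s of metadata")
```

The load scales inversely with the interval, so under these
assumptions going from 30 to 300 seconds cuts the metadata traffic by
a factor of ten.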

So I think that setting a non-zero default is fine.  But I think that
something like 300 or 600 seconds would be preferable.


-- 
Jesse Becker

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
