On 1/28/14 11:18 AM, Sergio Ballestrero wrote:
> On 28 Jan 2014, at 20:10, Adam Compton <acomp...@quantcast.com> wrote:
>
>> The gmond "globals" configuration option "host_tmax" controls how long a 
>> host can go without a heartbeat before being seen as "down"; it's set to 20 
>> by default, but the value in the config file gets multiplied by 4, so the 
>> default timeout is actually 80 seconds.
>>
>> http://manpages.ubuntu.com/manpages/raring/man5/gmond.conf.5.html#contenttoc2
>>
>> - Adam
> Sure, this is the setting of the "collector" gmond, but what I don't get is 
> the meaning of the gmetric option. Are you implying that it is just ignored, 
> and only the gmond setting counts?
There are separate tmax and dmax settings for each host and for each 
metric on a host. If a metric doesn't receive an update within its most 
recently specified tmax (from gmetric --tmax=NN), the metric is 
considered down; if a host receives no metric updates or heartbeats at 
all within its pre-configured host_tmax (from gmond.conf), the host 
itself is considered down. I believe (but do not recall offhand) that 
all metrics for a down host are considered down as well. So, for 
example, if the host tmax is 80 seconds and a given metric has a tmax of 
600 seconds, then once 80 seconds elapse with no heartbeat or updates, 
both the host and the metric are considered down even though the 
metric's tmax has not been reached.
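
To make the rule above concrete, here is a rough Python sketch of the 
logic as I described it. This is not Ganglia's actual code: the names 
(Metric, host_is_down, metric_is_down) are made up for illustration, 
the strict ">" comparison is an assumption, and the "down host implies 
down metrics" behaviour is my recollection rather than something I have 
verified.

```python
from dataclasses import dataclass

# Per the gmond.conf man page linked earlier in the thread, the configured
# host_tmax is multiplied by 4 to get the effective host timeout.
HOST_TMAX_MULTIPLIER = 4


@dataclass
class Metric:
    """Hypothetical record of one metric's freshness (illustration only)."""
    name: str
    tmax: int           # seconds, as last given via gmetric --tmax=NN
    last_update: float  # timestamp (seconds) of the last update received


def host_is_down(now: float, last_heard: float, host_tmax: int = 20) -> bool:
    """True once no heartbeat or metric update has arrived for more than
    host_tmax * 4 seconds (80 seconds with the default of 20)."""
    return now - last_heard > host_tmax * HOST_TMAX_MULTIPLIER


def metric_is_down(now: float, metric: Metric, host_last_heard: float,
                   host_tmax: int = 20) -> bool:
    """True if the host is down (assumed to take all its metrics with it),
    or if the metric itself has gone longer than its own tmax."""
    if host_is_down(now, host_last_heard, host_tmax):
        return True
    return now - metric.last_update > metric.tmax


if __name__ == "__main__":
    # The example from the text: host timeout 80 s, metric tmax 600 s,
    # last heartbeat and last metric update both at t=0.
    m = Metric(name="example_metric", tmax=600, last_update=0.0)
    for t in (60, 81):
        print(t, host_is_down(t, 0.0), metric_is_down(t, m, 0.0))
```

In this sketch the metric is reported down at t=81 only because the host 
is, which is the case I was describing; with a live host it would stay 
up until its own 600 seconds had passed.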

- Adam
