Thanks very much Nicholas. Your reply was very helpful and we are going
to try out your settings changes and patches.
On 09/17/2012 09:03 AM, Nicholas Satterly wrote:
> Hi Chris,
>
> I've discovered there are two contributing factors to problems like this.
>
> 1. the number of metrics being sent (possibly in short bursts) can overflow
> the UDP receive buffer.
In gmond.c:process_tc_accept_channel(), could those "goto" statements close the
socket and return without relinquishing the mutex?
Neil
On Sep 19, 2012, at 8:45 AM, Nicholas Satterly wrote:
> Hi Peter,
>
> Thanks for the feedback.
>
> I've added a thread mutex to the hosts hash table as you suggested and will
> send a pull request in the next day or so.
Hi Peter,
I've submitted another pull request covering a mutex for the hostdata hash
table.
Thanks again for your guidance.
Regards,
Nick
On Wed, Sep 19, 2012 at 5:53 PM, Peter Phaal wrote:
> Nick,
>
> I think you probably need two mutexes if you want to avoid blocking
> the UDP thread unnecessarily.
Nick,
I think you probably need two mutexes if you want to avoid blocking
the UDP thread unnecessarily.
1. a mutex on the hashtable that must be grabbed by the TCP thread when
it walks the hash table; the UDP thread would grab it any time it
adds or removes an entry from the hash table.
2. a mu
Hi Peter,
Thanks for the feedback.
I've added a thread mutex to the hosts hash table as you suggested and will
send a pull request in the next day or so.
Regards,
Nick
On Mon, Sep 17, 2012 at 8:25 PM, Peter Phaal wrote:
> Nicholas,
>
> It makes sense to multi-thread gmond, but looking at your patch, I
> don't see any locking associated with the hosts hashtable.
Nicholas,
It makes sense to multi-thread gmond, but looking at your patch, I
don't see any locking associated with the hosts hashtable. Isn't there
a possible race if new hosts/metrics are added to the hashtable by the
UDP thread at the same time the hashtable is being walked by the TCP
thread?
Peter
Hi Chris,
I've discovered there are two contributing factors to problems like this.
1. the number of metrics being sent (possibly in short bursts) can overflow
the UDP receive buffer.
2. the time it takes to process metrics in the UDP receive buffer causes
TCP connections from the gmetads to time out.
We use ganglia to monitor > 500 hosts in multiple datacenters with about
90k unique host:metric pairs per DC. We use this data for all of the
cool graphs in the web UI and for passive alerting.
One of our checks is to measure TN of load_one on every box (we want to
make sure gmond is working and