I just checked a few of my servers which have had this issue. We are using Monit 5.5 and M/Monit 2.4. The Monit poll time is 15 seconds. I have changed the Acceptable Report Skew from 3 to 10 for these servers, and I will monitor them for a day to see whether the issue persists. If this doesn't fix it, I will probably send the logs across.
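For reference, here is a rough sketch of what that change means in wall-clock terms. It assumes the Acceptable Report Skew is counted in Monit poll cycles, which is my reading of the setting and not something confirmed in this thread. The function name is made up for illustration.

```python
def offline_threshold_seconds(poll_interval_s: int, report_skew_cycles: int) -> int:
    """Seconds without a report before a host would be flagged.

    Assumption: the skew is a number of poll cycles, so the allowed
    silence is simply poll interval multiplied by skew.
    """
    if poll_interval_s <= 0 or report_skew_cycles <= 0:
        raise ValueError("poll interval and skew must be positive")
    return poll_interval_s * report_skew_cycles


if __name__ == "__main__":
    poll = 15  # seconds, the poll time mentioned above
    for skew in (3, 10):  # old and new skew values
        print(f"skew={skew}: flagged after {offline_threshold_seconds(poll, skew)} s")
```

Under that assumption, the old setting tolerated 45 seconds of silence and the new one tolerates 150 seconds.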
Also, I ran monit -v on one of the hosts, and I saw the following:

M/Monit(s) = http://192.168.48.1:8080/collector with timeout 5 seconds using credentials

Where is this 5 second timeout set? Will increasing this timeout fix the issue?

Abdul Munim Kazia
[email protected]

On 23 January 2014 15:53, Martin Pala <[email protected]> wrote:

> I think this isn't a scalability issue - we test with far higher load than
> 100 hosts (although in the long term we may add load balancing support).
>
> Regards,
> Martin
>
> On 23 Jan 2014, at 11:04, Roose, Marco <[email protected]> wrote:
>
> Hi,
>
> I think this is a really interesting topic. One should think about the
> implementation of load balancing (and with that High Availability) into
> m/monit. Is there anything focusing on that in the development queue?
>
> Best regards,
>
> Marco Roose
>
> *From:* [email protected] [mailto:[email protected]<[email protected]>]
> *On Behalf Of* Abdul Munim Kazia
> *Sent:* Thursday, January 23, 2014 10:38 AM
> *To:* [email protected]
> *Subject:* Hosts on M/Monit keep going offline and online at regular
> intervals
>
> Hello
>
> I have been using monit and m/monit at my organization for the past year
> or so. What started out as a simple setup with a handful of servers has now
> become a full monitoring dashboard with more than 100 hosts listed on
> m/monit.
>
> While monit itself works perfectly, at every regular interval of a few
> minutes, some of the hosts get reported as "no report from monit" in the
> m/monit console. The report comes in a few seconds and the host becomes
> green again. This happens fairly regularly, and it happens to a different
> set of hosts every time. All the hosts and the m/monit host itself are
> hosted in the same data center, so I don't think that network latency is an
> issue.
>
> This could be occurring because m/monit doesn't receive the data for all
> hosts in time, and it wouldn't be an issue for me, if it didn't crowd the
> events list with 20-30 events every half an hour.
>
> I have looked at the server.conf file, but I don't think any of those
> configuration settings will help me out, if I am not wrong. Has anyone
> faced this issue before? Is there any way to fix this, either from monit's
> or m/monit's end? Can I increase this timeout?
>
> Thanks for your help
>
> Abdul Munim Kazia
>
> munimkazia.com
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general
