Even if we always restart on every glitch, we still need to monitor the inside of the VR to know when to restart or respin a new one. There is a lot of functionality on the VR, and we cannot say for sure what is important to a given customer installation, so the admin should be able to define the minimal requirements that, when met, stop us from spinning up a new VR. There must also be tools for monitoring these requirements.
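To make the idea concrete, here is a rough sketch of such an admin-defined check. Everything in it is an assumption for illustration only: `VrPolicy`, `needs_respin`, and the service names are hypothetical and do not exist in CloudStack, and how the status map gets filled in (monit, SNMP, or something else) is left open.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class VrPolicy:
    """Hypothetical admin-defined policy: the services that must be healthy."""
    required_services: Set[str] = field(default_factory=set)


def needs_respin(policy: VrPolicy, service_status: Dict[str, bool]) -> bool:
    """Return True if any required service is down or was not reported at all."""
    return any(
        not service_status.get(name, False)
        for name in policy.required_services
    )


# Example: this installation only cares about DHCP and DNS.
policy = VrPolicy(required_services={"dhcp", "dns"})
status = {"dhcp": True, "dns": True, "haproxy": False}

# haproxy is down, but it is not a required service here, so no respin.
print(needs_respin(policy, status))  # False
```

A service missing from the status map is treated as down, so a monitor that stops reporting also triggers a respin.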
Makes sense?

On Thu, Sep 26, 2013 at 10:01 PM, David Nalley <da...@gnsa.us> wrote:

> For what it's worth we created an ACS-specific MIB (beneath the
> org.apache MIB) so really this is just a matter of defining and
> publishing it.
>
> But lets think about monit being used to restart services - with HA,
> Redundant VR, are we sure that we want to inject yet another point of
> control into things? Is it better to just respawn an instance since
> they are essentially stateless? I don't know, but management server,
> local daemons, and other SysVMs making decisions seems like we are
> increasing complexity.
>
> --David
>
> On Thu, Sep 26, 2013 at 10:31 AM, Chiradeep Vittal
> <chiradeep.vit...@citrix.com> wrote:
> > In this case you would have to invent another enterprise MIB. Not too
> > hard, but I'd argue that it needs to be proxied through some other
> > service anyway and it represents a different integration point with
> > ACS. Depends on whether you consider the system vm part of the ACS
> > deployment, or an entity like a host.
> >
> > On 9/26/13 10:27 AM, "Alex Huang" <alex.hu...@citrix.com> wrote:
> >
> >>Using SNMP for alert notification is not a bad idea though. I don't see
> >>why we can't do that instead of posting to the management server. This
> >>is specifically referring to the second part of the proposal. Why
> >>reinvent that part of it?
> >>
> >>--Alex
> >>
> >>> -----Original Message-----
> >>> From: Chiradeep Vittal [mailto:chiradeep.vit...@citrix.com]
> >>> Sent: Wednesday, September 25, 2013 10:28 PM
> >>> To: dev@cloudstack.apache.org
> >>> Subject: Re: [PROPOSAL] Service monitoring tool in virtual router
> >>>
> >>> SNMP wouldn't restart a failed process nor would it generate alerts.
> >>> It is simply too generic for the requirements outlined here. The
> >>> proposal does not talk about modifying monit, just using it. That
> >>> wouldn't trigger the AGPL.
> >>> I think the idea is to have a tight monitoring loop that scales: so
> >>> executing the monitoring loop in-situ makes sense.
> >>>
> >>>
> >>> On 9/25/13 9:53 PM, "David Nalley" <da...@gnsa.us> wrote:
> >>>
> >>> >On Wed, Sep 25, 2013 at 9:30 AM, Jayapal Reddy Uradi
> >>> ><jayapalreddy.ur...@citrix.com> wrote:
> >>> >> Hi,
> >>> >>
> >>> >> Currently in virtual router there is no way to recover and notify if
> >>> >> some service goes down unexpectedly.
> >>> >>
> >>> >> This feature is about monitoring all the services rendered by the
> >>> >> virtual router, ensure that the services are running through the life
> >>> >> time of the VR.
> >>> >>
> >>> >> On service failure:
> >>> >> 1. Generate an alert and event indicating failure
> >>> >> 2. Restart the service
> >>> >>
> >>> >> Services to be monitored:
> >>> >> DHCP, DNS, haproxy, password server etc.
> >>> >>
> >>> >> As part of monitoring there are two activities
> >>> >>
> >>> >> 1. One is monitoring the services in VR and log the events. Using
> >>> >> monit for monitoring services
> >>> >> 2. Second part is pushing alerts from router to MS server. Thinking
> >>> >> on POST the logs to web server in MS.
> >>> >>
> >>> >> I will be updating more details and FS in this thread.
> >>> >>
> >>> >> I created enhancement bug for this.
> >>> >> https://issues.apache.org/jira/browse/CLOUDSTACK-4736
> >>> >>
> >>> >> Thanks,
> >>> >> Jayapal
> >>> >
> >>> >So several things - why not make this via SNMP? Query processes, and
> >>> >many other things. This should be relatively simple, is well known, can
> >>> >be locked down (or could be monitored for many other things by external
> >>> >monitoring packages) and is the defacto standard for monitoring hosts.
> >>> >Second - monit is Affero GPL licensed - which is a cat-x license.
> >>> >While I expect that we would merely use this and not do any hacking on
> >>> >it - I think its inclusion might be a surprise (and forbidden in many
> >>> >environments) to our users
> >>> >
> >>> >--David