Good idea. If x and y and z are borked, initiate shutdown?

More generically, it seems we need some form of in-VM automation that can
co-ordinate with top-level orchestration.
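
A minimal sketch of the "if x and y and z are borked" rule, purely illustrative: the function name decide_action, the action strings, and the idea of passing in a status map are all assumptions, not anything that exists in ACS or monit. The admin-defined list of critical services stands in for the "minimal reqs" discussed below; DHCP, DNS and haproxy are taken from the original proposal.

```python
from typing import Dict, Iterable


def decide_action(status: Dict[str, bool], critical: Iterable[str]) -> str:
    """Pick an action from per-service health (hypothetical policy).

    status   -- maps service name to True (healthy) or False (down)
    critical -- admin-defined services the VR cannot do without

    A service missing from `status` is treated as down.
    """
    critical = list(critical)
    down = [name for name in critical if not status.get(name, False)]

    # "x and y and z are borked": every critical service is down.
    if critical and len(down) == len(critical):
        return "shutdown"
    # Some, but not all, critical services are down: just raise an alert.
    if down:
        return "alert"
    return "ok"


if __name__ == "__main__":
    health = {"dhcp": False, "dns": False, "haproxy": False}
    print(decide_action(health, ["dhcp", "dns", "haproxy"]))
```

The decision itself is kept free of side effects so that whatever does the top-level orchestration could act on the returned string; how that hand-off would work is left open here.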

On 9/28/13 4:14 AM, "Daan Hoogland" <daan.hoogl...@gmail.com> wrote:

>Even when always restarting on every glitch we need to monitor the inside
>of the vr to know when to restart/respin a new vr. There is much
>functionality present on the vr and for us it is not possible to say for
>sure what is important to a customer installation so the admin should be
>able to define the minimal reqs that will stop us from spinning up a new
>vr. And there must be tools present for monitoring these reqs.
>
>makes sense?
>
>
>On Thu, Sep 26, 2013 at 10:01 PM, David Nalley <da...@gnsa.us> wrote:
>
>> For what it's worth we created an ACS-specific MIB (beneath the
>> org.apache MIB) so really this is just a matter of defining and
>> publishing it.
>>
>> But let's think about monit being used to restart services - with HA,
>> Redundant VR, are we sure that we want to inject yet another point of
>> control into things? Is it better to just respawn an instance since
>> they are essentially stateless? I don't know, but management server,
>> local daemons, and other SysVMs making decisions seems like we are
>> increasing complexity.
>>
>> --David
>>
>> On Thu, Sep 26, 2013 at 10:31 AM, Chiradeep Vittal
>> <chiradeep.vit...@citrix.com> wrote:
>> > In this case you would have to invent another enterprise MIB. Not too
>> > hard, but I'd argue that it needs to be proxied through some other
>> > service anyway and it represents a different integration point with
>> > ACS. Depends on whether you consider the system vm part of the ACS
>> > deployment, or an entity like a host.
>> >
>> > On 9/26/13 10:27 AM, "Alex Huang" <alex.hu...@citrix.com> wrote:
>> >
>> >>Using SNMP for alert notification is not a bad idea though.  I don't see
>> >>why we can't do that instead of posting to the management server.  This
>> >>is specifically referring to the second part of the proposal.  Why
>> >>reinvent that part of it?
>> >>
>> >>--Alex
>> >>
>> >>> -----Original Message-----
>> >>> From: Chiradeep Vittal [mailto:chiradeep.vit...@citrix.com]
>> >>> Sent: Wednesday, September 25, 2013 10:28 PM
>> >>> To: dev@cloudstack.apache.org
>> >>> Subject: Re: [PROPOSAL] Service monitoring tool in virtual router
>> >>>
>> >>> SNMP wouldn't restart a failed process nor would it generate alerts.
>> >>> It is simply too generic for the requirements outlined here. The
>> >>> proposal does not talk about modifying monit, just using it. That
>> >>> wouldn't trigger the AGPL. I think the idea is to have a tight
>> >>> monitoring loop that scales: so executing the monitoring loop in-situ
>> >>> makes sense.
>> >>>
>> >>>
>> >>> On 9/25/13 9:53 PM, "David Nalley" <da...@gnsa.us> wrote:
>> >>>
>> >>> >On Wed, Sep 25, 2013 at 9:30 AM, Jayapal Reddy Uradi
>> >>> ><jayapalreddy.ur...@citrix.com> wrote:
>> >>> >> Hi,
>> >>> >>
>> >>> >> Currently in the virtual router there is no way to recover and
>> >>> >> notify if some service goes down unexpectedly.
>> >>> >>
>> >>> >> This feature is about monitoring all the services rendered by the
>> >>> >> virtual router and ensuring that the services are running through
>> >>> >> the lifetime of the VR.
>> >>> >>
>> >>> >> On service failure:
>> >>> >> 1. Generate an alert and event indicating failure
>> >>> >> 2. Restart the service
>> >>> >>
>> >>> >> Services to be monitored:
>> >>> >> DHCP, DNS, haproxy, password server etc.
>> >>> >>
>> >>> >> As part of monitoring there are two activities:
>> >>> >>
>> >>> >> 1. Monitoring the services in the VR and logging the events, using
>> >>> >>    monit to monitor the services.
>> >>> >> 2. Pushing alerts from the router to the MS server. The current
>> >>> >>    thinking is to POST the logs to a web server in the MS.
>> >>> >>
>> >>> >> I will be updating more details and FS in this thread.
>> >>> >>
>> >>> >> I created enhancement bug for this.
>> >>> >> https://issues.apache.org/jira/browse/CLOUDSTACK-4736
>> >>> >>
>> >>> >> Thanks,
>> >>> >> Jayapal
>> >>> >
>> >>> >So several things - why not make this via SNMP? Query processes, and
>> >>> >many other things. This should be relatively simple, is well known,
>> >>> >can be locked down (or could be monitored for many other things by
>> >>> >external monitoring packages) and is the de facto standard for
>> >>> >monitoring hosts.
>> >>> >Second - monit is Affero GPL licensed - which is a cat-x license.
>> >>> >While I expect that we would merely use this and not do any hacking on
>> >>> >it - I think its inclusion might be a surprise (and forbidden in many
>> >>> >environments) to our users.
>> >>> >
>> >>> >--David
>> >>
>> >
>>
