Re: [PROPOSAL] Service monitoring tool in virtual router

Koushik Das Mon, 30 Sep 2013 22:47:57 -0700

This is a very useful feature. Can this be extended to the other system VMs
(SSVM and CPVM)?


Based on the discussion I see that there is an assumption that restarting 
services/rebooting should fix the issues. Is that always true? What if the 
service fails to restart after repeated attempts? What is the fallback?
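
To make the question concrete, here is a rough sketch in Python of a
bounded-retry loop that hands off to a fallback once restarts stop helping.
It is purely illustrative: recover_service and its three callbacks are
hypothetical names, not part of the proposal, monit, or CloudStack, and the
fallback itself (alert the MS, respin the VR, shut down) is left open.

```python
from typing import Callable


def recover_service(
    is_healthy: Callable[[], bool],
    restart: Callable[[], None],
    escalate: Callable[[int], None],
    max_attempts: int = 3,
) -> bool:
    """Try to restart a failed service a bounded number of times.

    All three callbacks are assumptions supplied by the caller:
      is_healthy() reports whether the service currently works,
      restart() attempts one restart,
      escalate(n) is the fallback, called after n failed restarts.
    Returns True if the service is healthy on exit, False if we escalated.
    """
    if is_healthy():
        return True  # nothing to do
    for _ in range(max_attempts):
        restart()
        if is_healthy():
            return True  # a restart fixed it
    # Restarts did not help: stop retrying and hand off to the fallback.
    escalate(max_attempts)
    return False
```

The point of the sketch is only that the retry count is capped and that
something other than "restart again" runs once the cap is reached.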

-Koushik


On 01-Oct-2013, at 3:15 AM, Chiradeep Vittal <chiradeep.vit...@citrix.com> 
wrote:

> Good idea. If x and y and z are borked, initiate shutdown?
> 
> More generically, it seems we need some form of in-VM automation that can
> co-ordinate with top-level orchestration
> 
> On 9/28/13 4:14 AM, "Daan Hoogland" <daan.hoogl...@gmail.com> wrote:
> 
>> Even when always restarting on every glitch we need to monitor the inside
>> of the vr to know when to restart/respin a new vr. There is much
>> functionality present on the vr and for us it is not possible to say for
>> sure what is important to a customer installation so the admin should be
>> able to define the minimal reqs that will stop us from spinning up a new
>> vr. And there must be tools present for monitoring these reqs.
>> 
>> makes sense?
>> 
>> 
>> On Thu, Sep 26, 2013 at 10:01 PM, David Nalley <da...@gnsa.us> wrote:
>> 
>>> For what it's worth we created an ACS-specific MIB (beneath the
>>> org.apache MIB) so really this is just a matter of defining and
>>> publishing it.
>>> 
>>> But let's think about monit being used to restart services - with HA,
>>> Redundant VR, are we sure that we want to inject yet another point of
>>> control into things? Is it better to just respawn an instance since
>>> they are essentially stateless? I don't know, but management server,
>>> local daemons, and other SysVMs making decisions seems like we are
>>> increasing complexity.
>>> 
>>> --David
>>> 
>>> On Thu, Sep 26, 2013 at 10:31 AM, Chiradeep Vittal
>>> <chiradeep.vit...@citrix.com> wrote:
>>>> In this case you would have to invent another enterprise MIB. Not too
>>>> hard, but I'd argue that it needs to be proxied through some other
>>>> service anyway and it represents a different integration point with ACS.
>>>> Depends on whether you consider the system vm part of the ACS
>>>> deployment, or an entity like a host.
>>>> 
>>>> On 9/26/13 10:27 AM, "Alex Huang" <alex.hu...@citrix.com> wrote:
>>>> 
>>>>> Using SNMP for alert notification is not a bad idea though.  I don't
>>>>> see why we can't do that instead of posting to the management server.
>>>>> This is specifically referring to the second part of the proposal.  Why
>>>>> reinvent that part of it?
>>>>> 
>>>>> --Alex
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Chiradeep Vittal [mailto:chiradeep.vit...@citrix.com]
>>>>>> Sent: Wednesday, September 25, 2013 10:28 PM
>>>>>> To: dev@cloudstack.apache.org
>>>>>> Subject: Re: [PROPOSAL] Service monitoring tool in virtual router
>>>>>> 
>>>>>> SNMP wouldn't restart a failed process nor would it generate alerts.
>>>>>> It is simply too generic for the requirements outlined here. The
>>>>>> proposal does not talk about modifying monit, just using it. That
>>>>>> wouldn't trigger the AGPL.
>>>>>> I think the idea is to have a tight monitoring loop that scales: so
>>>>>> executing the monitoring loop in-situ makes sense.
>>>>>> 
>>>>>> 
>>>>>> On 9/25/13 9:53 PM, "David Nalley" <da...@gnsa.us> wrote:
>>>>>> 
>>>>>>> On Wed, Sep 25, 2013 at 9:30 AM, Jayapal Reddy Uradi
>>>>>>> <jayapalreddy.ur...@citrix.com> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> Currently in the virtual router there is no way to recover and
>>>>>>>> notify if some service goes down unexpectedly.
>>>>>>>>
>>>>>>>> This feature is about monitoring all the services rendered by the
>>>>>>>> virtual router, ensuring that the services are running throughout
>>>>>>>> the lifetime of the VR.
>>>>>>>> 
>>>>>>>> On service failure:
>>>>>>>> 1. Generate an alert and event indicating failure
>>>>>>>> 2. Restart the service
>>>>>>>> 
>>>>>>>> Services to be monitored:
>>>>>>>> DHCP, DNS, haproxy, password server etc.
>>>>>>>> 
>>>>>>>> As part of monitoring there are two activities:
>>>>>>>>
>>>>>>>> 1. Monitoring the services in the VR and logging the events, using
>>>>>>>>    monit to monitor the services.
>>>>>>>> 2. Pushing alerts from the router to the MS. The current thinking
>>>>>>>>    is to POST the logs to a web server in the MS.
>>>>>>>> 
>>>>>>>> I will be updating more details and FS in this thread.
>>>>>>>> 
>>>>>>>> I created an enhancement bug for this.
>>>>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-4736
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Jayapal
>>>>>>> 
>>>>>>> So several things - why not make this via SNMP? Query processes,
>>>>>>> and many other things. This should be relatively simple, is well
>>>>>>> known, can be locked down (or could be monitored for many other
>>>>>>> things by external monitoring packages) and is the de facto standard
>>>>>>> for monitoring hosts.
>>>>>>> Second - monit is Affero GPL licensed - which is a cat-x license.
>>>>>>> While I expect that we would merely use this and not do any hacking
>>>>>>> on it - I think its inclusion might be a surprise (and forbidden in
>>>>>>> many environments) to our users.
>>>>>>> 
>>>>>>> --David
>>>>> 
>>>> 
>>> 
>
