Hi Greg,

Thank you, Adam, for the follow-up.

This is a new feature for masakari-monitors, and I think Masakari can accommodate it there. From the implementation perspective, it is not that hard to do. However, as you can see in our Boston presentation, Masakari will replace its monitoring parts (that is, masakari-monitors) with nova-host-alerter, **-process-alerter, and **-instance-alerter. (The ** part is not defined yet.. :p)

Therefore, I would like to record this in a specification and make sure we do not miss anything in the transformation. Does it make sense to write a simple spec for this in masakari-specs [1]? Then we can discuss the requirements and how to implement them.

[1] https://github.com/openstack/masakari-specs

--- Regards,
Sampath

On Thu, May 18, 2017 at 2:29 AM, Adam Spiers <[email protected]> wrote:
> I don't see any reason why masakari couldn't handle that, but you'd
> have to ask Sampath and the masakari team whether they would consider
> that in scope for their roadmap.
>
> Waines, Greg <[email protected]> wrote:
>>
>> Sure. I can propose a new user story.
>>
>> And then are you thinking of including this user story in the scope of
>> what masakari would be looking at ?
>>
>> Greg.
>>
>>
>> From: Adam Spiers <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Wednesday, May 17, 2017 at 10:08 AM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
>> Healthcheck Monitoring
>>
>> Thanks for the clarification Greg. This sounds like it has the
>> potential to be a very useful capability. May I suggest that you
>> propose a new user story for it, along similar lines to this existing
>> one?
>>
>> http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/ha_vm.html
>>
>> Waines, Greg <[email protected]> wrote:
>> Yes that’s correct.
>> VM Heartbeating / Health-check Monitoring would introduce intrusive /
>> white-box type monitoring of VMs / Instances.
>>
>> I realize this is somewhat in the gray-zone of what a cloud should be
>> monitoring or not,
>> but I believe it provides an alternative for Applications deployed in VMs
>> that do not have an external monitoring/management entity like a VNF Manager
>> in the MANO architecture.
>> And even for VMs with VNF Managers, it provides a highly reliable
>> alternate monitoring path that does not rely on Tenant Networking.
>>
>> You’re correct, that VM HB/HC Monitoring would leverage
>> https://wiki.libvirt.org/page/Qemu_guest_agent
>> that would require the agent to be installed in the images for talking
>> back to the compute host.
>> ( there are other examples of similar approaches in openstack ... the
>> murano-agent for installation, the swift-agent for object store management )
>> Although here, in the case of VM HB/HC Monitoring, via the QEMU Guest
>> Agent, the messaging path is internal thru a QEMU virtual serial device.
>> i.e. a very simple interface with very few dependencies ... it’s up and
>> available very early in VM lifecycle and virtually always up.
>>
>> Wrt failure modes / use-cases
>>
>> * a VM’s response to a Heartbeat Challenge Request can be as
>>   simple as just ACK-ing,
>>   this alone allows for detection of:
>>     - a failed or hung QEMU/KVM instance, or
>>     - a failed or hung VM’s OS, or
>>     - a failure of the VM’s OS to schedule the QEMU Guest Agent daemon, or
>>     - a failure of the VM to route basic IO via linux sockets.
>>
>> * I have had feedback that this is similar to the virtual hardware
>>   watchdog of QEMU/KVM
>>   ( https://libvirt.org/formatdomain.html#elementsWatchdog )
>>
>> * However, the VM Heartbeat / Health-check Monitoring
>>     - provides a higher-level (i.e. application-level) heartbeating
>>         + i.e. if the Heartbeat requests are being answered by the
>>           Application running within the VM
>>     - provides more than just heartbeating, as the Application can use
>>       it to trigger a variety of audits,
>>     - provides a mechanism for the Application within the VM to report a
>>       Health Status / Info back to the Host / Cloud,
>>     - provides notification of the Heartbeat / Health-check status to
>>       higher-level cloud entities thru Vitrage
>>         + e.g. VM-Heartbeat-Monitor - to - Vitrage - (EventAlarm) - Aodh - ... - VNF-Manager
>>                                                    - (StateChange) - Nova - ... - VNF Manager
>>
>> Greg.
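
To make the simplest case above concrete (a Heartbeat Challenge Request answered by a plain ACK), here is a minimal Python sketch of the host-side bookkeeping. It is only an illustration under stated assumptions, not how masakari-monitors or any proposed monitor is implemented: the class name, the `send_challenge` callable, the three state names, and the "three consecutive misses" policy are all hypothetical. In a real deployment the callable might wrap something like the QEMU Guest Agent's `guest-ping` command, but that wiring is deliberately left out here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HeartbeatMonitor:
    """Tracks consecutive missed heartbeat ACKs for one VM (hypothetical helper)."""

    # Hypothetical callable: sends one challenge and returns True if the
    # guest ACKed before the deadline, False otherwise.
    send_challenge: Callable[[], bool]
    # Assumed policy: declare failure after this many consecutive misses.
    max_missed: int = 3
    missed: int = 0

    def poll(self) -> str:
        """Send one challenge and return 'healthy', 'degraded', or 'failed'."""
        try:
            acked = self.send_challenge()
        except Exception:
            # Treat transport errors (e.g. agent unreachable) as a missed ACK.
            acked = False
        if acked:
            self.missed = 0
            return "healthy"
        self.missed += 1
        return "failed" if self.missed >= self.max_missed else "degraded"


# Usage with a scripted guest that stops answering after two ACKs.
replies = iter([True, True, False, False, False])
monitor = HeartbeatMonitor(send_challenge=lambda: next(replies))
states = [monitor.poll() for _ in range(5)]
```

A missed ACK on its own cannot say which of the listed failures occurred (hung QEMU/KVM, hung guest OS, unscheduled agent, broken IO path), only that one of them did. That is why the sketch reports a single "failed" state rather than a cause.
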
>>
>>
>> From: Adam Spiers <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, May 16, 2017 at 7:29 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
>> Healthcheck Monitoring
>>
>> Waines, Greg <[email protected]> wrote:
>> thanks for the pointers Sam.
>>
>> I took a quick look.
>> I agree that the VM Heartbeat / Health-check looks like a good fit into
>> Masakari.
>>
>> Currently your instance monitoring looks like it is strictly black-box
>> type monitoring thru libvirt events.
>> Is that correct ?
>> i.e. you do not do any intrusive type monitoring of the instance thru the
>> QEMU Guest Agent facility
>> correct ?
>>
>> That is correct:
>>
>> https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/instancemonitor/instance.py
>>
>> I think this is what VM Heartbeat / Health-check would add to Masakari.
>> Let me know if you agree.
>>
>> OK, so you are looking for something slightly different I guess, based
>> on this QEMU guest agent?
>>
>> https://wiki.libvirt.org/page/Qemu_guest_agent
>>
>> That would require the agent to be installed in the images, which is
>> extra work but I imagine quite easily justifiable in some scenarios.
>> What failure modes do you have in mind for covering with this
>> approach - things like the guest kernel freezing, for instance?
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
