Re: [openstack-dev] [Vitrage] About alarms reported by datasource and the alarms generated by vitrage evaluator

Yujun Zhang Fri, 06 Jan 2017 23:32:09 -0800

The two questions raised by YinLiYin is actually one, i.e. *how to enrich
the alarm properties *that can be used as an condition in root cause
deducing.


Both 'suspect' or 'datasource' are additional information that may be
referred as a condition in general fault model, a.k.a. scenario in vitrage.

It seems it could be done by

   1. introduce a flexible `metadata` dict in to ALARM entity
   2. Allow generating update event[1] on metadata change
   3. Allow using ALARM metadata in scenario condition
   4. Allow setting ALARM metadata in scenario action

This will leave the flexibility to continuous development by defining a
complex scenario template and keep the vitrage evaluator simple and generic.

My two cents.

[1]:
http://docs.openstack.org/developer/vitrage/scenario-evaluator.html#concepts-and-guidelines


On Sat, Jan 7, 2017 at 2:23 AM Afek, Ifat (Nokia - IL) <[email protected]>
wrote:

> Hi YinLiYin,
>
>
>
> This is an interesting question. Let me divide my answer to two parts.
>
>
>
> First, the case that you described with Nagios and Vitrage. This problem
> depends on the specific Nagios tests that you configure in your system, as
> well as on the Vitrage templates that you use. For example, you can use
> Nagios/Zabbix to monitor the physical layer, and Vitrage to raise deduced
> alarms on the virtual and application layers. This way you will never have
> duplicated alarms. If you want to use Nagios to monitor the other layers as
> well, you can simply modify Vitrage templates so they don’t raise the
> deduced alarms that Nagios may generate, and use the templates to show RCA
> between different Nagios alarms.
>
>
>
> Now let’s talk about the more general case. Vitrage can receive alarms
> from different monitors, including Nagios, Zabbix, collectd and Aodh. If
> you are using more than one monitor, it is possible that the same alarm
> (maybe with a different name) will be raised twice. We need to create a
> mechanism to identify such cases and create a single alarm with the
> properties of both monitors. This has not been designed in details yet, so
> if you have any suggestion we will be happy to hear them.
>
>
>
> Best Regards,
>
> Ifat.
>
>
>
>
>
> *From: *"[email protected]" <[email protected]>
> *Reply-To: *"OpenStack Development Mailing List (not for usage
> questions)" <[email protected]>
> *Date: *Friday, 6 January 2017 at 03:27
> *To: *"[email protected]" <
> [email protected]>
> *Cc: *"[email protected]" <[email protected]>, "
> [email protected]" <[email protected]>, "[email protected]" <
> [email protected]>, "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>
> *Subject: *[openstack-dev] [Vitrage] About alarms reported by datasource
> and the alarms generated by vitrage evaluator
>
>
>
> Hi all,
>
>    Vitrage generate alarms acording to the templates. All the alarms
> raised by vitrage has the type "vitrage". Suppose Nagios has an alarm A.
> Alarm A is raised by vitrage evaluator according to the action part of a
> scenario, type of alarm A is "vitrage". If Nagios reported alarm A latter,
> a new alarm A with type "Nagios" would be generator in the entity graph.
>   There would be two vertices for the same alarm in the graph. And we have
> to define two alarm entities, two relationships, two scenarios in the
> template file to make the alarm propagation procedure work.
>
>    It is inconvenient to describe fault model of system with lot of
> alarms. How to solve this problem?
>
>
>
> 殷力殷 YinLiYin
>
>
>
>
>
>
> 上海市浦东新区碧波路889号中兴研发大楼D502
> D502, ZTE Corporation R&D Center, 889# Bibo Road,
> Zhangjiang Hi-tech Park, Shanghai, P.R.China, 201203
> T: +86 21 68896229 <+86%2021%206889%206229>
> M: +86 13641895907 <+86%20136%204189%205907>
> E: [email protected]
> www.zte.com.cn
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [Vitrage] About alarms reported by datasource and the alarms generated by vitrage evaluator

Reply via email to