The two questions raised by YinLiYin are actually one, i.e. *how to enrich the alarm properties* that can be used as a condition in root cause deduction.
Both 'suspect' and 'datasource' are additional information that may be referred to as a condition in a general fault model, a.k.a. a scenario in Vitrage.

It seems it could be done by:

1. Introducing a flexible `metadata` dict into the ALARM entity
2. Allowing an update event[1] to be generated on metadata change
3. Allowing ALARM metadata to be used in a scenario condition
4. Allowing ALARM metadata to be set in a scenario action

This leaves the flexibility for continuous development in the scenario templates, which can be as complex as needed, and keeps the Vitrage evaluator simple and generic.

My two cents.

[1]: http://docs.openstack.org/developer/vitrage/scenario-evaluator.html#concepts-and-guidelines

On Sat, Jan 7, 2017 at 2:23 AM Afek, Ifat (Nokia - IL) <ifat.a...@nokia.com> wrote:

> Hi YinLiYin,
>
> This is an interesting question. Let me divide my answer into two parts.
>
> First, the case that you described with Nagios and Vitrage. This problem
> depends on the specific Nagios tests that you configure in your system, as
> well as on the Vitrage templates that you use. For example, you can use
> Nagios/Zabbix to monitor the physical layer, and Vitrage to raise deduced
> alarms on the virtual and application layers. This way you will never have
> duplicated alarms. If you want to use Nagios to monitor the other layers as
> well, you can simply modify Vitrage templates so they don’t raise the
> deduced alarms that Nagios may generate, and use the templates to show RCA
> between different Nagios alarms.
>
> Now let’s talk about the more general case. Vitrage can receive alarms
> from different monitors, including Nagios, Zabbix, collectd and Aodh. If
> you are using more than one monitor, it is possible that the same alarm
> (maybe with a different name) will be raised twice. We need to create a
> mechanism to identify such cases and create a single alarm with the
> properties of both monitors. This has not been designed in detail yet, so
> if you have any suggestions we will be happy to hear them.
>
> Best Regards,
> Ifat.
>
> *From:* "yinli...@zte.com.cn" <yinli...@zte.com.cn>
> *Reply-To:* "OpenStack Development Mailing List (not for usage
> questions)" <openstack-dev@lists.openstack.org>
> *Date:* Friday, 6 January 2017 at 03:27
> *To:* "openstack-dev@lists.openstack.org" <openstack-dev@lists.openstack.org>
> *Cc:* "gong.yah...@zte.com.cn" <gong.yah...@zte.com.cn>,
> "han.jin...@zte.com.cn" <han.jin...@zte.com.cn>,
> "wang.we...@zte.com.cn" <wang.we...@zte.com.cn>,
> "jia.peiy...@zte.com.cn" <jia.peiy...@zte.com.cn>,
> "zhang.yuj...@zte.com.cn" <zhang.yuj...@zte.com.cn>
> *Subject:* [openstack-dev] [Vitrage] About alarms reported by datasource
> and the alarms generated by vitrage evaluator
>
> Hi all,
>
> Vitrage generates alarms according to the templates. All the alarms
> raised by Vitrage have the type "vitrage". Suppose Nagios has an alarm A.
> Alarm A is raised by the Vitrage evaluator according to the action part of
> a scenario, so the type of alarm A is "vitrage". If Nagios reports alarm A
> later, a new alarm A with type "Nagios" will be generated in the entity
> graph. There will be two vertices for the same alarm in the graph, and we
> have to define two alarm entities, two relationships, and two scenarios in
> the template file to make the alarm propagation procedure work.
>
> It is inconvenient to describe the fault model of a system with a lot of
> alarms. How can this problem be solved?
>
> 殷力殷 YinLiYin
>
> 上海市浦东新区碧波路889号中兴研发大楼D502
> D502, ZTE Corporation R&D Center, 889# Bibo Road,
> Zhangjiang Hi-tech Park, Shanghai, P.R.China, 201203
> T: +86 21 68896229
> M: +86 13641895907
> E: yinli...@zte.com.cn
> www.zte.com.cn
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
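
P.S. A rough sketch, in Python, of the metadata idea from the top of this message (metadata dict on the alarm, update event on change, metadata usable in a scenario condition and settable in a scenario action). Everything in it is hypothetical: `Alarm`, `UpdateEvent`, `Scenario`, `set_metadata` and `evaluate` are illustrative names, not Vitrage classes or template syntax, and the 'suspect' / 'datasource' keys are just the two properties discussed in this thread. It only shows how the pieces could fit together.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Alarm:
    """Hypothetical ALARM entity (not the real Vitrage vertex)."""
    name: str
    resource: str
    # Free-form properties such as 'datasource' or 'suspect'.
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class UpdateEvent:
    """Hypothetical event emitted when an alarm's metadata changes."""
    alarm: Alarm
    key: str
    old: Any
    new: Any


def set_metadata(alarm: Alarm, key: str, value: Any) -> Optional[UpdateEvent]:
    """Set one metadata key; return an update event only on a real change."""
    if key in alarm.metadata and alarm.metadata[key] == value:
        return None
    old = alarm.metadata.get(key)
    alarm.metadata[key] = value
    return UpdateEvent(alarm, key, old, value)


@dataclass
class Scenario:
    """Hypothetical scenario: a condition and an action over one alarm."""
    condition: Callable[[Alarm], bool]
    action: Callable[[Alarm], Optional[UpdateEvent]]


def evaluate(scenarios: List[Scenario], alarm: Alarm) -> List[UpdateEvent]:
    """Run every scenario whose condition matches; collect update events."""
    events = []
    for scenario in scenarios:
        if scenario.condition(alarm):
            event = scenario.action(alarm)
            if event is not None:
                events.append(event)
    return events


# A deduced alarm starts as a 'suspect' raised by the evaluator.
alarm = Alarm("host_down", "host-1", {"datasource": "vitrage", "suspect": True})

# Once a monitor confirms the same alarm, clear the 'suspect' flag.
confirm = Scenario(
    condition=lambda a: a.metadata.get("datasource") == "nagios"
    and a.metadata.get("suspect") is True,
    action=lambda a: set_metadata(a, "suspect", False),
)

set_metadata(alarm, "datasource", "nagios")  # the monitor reports the alarm
events = evaluate([confirm], alarm)          # the scenario reacts to the change
```

In this sketch there is still a single alarm vertex after the monitor reports it: the report only changes metadata, and the scenario reacts to that change instead of needing a second alarm entity.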