On Thu, Jan 26 2017, Afek, Ifat (Nokia - IL) wrote:

> I’ll try to answer your question from a user perspective.

Thanks for your explanation, it helped me a lot to understand how you view things. :)

> Suppose a bridge has a bond of two physical ports, and Zabbix detects a signal
> loss in one of them. This failure has no immediate effect on the host,
> instances or applications, and will not be reflected anywhere in OpenStack.
>
> Vitrage will receive an alarm from Zabbix, identify the instances that will be
> affected if the entire bond fails, and create deduced alarms that they are at
> risk (if the other port fails they will become unreachable). Similarly, it will
> create alarms on the relevant applications.

So when you say "create deduced alarms"… What does it mean? I understand the deduction, but I am not sure what it "creates", because then you say:

> A user that checks Aodh will see that all alarms are in ‘ok’ state, which might
> be misleading.

Which alarms? Could you be more precise? Where do these alarms come from? Are they created by the users or by Vitrage automatically? If it's an alarm on the CPU usage of an instance, there's no reason for it to become red. If I recall correctly what you explained to me a while back, there are alarms created by Vitrage based on some rules, so I imagine these are the ones you are talking about?

> The user might determine that everything is ok with the instances that
> Aodh is monitoring. If the user then checks Vitrage, he will see the
> deduced alarms and understand that the instances and the applications
> are at risk.

From what I understood, the user can't really check Vitrage (IIRC it does not really have a full API for users yet), right?

> Does it make sense that the user will check Aodh *and* Vitrage? A standard user
> would like to see all of the alarms in one place, no matter which monitor was
> responsible for triggering them.

Yes: it does make sense for the user to check both because of the way Aodh+Vitrage are architected right now. Does it make sense in terms of user experience? I think we both agree that no, it does not.

Having a central place for alerting would be awesome. But does it make sense to force-feed Vitrage alarms and its data model into Aodh? I am not sure right now. If I circle back again to UX, when a user queries Aodh, he only sees the alarms he created and manages. With generic alarms, the way it's pushed right now, there's going to be a bunch of generic things the user has barely any clue about and that can do things he has no idea about, because he can't really do anything on Vitrage. And even if Vitrage had an API to manipulate the rules and all (I can easily imagine it's on the roadmap), that means he would manipulate deduction rules on the Vitrage API and then see things magically happen in his Aodh account. I find that… weird. It sounds very prone to failure and to Aodh and Vitrage getting out of sync.

Let's imagine another scenario/solution (which I am *not* advocating, it's just a thought exercise): Vitrage would store its alarms (defined and created based on its rules) in a database. It would then offer access to it to Aodh (e.g. via an HTTP API). Then Aodh could query it. For example, when a user asks Aodh to list the alarms, Aodh would return the alarms that are stored in its own database (created by the user) and would also query Vitrage to return the list of alarms created by Vitrage rules (and their deduced state).

What's the point of such a design? Well, it's less prone to getting out of sync and does not force on Aodh any data model that it has no use for. It also solves the problem of "having a central listing of alarms" for the user: the user does not have to be aware of Vitrage.

Is it a good technical design? Probably not. It seems weird to make Aodh a bridge to Vitrage. And I think that's the whole thing I don't like about both the current proposal and the one I just invented. The way Aodh and Vitrage are bridged, the way Vitrage is built on top and outside of Aodh right now feels wobbly to me.
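
To make the imagined scenario above a bit more concrete, here is a minimal Python sketch of Aodh merging its own alarms with ones fetched from Vitrage at query time. Everything in it is hypothetical: the `Alarm` type, `list_alarms`, and the fake client standing in for an HTTP call are illustrations only, not existing Aodh or Vitrage APIs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Alarm:
    name: str
    state: str   # e.g. "ok" or "alarm"
    source: str  # "aodh" for user-created alarms, "vitrage" for deduced ones


def list_alarms(
    own_alarms: Iterable[Alarm],
    fetch_vitrage_alarms: Callable[[], Iterable[Alarm]],
) -> List[Alarm]:
    """Return the user's own alarms followed by the ones deduced by Vitrage.

    The Vitrage alarms are fetched on every call and never stored on the
    Aodh side, so there is no second copy that could drift out of sync.
    """
    return list(own_alarms) + list(fetch_vitrage_alarms())


def fake_vitrage_client() -> List[Alarm]:
    # Stand-in for an HTTP request to a hypothetical Vitrage alarm endpoint.
    return [Alarm("instance-1 at risk: bond degraded", "alarm", "vitrage")]


# Alarms the user created directly in Aodh (hypothetical data).
aodh_db = [Alarm("cpu_high on instance-1", "ok", "aodh")]

merged = list_alarms(aodh_db, fake_vitrage_client)
for alarm in merged:
    print(f"[{alarm.source}] {alarm.name}: {alarm.state}")
```

In this sketch the user sees a single listing, while the deduced alarms remain owned by Vitrage. It also shows the drawback mentioned above: Aodh ends up acting as a bridge to Vitrage.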

So here's another question then: why wouldn't there be a "zabbix" alarm type in Aodh that could be created by a user (or another program) and that would be triggered by Aodh when Zabbix does something? That is really similar to the event alarm mechanism which already exists. Maybe all that's missing is a Zabbix-to-OpenStack-notification converter to have that feature?

I'll stop here for now to let you reply, or my mail is going to be way too long lol.

> And a side note – you said that Aodh and Zabbix are exactly the same. I agree.
> You can implement in Aodh everything that is implemented in Zabbix. But why do
> that instead of just using that alarms that are already created by another
> monitor?

Oh, there's no point in doing that; I was just making sure we were on the same page in terms of understanding, and it seems we are. :)

> Well… is this awesome enough? ;-)

Yes thanks, I think this is a good example that will help us think, in terms of UX, about what we want to build and how we want to build it.

-- 
Julien Danjou
# Free Software hacker
# https://julien.danjou.info
__________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev