On 24 Jul 2009, at 06:38, M. selcuk karaca wrote:
Thanks for your insights.. I recognize your tremendous experience and
knowledge..
What we know makes our world rich but also it puts limits in our
minds..
Fair enough (I'm always open to a bit of flattery) :)
I'll try and keep an open mind, so my questions here are
investigative, to see how you would solve this problem.
I know End-to-End (E2E) testing is invaluable.. You can easily see the
business side and get SLAs..
But there is 1 drawback:
E2E tests can only tell you that a service is unavailable or slow. They do
not reveal which component is causing this.. !
With a service tree this is child's play.
I think the requirements you are asking for are:
* an overall view of a business service, with a health indication
based on some complex logic to determine the "actual" state
* a click on this business service to then show the state of the
clusters involved in this business service (web servers, app servers,
db servers, 3rd party)
* a click on the clusters (or all) to see all the components
(individual services)
* (I assume 3 levels of hierarchy is sufficient at the moment)
* maybe in addition, a top level view of all "business services"
So this would satisfy your customers (who care about the top level
state of the business service), and satisfy your technical support
staff (who want to see the state of all the components that make up
the business service and not other unrelated "things").
Is that right?
Assuming this is correct, how would you configure this? I can see
you'd need to find:
* a way of grouping the services together (component level)
* a way of representing the summarised component level (perhaps
some logic about clustering)
* a way of grouping these summarised components together (cluster
level)
* a name for these summarised components (business level)
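To make the three levels concrete, here is a rough sketch of how the roll-up
could work. It is only an illustration, not how Opsview or Nagios does it:
the class names, the min_ok threshold and the "shop" example are all
hypothetical, and only the numeric states follow the usual Nagios plugin
return codes (UNKNOWN is left out for simplicity).

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Nagios-style plugin return codes (UNKNOWN = 3 is ignored in this sketch)
OK, WARNING, CRITICAL = 0, 1, 2


@dataclass
class Cluster:
    """Cluster level: summarises a group of component-level services."""
    name: str
    components: Dict[str, int]  # component name -> state
    min_ok: int = 1             # hypothetical: members needed for the cluster to still work

    def state(self) -> int:
        ok = sum(1 for s in self.components.values() if s == OK)
        if ok < self.min_ok:
            return CRITICAL     # too few healthy members left
        if ok < len(self.components):
            return WARNING      # degraded, but still serving
        return OK


@dataclass
class BusinessService:
    """Business level: a named group of summarised clusters."""
    name: str
    clusters: List[Cluster] = field(default_factory=list)

    def state(self) -> int:
        # Assumed rule: the business service is as bad as its worst cluster
        return max((c.state() for c in self.clusters), default=OK)

    def failing_components(self) -> List[str]:
        # The drill-down: which individual components are not OK
        return [f"{c.name}/{n}"
                for c in self.clusters
                for n, s in c.components.items() if s != OK]


shop = BusinessService("shop", [
    Cluster("web", {"web1": OK, "web2": CRITICAL}, min_ok=1),
    Cluster("db", {"db1": OK}, min_ok=1),
])
print(shop.state(), shop.failing_components())
```

In this sketch the customer-facing view only needs shop.state(), while the
support staff would use failing_components() to find the culprit.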
Where and how would you add these into the current Opsview interface?
What changes are required at the Nagios level? What kind of
authorisation would be needed for the components, cluster and business
level views? How do you report on these? Would this work in a
distributed environment?
(BTW, those are questions I have to answer for every new piece of
functionality!)
Ton
_______________________________________________
Opsview-users mailing list
[email protected]
http://lists.opsview.org/lists/listinfo/opsview-users