Check ITIL documentation for the proper definition of incident, problem, etc.

In regard to "what is important"... that is a harder question.  It depends on 
your device mix, redundancy levels, etc.

I would approach it in this manner:
1.  Is all the noise out of the environment?  Do I have flapping (i.e. up/down) 
interfaces?  How many/how much?
2.  Do I have a lot of duplicate-events (after defining what a duplicate is)?  
How many/how much?
3.  Can I abstract any events out of a given number of like events (poor man's 
correlation)? How many/how much?
4.  Are there any correlation scenarios that I can determine the patterns for?  
How many/how much?
5.  Once you figure out the types of event reduction, abstraction, and 
correlation you have, determine the total number of events you can remove 
from the operations workload, assign a man-hour value per event (e.g. 10 
minutes), and calculate the total man-hours you can save by reducing events 
in each category.  This can be equated to a dollar value.
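
To make step 5 concrete, here is a rough sketch of the arithmetic in Python.
The event counts, the category names, and the hourly rate are all made-up
placeholders (only the 10 minutes per event comes from the text above), so
plug in figures from your own logs and your own labor cost.

```python
# Hypothetical weekly counts of events each technique would keep away from
# operators. These numbers are invented for illustration only.
reducible_events = {
    "flapping interfaces": 1200,
    "duplicate events": 3000,
    "abstracted like events": 800,
    "correlated patterns": 500,
}

MINUTES_PER_EVENT = 10    # handling time per event, as in the example above
HOURLY_RATE_USD = 50.0    # assumed loaded labor cost, purely illustrative


def savings(event_counts, minutes_per_event, hourly_rate):
    """Return {category: (man_hours, dollars)} for each event category."""
    result = {}
    for category, count in event_counts.items():
        hours = count * minutes_per_event / 60
        result[category] = (hours, hours * hourly_rate)
    return result


def totals(per_category):
    """Sum man-hours and dollars across all categories."""
    total_hours = sum(hours for hours, _ in per_category.values())
    total_dollars = sum(dollars for _, dollars in per_category.values())
    return total_hours, total_dollars


per_category = savings(reducible_events, MINUTES_PER_EVENT, HOURLY_RATE_USD)
for category, (hours, dollars) in per_category.items():
    print(f"{category:25s} {hours:8.1f} h  ${dollars:10.2f}")

total_hours, total_dollars = totals(per_category)
print(f"{'TOTAL':25s} {total_hours:8.1f} h  ${total_dollars:10.2f}")
```

Keeping the figures per category lets you show which kind of reduction pays
off most, rather than presenting a single lump sum.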

Bottom-line... managers like to save money.  Give them a real (or real enough) 
dollar-savings figure you can wave around.

The catch is that you will have to go through the event logs yourself, 
looking for patterns, but that shouldn't be that hard.
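
As a starting point for that log review, something like the following Python
sketch could count duplicates and flapping interfaces.  It assumes the events
are already parsed into (timestamp, device, interface, state) tuples; that
format, the sample data, the function names, and the 60- and 300-second
windows are my own assumptions, and "duplicate" here means "same state as the
previous event on that interface within the window", which you may well
define differently.

```python
from collections import Counter

# Hypothetical, already-parsed events: (epoch_seconds, device, interface, state).
events = [
    (0, "rtr1", "ge-0/0/1", "down"),
    (20, "rtr1", "ge-0/0/1", "up"),
    (40, "rtr1", "ge-0/0/1", "down"),
    (45, "rtr1", "ge-0/0/1", "down"),   # repeats the previous state
    (70, "rtr1", "ge-0/0/1", "up"),
    (100, "sw2", "Gi1/0/3", "down"),
    (4000, "sw2", "Gi1/0/3", "up"),     # a long outage, not a bounce
]


def count_duplicates(events, window=60):
    """Count events that repeat the previous state of the same interface
    within `window` seconds."""
    last = {}            # (device, interface) -> (timestamp, state)
    dupes = Counter()
    for ts, dev, ifc, state in sorted(events):
        prev = last.get((dev, ifc))
        if prev is not None and prev[1] == state and ts - prev[0] <= window:
            dupes[(dev, ifc)] += 1
        last[(dev, ifc)] = (ts, state)
    return dupes


def count_bounces(events, window=300):
    """Count down->up cycles per interface where the interface came back up
    within `window` seconds of first going down."""
    down_at = {}         # (device, interface) -> timestamp of first "down"
    bounces = Counter()
    for ts, dev, ifc, state in sorted(events):
        key = (dev, ifc)
        if state == "down":
            down_at.setdefault(key, ts)   # keep the first down of the cycle
        elif state == "up" and key in down_at:
            if ts - down_at.pop(key) <= window:
                bounces[key] += 1
    return bounces


def flapping_interfaces(events, min_bounces=2, window=300):
    """Interfaces with at least `min_bounces` short down/up cycles."""
    bounces = count_bounces(events, window)
    return sorted(k for k, n in bounces.items() if n >= min_bounces)


print("duplicates:", dict(count_duplicates(events)))
print("bounces:   ", dict(count_bounces(events)))
print("flapping:  ", flapping_interfaces(events))
```

The per-interface counts this produces are the "how many/how much" numbers
asked for in steps 1 and 2.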

After I showed success in those areas, then I'd tackle service/business 
correlation -- but not before.

That's how I'd start (i.e. start with the thing managers understand -- saving 
money !!!)

Gary Boyles, Intel


-----Original Message-----
From: Tim Peiffer [mailto:peif...@umn.edu]
Sent: Monday, September 10, 2012 6:03 PM
To: simple-evcorr-users@lists.sourceforge.net
Subject: [Simple-evcorr-users] articulating the need for discussion of what is 
important.

I am having some issues trying to get buy-in for event correlator
operations.  I want to engage systems owners and operators in such a way
that they define what is important, what is work and what is an incident
or a trouble ticket.  I want to have them articulate how to combine log
events to provide for a systems based business intelligence.  I am a
firm believer, but I need to articulate something to convince people
that would otherwise not be involved, because they think this is the job
for the network management platform.

Down events followed by up events are OK by themselves, but multiple
cycles or bounces indicate an issue that just isn't going away.

Does anyone know of a document that discusses event correlation in
general, and particularly in how to look at logs to determine what is
important, and how things group together to provide effective event
handling and consolidation?  So how does one articulate the need to
define what is an important notice, what is work, and what defines an
incident?

--
Tim Peiffer
Network Support Engineer
Office of Information Technology
University of Minnesota/NorthernLights GigaPOP

+1 612 626-7884 (desk)


------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Simple-evcorr-users mailing list
Simple-evcorr-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users

