-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John P. Rouillard wrote:
| In message <[EMAIL PROTECTED]>,
| Hugo van der Kooij writes:
|> I have events sets that consists of 4 lines that will be written over
|> the space of several seconds (at this moment about 45 to 60 seconds).
|> They consist of 4 events and I want to define 1 action on the 4th event.
|>
|> For example part of the data:
|>
|> VERIFY 194.151.25.153 1200237982 START 1200237982
|> VERIFY 194.151.25.153 1200237982 CONNECTED 1200237982
|> VERIFY 194.151.25.153 1200237982 SEND 1200238013
|> VERIFY 194.151.25.153 1200237982 RECEIVED 1200238035
|
|> VERIFY 194.151.25.153 1200238201 START 1200238201
|> VERIFY 194.151.25.153 1200238201 CONNECTED 1200238201
|> VERIFY 194.151.25.153 1200238201 SEND 1200238232
|> VERIFY 194.151.25.153 1200238201 RECEIVED 1200238248
|
|> VERIFY 194.151.25.153 1200238501 START 1200238501
|> VERIFY 194.151.25.153 1200238501 CONNECTED 1200238501
|> VERIFY 194.151.25.153 1200238501 SEND 1200238532
|> VERIFY 194.151.25.153 1200238501 RECEIVED 1200238550
|>
|> The 3rd and 5th column are unixtime.
|>
|> To make matters more complicated I might get intersected events from
|> other hosts in here as well. So then the logs will look much more
|> complicated.
|>
|> Each event set is in fact unique and one can think of the IP address
|> (2nd column) and the ID (3rd column) as the keys to find unique sets.
|
| Good info to have; we will use it below.
|
|> I want to learn on each RECEIVED event the time difference between the
|> CONNECTED and SEND events and the SEND and RECEIVED events.
|>
|> I also want to know if there was no RECEIVED event after a CONNECTED
|> event within N seconds (N being about 2 to 5 minutes).
|>
|> Some thing like: Host 194.151.25.153 took 32 seconds to receive and 28
|> seconds to process event 1200238501
|
| I will leave producing the exact output you want to the reader, but a
| ruleset that matches your data could look something like:
|
|   # rule 1
|   type=single
|   desc= detect start of sequence for host $1 id $2
|   ptype=regexp
|   pattern=VERIFY ([0-9.]*) ([0-9]*) START ([0-9]*)
|   rem= may need to set a max time on this context
|   action=create context_$1_$2; add context_$1_$2 $0
|
|   # rule 2
|   type=single
|   desc= detect connected part of sequence for host $1 id $2
|   ptype=regexp
|   pattern=VERIFY ([0-9.]*) ([0-9]*) CONNECTED ([0-9]*)
|   rem= match this rule only if a START was seen prior
|   rem= otherwise something is wrong and we need to punt.
|   context= context_$1_$2
|   rem = store connected time in a context for later use
|   action=add context_$1_$2 $0; add context_connected_time_$1_$2 $3
|
|   # rule 3
|   type=single
|   cont=takenext
|   desc= detect send part of sequence for host $1 id $2
|   ptype=regexp
|   pattern=VERIFY ([0-9.]*) ([0-9]*) SEND ([0-9]*)
|   rem= match this rule only if a START was seen
|   context= context_$1_$2
|   action=copy context_connected_time_$1_$2 %{connectedtime}; \
|        eval %{diff} ($3 - %{connectedtime}); \
|        add context_$1_$2 $0; \
|        add context_$1_$2 CONNECTED to SEND time %{diff} seconds; \
|        add context_connected_time_$1_$2 $3
|
|   # rule 4
|   type=pairwithwindow
|   desc = detect RECEIVED for host $1 id $2 or detect missing received
|   context= context_$1_$2
|   ptype=regexp
|   pattern=VERIFY ([0-9.]*) ([0-9]*) SEND ([0-9]*)
|   context2= context_$1_$2
|   ptype2=regexp
|   pattern2=VERIFY ([0-9.]*) ([0-9]*) RECEIVED ([0-9]*)
|   rem = report missing received after 3 minutes (180 seconds)
|   window =  180
|   rem = no RECEIVED event
|   action= report context_$1_$2 mail -s "failed to get a RECEIVED entry for host $1 id $2. Log is"
|   rem=received is found; $3 is the time extracted using "pattern2",
|   rem=%3 is the time extracted from the send line using "pattern"
|   action2=eval %{diff} ($3 - %3); \
|        add context_$1_$2 $0; \
|        add context_$1_$2 SEND to RECEIVED time %{diff} seconds; \
|        report context_$1_$2 mail -s "Report on host $1 id $2 connection"
|
|
| may work (untested). I usually get the algorithm in the right ballpark,
| but get the exact syntax wrong. It should produce output like:
|
|   VERIFY 194.151.25.153 1200238501 START 1200238501
|   VERIFY 194.151.25.153 1200238501 CONNECTED 1200238501
|   VERIFY 194.151.25.153 1200238501 SEND 1200238532
|   CONNECTED to SEND time 31 seconds
|   VERIFY 194.151.25.153 1200238501 RECEIVED 1200238550
|   SEND to RECEIVED time 18 seconds
|
| when the RECEIVED line is seen, and the same output without the last
| two lines when the RECEIVED line is missing.
|
| This is sort of a showcase and uses two different methods of
| extracting the times you want to subtract. In production I would
| probably use the pairwithwindow mechanism for doing the math.
|
| Look at rule 4, it matches the two lines that have data you want to
| subtract. The pattern regular expression matches the SEND line and
| extracts the timestamp into the subexpression $3. Then pattern2
| matches the RECEIVED line if it occurs within window seconds (3
| minutes/180 seconds). Pattern2's third subexpression also extracts the
| timestamp, but of the received line, and puts it into $3. Because
| pattern2 has subexpressions, the original value of $3 is assigned to
| the variable %3. Then we use an eval and a perl expression to do the
| math. Then some add actions put the data into the context so that
| we can report it. Then finally do the report.
|
| To calculate the CONNECTED-SEND interval, rule 2 captures the
| timestamp into $3. Create a context that holds this value for later
| use by rule 3.
|
| In rule 3, we extract the timestamp for the CONNECTED event into a
| variable %{connectedtime} and use eval to do the math taking the time
| of the SEND event from $3. Then add info into the context. Also note
| the value of cont. It must be takenext so that the send event that
| triggers the single rule is passed to the pair with window rule to
| trigger it.
|
| In rule 1, we create the context_$1_$2 using the hostname and id to
| make the context unique. This way you can have multiple outstanding
| event streams and they will all stay separate.
|
| Also note that the description for all 4 rules includes the hostname
| and ID. For single rules, it doesn't matter that much, but for rules
| that start a correlation operation (pairwithwindow, pair, *threshold*)
| a unique description is required to differentiate between the chains
| of events.
|
| E.G. if I had the event stream:
|
| ...
|   VERIFY 194.151.25.153 1200238501 SEND 1200238532 (1)
|   VERIFY 194.151.25.156 1200238503 SEND 1200238532 (2)
|   VERIFY 194.151.25.156 1200238503 RECEIVED 1200238550 (3)
|
| and rule #4's description was:
|
|   desc = detect RECEIVED or detect missing received
|
| you would get a single correlation started for event 1. Event 2 would
| be ignored because it triggers the same correlation (since there is
| no host/id in the description, event 2 would create exactly the same
| named correlation). Then event 3 comes along and is matched by pattern2
| in rule 4 and ends rule 4. Of course it ends incorrectly, since event 3
| belongs with event 2 and not with event 1, and the time calculation
| will be wrong.
|
| I am sure Risto or some of the other readers will come up with other
| ways to do this.
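
For readers who do not run SEC: the keying idea above (host plus ID identifies one event set, so interleaved hosts stay apart) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not part of either ruleset; the names LINE and intervals are made up for this sketch, and the input is the third sample set interleaved with the two 194.151.25.156 lines from the example above.

```python
import re
from collections import defaultdict

# Same shape as the SEC patterns: host, id, stage name, unix timestamp.
LINE = re.compile(r"VERIFY ([0-9.]+) (\d+) (START|CONNECTED|SEND|RECEIVED) (\d+)")

def intervals(lines):
    """Return {(host, id): (connected_to_send, send_to_received)} in seconds.

    Only sets for which CONNECTED, SEND and RECEIVED were all seen are reported.
    """
    stamps = defaultdict(dict)
    for line in lines:
        m = LINE.search(line)
        if m is None:
            continue  # not a VERIFY line
        host, set_id, stage, when = m.groups()
        # The (host, id) pair plays the role of context_$1_$2.
        stamps[(host, set_id)][stage] = int(when)

    result = {}
    for key, seen in stamps.items():
        if {"CONNECTED", "SEND", "RECEIVED"} <= seen.keys():
            result[key] = (seen["SEND"] - seen["CONNECTED"],
                           seen["RECEIVED"] - seen["SEND"])
    return result

log = [
    "VERIFY 194.151.25.153 1200238501 START 1200238501",
    "VERIFY 194.151.25.153 1200238501 CONNECTED 1200238501",
    "VERIFY 194.151.25.153 1200238501 SEND 1200238532",
    "VERIFY 194.151.25.156 1200238503 SEND 1200238532",
    "VERIFY 194.151.25.156 1200238503 RECEIVED 1200238550",
    "VERIFY 194.151.25.153 1200238501 RECEIVED 1200238550",
]
report = intervals(log)
```

Because every lookup goes through the (host, id) pair, the 194.151.25.156 lines never touch the timestamps of the 194.151.25.153 set, which is the same job the host and ID do in the context names and descriptions.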

I basically studied what you tried to do and rewrote the lot to suit my
needs with Nagios. And this is what it looks like:

# Rule 1:
#       VERIFY <IP> <ID> START <timestamp>
#       (ID = timestamp)

type = Single
continue = TakeNext
desc = Detect start of test $2 for host $1
ptype = RegExp
pattern = VERIFY ([0-9.]*) (\d*) START (\d*)
action = \
        create context_$1_$2 900; \
        add context_$1_$2 START $3


# Rule 2:
#       VERIFY <IP> <ID> CONNECTED <timestamp>

#type = Single
type = PairWithWindow
window = 30
continue = TakeNext
continue2 = TakeNext
desc = [$3] PROCESS_SERVICE_CHECK_RESULT;$1;email-loopback;2;Failed to establish the connection
desc2 = Wait for rule 3
context = context_$1_$2
ptype = RegExp
ptype2 = RegExp
pattern = VERIFY ([0-9.]*) (\d*) START (\d*)
pattern2 = VERIFY ([0-9.]*) (\d*) CONNECTED (\d*)
action = \
        write /var/log/nagios/rw/nagios.cmd %s; \
        delete context_$1_$2;
action2 = \
        add context_$1_$2 CONNECTED $3; \
        create context_connected_$1_$2 900; \
        add context_connected_$1_$2 $3


# Rule 3:
#       VERIFY <IP> <ID> CONNECTED <timestamp>
#       VERIFY <IP> <ID> SEND <timestamp>

#type = Single
type = PairWithWindow
window = 90
continue = TakeNext
continue2 = TakeNext
desc = [$3] PROCESS_SERVICE_CHECK_RESULT;$1;email-loopback;2;Failed to transmit loopback message after the connection was established
desc2 = Wait for rule 4
context = context_$1_$2
ptype = RegExp
ptype2 = RegExp
pattern = VERIFY ([0-9.]*) (\d*) CONNECTED (\d*)
pattern2 = VERIFY ([0-9.]*) (\d*) SEND (\d*)
action = \
        write /var/log/nagios/rw/nagios.cmd %s; \
        delete context_$1_$2; \
        delete context_connected_$1_$2;
action2 = \
        copy context_connected_$1_$2 %a; \
        eval %b ($3 - %a); \
        add context_$1_$2 SEND $3 time take = %b seconds; \
        create context_send_$1_$2 900; \
        add context_send_$1_$2 $


# Rule 4:
#       VERIFY <IP> <ID> SEND <timestamp>
#       VERIFY <IP> <ID> RECEIVED <timestamp>

type = PairWithWindow
window = 180
desc = [$3] PROCESS_SERVICE_CHECK_RESULT;$1;email-loopback;2;Failed to receive loopback message after the message was sent successfully
desc2 = [$3] PROCESS_SERVICE_CHECK_RESULT;$1;email-loopback;0;
context = context_$1_$2
ptype = RegExp
ptype2 = RegExp
pattern = VERIFY ([0-9.]*) (\d*) SEND (\d*)
pattern2 = VERIFY ([0-9.]*) (\d*) RECEIVED (\d*)
action = \
        write /var/log/nagios/rw/nagios.cmd %s; \
        delete context_$1_$2; \
        delete context_connected_$1_$2; \
        delete context_send_$1_$2;
action2 = \
        copy context_connected_$1_$2 %a; \
        copy context_send_$1_$2 %b; \
        eval %d (%b - %a); \
        eval %e ($3 - %b); \
        write /var/log/nagios/rw/nagios.cmd %s Roundtrip time is %e second(s); \
        delete context_$1_$2; \
        delete context_connected_$1_$2; \
        delete context_send_$1_$2;


That is how I got it solved. If any step is not taken in time then it
will trigger an alert and delete everything I created as far as context
settings are concerned.
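
The staged check can also be written down outside SEC, which may help when reasoning about the windows. The following is a rough Python sketch, not what SEC does internally: WINDOWS and check_sequence are names invented for this illustration, the window values are the ones from rules 2, 3 and 4, and the sketch compares the timestamps found in the log lines, whereas SEC measures its windows by the time the events arrive. The states 2 and 0 are the ones used in the Nagios result strings above.

```python
# (earlier stage, later stage, window in seconds), as in rules 2, 3 and 4.
WINDOWS = [
    ("START", "CONNECTED", 30),
    ("CONNECTED", "SEND", 90),
    ("SEND", "RECEIVED", 180),
]

def check_sequence(stamps):
    """Check one event set.

    stamps maps a stage name to its unix timestamp for a single (host, id).
    Returns (state, message): state 2 on the first missed step, 0 if all
    steps arrived within their windows.
    """
    for first, second, window in WINDOWS:
        if first not in stamps:
            return 2, f"{first} was never seen"
        if second not in stamps or stamps[second] - stamps[first] > window:
            return 2, f"no {second} within {window} seconds of {first}"
    roundtrip = stamps["RECEIVED"] - stamps["SEND"]
    return 0, f"Roundtrip time is {roundtrip} second(s)"

# Third sample set from the original question.
complete = {
    "START": 1200238501,
    "CONNECTED": 1200238501,
    "SEND": 1200238532,
    "RECEIVED": 1200238550,
}
# Same set with the RECEIVED line missing.
stalled = {k: v for k, v in complete.items() if k != "RECEIVED"}

ok_result = check_sequence(complete)
failed_result = check_sequence(stalled)
```

As in the rules, the first step that misses its window decides the outcome and later steps are not looked at.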

Hugo.

- --
[EMAIL PROTECTED]               http://hugo.vanderkooij.org/
PGP/GPG? Use: http://hugo.vanderkooij.org/0x58F19981.asc

        A: Yes.
        >Q: Are you sure?
        >>A: Because it reverses the logical flow of conversation.
        >>>Q: Why is top posting frowned upon?

Bored? Click on http://spamornot.org/ and rate those images.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFHk7PEBvzDRVjxmYERAsEgAJ0VH1NVFdkCnD5qPudyuUmcfKSqYACcCdAk
cvwHAx4v0vi36TvO4uiXDKA=
=JURc
-----END PGP SIGNATURE-----

Simple-evcorr-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
