Michael Richards wrote:

I've been reviewing the mmnormalize module. I am uncertain that it is
intended to do what I need, so I'll try to show some concrete examples and
answer some of your questions.

Why do you think that mmnormalize will not work?

- pmcisco looks like it could have handled one use case, but I have the
impression rsyslog has evolved since its inception. As a parser, it also
matches cisco ASA logs which must not be modified due to downstream legacy
support. This explains why I would like to control processing by IP (or
subnet).

makes sense

- some logs need to be processed differently due to the sender, as some
projects have named all the devices the same and they need to be separated
by the device serial number found elsewhere in the entry.

The approach I took looks somewhat like this:

set $.logsource = lookup('source_ip_list', $fromhost-ip);
if ($.logsource == 'cisco-sitea-switch') then {
  # doesn't work: call CiscoReparse
  action(type="mmnormalize" rulebase="/path/to/cisco-rewrite.rulebase")
  call ForwardToSIEM
  stop
}
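
For context, the lookup() call in the snippet above needs a table definition and a JSON file behind it. The following is only a sketch: the file path, the IP addresses and the second table value are placeholders invented for illustration, and only the 'cisco-sitea-switch' value and the table name come from the snippet.

```
# Table definition; the file path is a placeholder.
lookup_table(name="source_ip_list" file="/path/to/source_ip_list.json" reloadOnHUP="on")

# Assumed contents of the JSON file, shown here as comments.
# "nomatch" is the value returned for any IP not listed in the table.
# {
#   "version": 1,
#   "nomatch": "unknown",
#   "type": "string",
#   "table": [
#     { "index": "192.0.2.10", "value": "cisco-sitea-switch" },
#     { "index": "192.0.2.11", "value": "cisco-asa" }
#   ]
# }
```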

The workflow I would suggest is:

1. mmnormalize to parse the message into $! variables
2. set $.logsource
3. a series of if/then/else statements to manipulate various $. variables
4. output via a custom template that uses the $! and $. variables that you
   have been manipulating

Parsing is not supposed to change the original message (as I said, pmcisco is a hack in that it does so). Parsing is just supposed to populate variables. It's the template system that creates the output, and templates can reference any variables, the built-in ones or the ones you create.
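
The suggested workflow might look roughly like the sketch below. It is not a tested configuration: the field names $!origin, $!eventcat, $!script and $!rest are hypothetical and depend entirely on what the rulebase extracts, $.device is an invented local variable, the forwarding target is a placeholder, and the source_ip_list lookup table is assumed to be defined elsewhere.

```
module(load="mmnormalize")

# Output template built from message properties plus $! and $. variables.
# All $!... names are hypothetical; use whatever your rulebase produces.
template(name="CiscoRewritten" type="list") {
    constant(value="<")
    property(name="pri")
    constant(value=">1 ")
    property(name="timereported" dateFormat="rfc3339")
    constant(value=" ")
    property(name="$.device")
    constant(value=" - - eventcat=\"")
    property(name="$!eventcat")
    constant(value="\" script=\"")
    property(name="$!script")
    constant(value="\" ")
    property(name="$!rest")
    constant(value="\n")
}

# 1. Parse into $! variables; the original message is left untouched.
action(type="mmnormalize" rulebase="/path/to/cisco-rewrite.rulebase")

# 2. Classify the sender.
set $.logsource = lookup("source_ip_list", $fromhost-ip);

# 3. Adjust local variables depending on the source.
if ($.logsource == "cisco-sitea-switch") then {
    set $.device = $!origin;
} else {
    set $.device = $hostname;
}

# 4. Let the template create the output (placeholder target).
action(type="omfwd" target="siem.example.com" port="514" protocol="tcp"
       template="CiscoRewritten")
```

The point of the split is that nothing rewrites the message itself: the parse step only fills in variables, and the template decides what is sent.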


A sample log looks like this:
<123>12345: switch-floor12-zone34: Jul 22 08:09:10 EDT: %HA_EM-5-LOG: record_mac_changes: action="delete" interface="TenGigEthernet/1/2/3" mac_address="0123.4567.abcd"

I'd like it to be altered to look somewhat like this:
<123>1 2025-07-22T08:09:10.005Z switch-floor12-zone34 - - eventcat="HA_EM-5-LOG" script="record_mac_changes" action="delete" interface="TenGigEthernet/1/2/3" mac_address="0123.4567.abcd"

I could most certainly whip up some regular expressions to beat these into
submission, but details matter, thanks to Cisco's omission of the year!

mmnormalize is MUCH faster than regex matching

Anyway, I hope this describes what I'm up to, and I'm sure you can easily
point out if the mmnormalize strategy will work.

Sorry for such a quick, short answer that doesn't go into details. Hopefully this helps, and if not I can try to go into details later.

David Lang

thanks

Michael

On Tue, Jul 22, 2025 at 3:25 PM David Lang wrote:

Michael Richards wrote:

In my case, I had a lookup table to determine the log type based on a JSON
list of 500 or so different sending IPs. Although parsed by the default
chain, the work has to be largely redone to get valid properties.

In this case, I think mmnormalize is the way to go.

The default parsing is fast, and mmnormalize is fast, so you should not have
that much of a problem.

200G of logs a day is an average of ~10k logs/sec (yes, peak will be higher).
This is not extreme for syslog. It does get up into the range where you will
want to watch performance, but mmnormalize reparsing it should not be an
issue.



mmnormalize is very efficient with lots of parsing rules. You don't need to
identify the type of message and invoke the appropriate parser; you can put
all the rules for all parsing in one rulebase (assuming the rules are
specific enough to not have two that would parse the same message). At one
prior job, I had a parsing rulebase of 1400 patterns, and the difference in
performance between a message that matched the first rule in the rulebase
and one that matched the last rule in the rulebase was only 30%. (I did this
testing after seeing something where a 30 line regex based ruleset was
several thousand % slower on the last line than the first line.)
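
As a rough sketch of the single-rulebase approach: one mmnormalize action handles every log type, and the result is checked afterwards through the $parsesuccess property that mmnormalize sets. The rulebase path and the output file are placeholders, and ForwardToSIEM is assumed to be a ruleset defined elsewhere, as in the snippet earlier in the thread.

```
module(load="mmnormalize")

# One rulebase holding the rules for every log type (placeholder path).
action(type="mmnormalize" rulebase="/path/to/all-rules.rulebase")

if ($parsesuccess == "OK") then {
    # A rule matched; the extracted fields are now available under $!
    call ForwardToSIEM
} else {
    # No rule matched; keep the message so a rule can be written for it.
    action(type="omfile" file="/var/log/unparsed.log")
}
```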

There is a parser module that takes an mmnormalize rulebase and uses that to
parse the message.



With the initial parsing, you define a stack of parsers, and each one checks
the message to see if it matches the pattern that it knows how to parse. If
it matches, it finalizes the parsing and the other parsers aren't checked;
if it doesn't match, the next parser gets invoked.
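
A minimal sketch of how such a parser stack is declared on a ruleset. The ruleset name, port and output file are placeholders, and the list shown uses only the two built-in parsers, which is the same as the default chain. A custom parser module would have to be loaded and its name placed ahead of these.

```
module(load="imudp")

# Parsers are tried in the order listed; the first one that accepts the
# message finalizes the parsing and the rest are skipped.
ruleset(name="network" parser=["rsyslog.rfc5424", "rsyslog.rfc3164"]) {
    action(type="omfile" file="/var/log/network.log")
}

# Bind the input to the ruleset so that its parser chain is used.
input(type="imudp" port="514" ruleset="network")
```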

But this can only look at the incoming message, not other things like a
table lookup.

But if you can identify the messages by their content efficiently, this will
work.

The pmcisco module that I wrote is actually a bit of a cheat. It doesn't
fully parse the message; it just looks to see if it has the messageid
instead of the expected syslog header data. If it finds it, it removes it
and lets the message fall through to the next parser (it saved me having to
duplicate all the normal parser functionality).

You are currently identifying the messages by their source IP, but can you
identify them by content?

David Lang

