On Tue, 30 Dec 2014, Kendall Green wrote:
Hello and thank you for the example configuration for reparse(), which
would help normalization efforts for parsing windows event messages for
different event ids.
How would the refactoring of reparse() affect the new RainerScript functions
wrap() and replace()? mmnormalize can also work with variables; are there
any example configurations for these functions?
At my last job I used mmnormalize on variables extensively. I took the message
that arrived in JSON and created a 'standard' traditional syslog formatted line
(no matter what format it arrived in) and then did the mmnormalize on that line.
This included logs from windows systems.
However, keep in mind that mmnormalize scales very well with the rule size, so
you can have as many different rules in one file as you want.
Doing windows logs again, I would create a variable that had the event id as one
of the early fields in the line (right after the prefix of timestamp, hostname,
sourcename) which will let mmnormalize immediately identify the correct parser
rule(s) to use and then parse the message appropriately with no chance of
confusing it with a different message type.
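
A minimal sketch of that approach follows. It assumes the JSON from the agent
has already been parsed into $!, and the field names (sourcename, eventid,
message), the template name, and the rulebase path are all hypothetical.

```
module(load="mmnormalize")

# Assumption: $!sourcename, $!eventid and $!message are hypothetical names
# for fields already extracted from the incoming JSON.
# The event ID is placed right after the timestamp/hostname/sourcename prefix.
template(name="winline" type="string"
         string="%timereported:::date-rfc3339% %hostname% %$!sourcename% %$!eventid% %$!message%")

# Render the template into a local variable.
set $.winline = exec_template("winline");

# Normalize the variable instead of msg. The rulebase path is hypothetical;
# a rule in it could then start with a literal event ID, for example:
#   rule=:%ts:word% %host:word% %src:word% 4634 %text:rest%
action(type="mmnormalize"
       rulebase="/etc/rsyslog.d/windows.rulebase"
       variable="$.winline")
```

Because the event ID is a literal near the start of each rule, a line for
event 4634 cannot match a rule written for a different ID.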
The re_extract() function emits a warning message that it is being
deprecated, so I resorted to using legacy template syntax from the regex
generator tool and assigning the result to a variable with the
exec_template() function. This allows regex to be used without putting it
into the mmnormalize rulebase, but different regexes would be necessary to
reparse subsections of the messages, depending on the contents or on
different events. Another challenge is that the spacing changes depending
on the contents of the message or on event states that affect the
formatting. These messages currently pass through syslog agents that lose
the structure, but it can possibly be restored with wrap() and replace() to
reformat the message parts, and then mmjsonparse or reparse() to turn the
key/values into a different format: key=value, JSON, CEE, CEF, or CSV. A
rulebase is able to define most sections of the message, but the
"iptables"-type parsing of key=value pairs is limited.
the repeat functionality can help you here, but I agree we need a more general
key-value and csv type capability (look at what nxlog does for csv for a pretty
flexible example)
This awesome feature would
provide so much capability if one could specify separators, or a way to
define the key:value wrappers. Can anyone speak to example use cases for
the new features of mmnormalize? Would naming the output to separate JSON
paths logically circumvent the caveats regarding unknown results from
executing multiple instances of different rulebases against a message, and
how might reparse() be able to take this into account?
I'm not sure I understand your question here, can you try to restate it (also,
when you have a long e-mail like this with many questions, separating them out
can help avoid confusion)
Another odd thing when working with dynafiles and variables is that $!vars
appear to work only with lowercase letters, so the rulebase variables that
are uppercase and used as output in the omfile name need to be copied to
another variable that is lowercase, or the outfile name is not populated.
When dealing with tens of thousands of clients, there is little room to
change anything about what comes into the central logging service. The raw
data output by windows has nested structures defined by tabs, carriage
returns, and new lines, which are replaced with 4, 3, or 2 spaces:
what version are you using? we had some problems with capitalization not too
long ago.
"An account was logged off.\r\n\r\nSubject:\r\n\tSecurity
ID:\t\tS-1-5-21-1343760832-931058557-1943201436-1000\r\n\tAccount
Name:\t\tkgreen\r\n\tAccount Domain:\t\tdell\r\n\tLogon
ID:\t\t0x86c35c\r\n\r\nLogon Type:\t\t\t7\r\n\r\nThis event is generated
when a logon session is destroyed. It may be positively correlated with a
logon event using the Logon ID value. Logon IDs are only unique between
reboots on the same computer."
Windows Syslog Agent sends as:
"An account was logged off. Subject: Security ID:
S-1-5-21-1343760832-931058557-1943201436-1000 Account Name: kgreen
Account Domain: dell Logon ID: 0x86c35c Logon Type: 7 This
event is generated when a logon session is destroyed. It may be positively
correlated with a logon event using the Logon ID value. Logon IDs are only
unique between reboots on the same computer."
what windows syslog agent are you using? I was using nxlog, which let me forward
the eventlog data as a JSON message, which gave me both the message as you are
describing and a lot of the fields (especially standard ones like eventid)
broken out as separate json objects.
The disadvantage in mmnormalize is that the msg object is included as part
of the JSON structure after the contents have been parsed to fields,
essentially duplicating it into structured data that contains an
unstructured mess of a log.
This goes back to a previous message topic on the question of which
properties to include. I think there should be an option to define, in a
template, all of the properties to include (input and output properties and
constants), and for mmnormalize to use only that message rather than msg or
userawmsg. I assume the new feature to define a JSON path, and for
templates to output to a fieldname, is related to this. Any illustration is
appreciated.
again, a lot of questions here :-)
one thing you can do is tell mmnormalize where to store its output (say under
$!parsed instead of just $!), you can then remove fields that you consider
duplicates.
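
A minimal sketch of that idea follows. The rulebase path, the output file, the
template name, and the field being removed ($!parsed!msg) are assumptions for
illustration; which fields count as duplicates depends on your rulebase.

```
module(load="mmnormalize")

# Store the normalization result under $!parsed instead of the root $!.
# The rulebase path is hypothetical.
action(type="mmnormalize"
       rulebase="/etc/rsyslog.d/windows.rulebase"
       path="$!parsed")

# Drop a field considered a duplicate. The name msg is an assumption.
unset $!parsed!msg;

# Write only the parsed subtree as JSON, without naming every field.
template(name="parsedOnly" type="string" string="%$!parsed%\n")
action(type="omfile" file="/var/log/windows-parsed.log" template="parsedOnly")
```

Keeping the parsed fields under their own subtree means the template can emit
that subtree as a whole, while anything else under $! stays out of the output.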
I am looking into a workaround for outputting only the JSON paths related
to the parsed logs, without needing to know the name of every field to
define, or any option to omit a field (%msg/$msg) from the $! path, or to
output $!all!objects, as it makes sense to have the JSON paths be
%$!Subject!Security ID% %$!Subject!Account Name%, for example, output by
%!Subject%. I want to experiment more with the JSON paths and variables, as
well as literal types in the rulebase, to find the best way to go about
windows event log normalization, but most of all, feedback and community
insight are appreciated on this topic.
I will forward you the rulebase I had created off-list (it's rather large for
the list)
David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog