Re: [rsyslog] Regex logging to MongoDB

david Wed, 22 Aug 2012 12:22:21 -0700

First off, let me say that I'm not very familiar with mongodb, so it may work differently than I am thinking it does.


David Lang


On Wed, 22 Aug 2012, Miloslav Trmac wrote:

----- Original Message -----
On Wed, 22 Aug 2012, Miloslav Trmac wrote:
- Is this approach reasonable?  The problem with this is that the "field"
treatment of the template is so different from other cases; there is a
precedent with omoracledb's use of ...AS_ARRAY, but that's only a single
module.
Why have your own template engine instead of using the normal rsyslog
template engine?
Primarily I was considering the use case of modifying $!all-json - i.e. store any incoming record in full, but add an "incoming host name" (overriding any "incoming host name" value in the record). I don't think this can be done with pure textual substitution without understanding the field structure. (This is not specific to mongodb - using a textual log file in JSON format would require similar understanding of the field data.) Or perhaps this kind of functionality isn't useful, or it could be done in a different way?
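
To make the concern about pure textual substitution concrete, here is a minimal Python sketch (not rsyslog code; the record contents and variable names are made up for illustration). Splicing a new field into the JSON text leaves the sender's own value in place as a duplicate key, while parsing the record and setting the field replaces it:

```python
import json

incoming = '{"host_name": "spoofed", "msg": "disk full"}'

# Textual approach: splice a new field in front of the existing text.
# The sender's own host_name is still present, so the key appears twice.
spliced = '{"host_name": "server1", ' + incoming[1:]

# Structural approach: parse, set the field, serialize again.
record = json.loads(incoming)
record["host_name"] = "server1"
rewritten = json.dumps(record)

# Python's json module keeps the last duplicate key, so the spliced
# text still yields the sender's value; other parsers may differ.
print(json.loads(spliced)["host_name"])    # spoofed
print(json.loads(rewritten)["host_name"])  # server1
```

How a consumer resolves the duplicate key depends on its parser, which is why the spliced form cannot be relied on to override the incoming value.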

Why should you override whatever the sysadmin has configured? they may want to add the "incoming host name" to the record and they may not. Why should your om always log something instead of being like every other om and logging what the sysadmin tells you to log?

Secondarily, this allows building the mongodb inputs without having to do a fairly expensive field list -> JSON data -> JSON text -> JSON data -> BSON roundtrip, but that is a secondary concern.

I'm not sure I really understand the idea of reading log files from a database as an input source. But in any case, why should anything related to input have an effect on outputs?

note that JSON data is always text; BSON has binary representations of fields, but JSON does not.
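
A small Python sketch of that difference, using only the standard library (no MongoDB or BSON library is involved, and this encodes a single value rather than a full BSON document). It relies on the BSON specification's definition of the double type as an 8-byte little-endian IEEE 754 value:

```python
import json
import struct

value = 5.0

# JSON: the number is carried as characters in a text document.
as_json_text = json.dumps(value)

# BSON double payload: 8 bytes, little-endian IEEE 754 binary64.
as_binary_double = struct.pack("<d", value)

print(repr(as_json_text), len(as_binary_double))

# Both forms decode back to the same number.
assert json.loads(as_json_text) == struct.unpack("<d", as_binary_double)[0]
```

Either way the consumer ends up with the same number; the difference is only in whether it travels as text or as a fixed-width binary field.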

Using the raw "sequence of fields without any formatting" format is not great, I agree - but then pretending that the template can be an arbitrary JSON format and we parse it intelligently is not great either. However that's definitely open to a change.

I thought I saw you saying that you wanted to send JSON to the database. If that is the case, then let the sysadmin create the JSON and insert that.

If you are wanting something other than JSON to put into the database, it's still probably better for the sysadmin to be able to specify things and you just do what the user wants.


It's common for people to want to have rules along the lines of

(not rsyslog syntax)
if sourceip in $list then set tag X in output log to a fixed value

if you end up creating your own template language

The users are already familiar with, and that gets extended to cover new
things without you having to duplicate the effort. This includes the
ability to dump all properties out as JSON and is going to include
abilities to modify the fields and leave some out of the output in the
future (as part of the entire lumberjack related effort)

This sounds interesting, is there any code I could look at?

start by looking at the existing json formatting options for templates, but the key is that every other om uses a text string to pass the log data from rsyslog to the module. Some of them interpret parts of the string and strip parts off before sending the remainder to the destination (the UDP forgery module for example); others take the data passed in to them and wrap it in other needed stuff to send it out (the email module for example). But I don't know of any that ignore the text string passed to them and create the message directly from the properties.

- What to do about non-string values?  mongodb recognizes different
types, and it would be good to use the native one (so that numbers
instead of strings could be compared - note that there is no automatic
type conversion when comparing different types in mongodb).  Non-string
types can be handled on output (by adding
"format-as-int"/"format-as-date" options), but AFAICS on input /
mmjsonparse everything is treated as strings.  Is it at this point
realistic to think about preserving the type of data as presented on
input while reformatting it (e.g. by using mmjsonparse, and a template
with $!all-json and some of the above-mentioned field "edits")?  Or is
rsyslog so fundamentally based on strings that this would take too much
work?  (There is always the option to preserve types simply by treating
the JSON as unmodified plaintext).


Remember that the input to rsyslog is strings to start with (with the
exception of internally generated metadata). To get it to be anything
other than a string is going to require converting it. There's nothing
preventing you from getting a JSON string from rsyslog and optimizing it
by converting the data from strings to a more compact format for storage.
Output modules to transport the data from system to system via JSON will
be doing exactly the same thing.

The case I was thinking about is the same as above - keep the original JSON, but add or modify a little.


So, if the input event is

@cee: {"field1": "string", "field2": 5.0, "field3": [1,2,3]}

I would like the data stored in MongoDB to directly correspond, e.g.

{"host_name":"server1", "field1": "string", "field2": 5.0, "field3": [1,2,3]}

not modify it to, say,

{"host_name":"server1", "field1": "string", "field2": "5.0", "field3": "[1,2,3]"}

and what makes you think that rsyslog is going to change it instead of keeping it the same?

however, I suspect that the incoming @cee formatted message is going to have all fields quoted.

MongoDB documentation seems to suggest that it doesn't support comparing mixed field types much, so changing the value types sounds undesirable.
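
As a rough illustration of the type-preserving outcome described for the @cee example above, here is a Python sketch (not rsyslog or mmjsonparse code; the helper name add_host_name and its cookie handling are hypothetical). The payload is parsed into structured data, host_name is set, and every other value is left alone, so the number stays a number and the array stays an array:

```python
import json

CEE_COOKIE = "@cee:"

def add_host_name(message, host_name):
    """Hypothetical helper: parse a @cee message and set host_name."""
    if not message.startswith(CEE_COOKIE):
        raise ValueError("message does not start with the @cee cookie")
    record = json.loads(message[len(CEE_COOKIE):])
    # Drop any host_name supplied by the sender, then put ours first.
    record.pop("host_name", None)
    return {"host_name": host_name, **record}

event = '@cee: {"field1": "string", "field2": 5.0, "field3": [1,2,3]}'
stored = add_host_name(event, "server1")

# Values keep the types they had in the incoming JSON.
print(type(stored["field2"]).__name__, type(stored["field3"]).__name__)
```

Whether rsyslog keeps the types this way is exactly the open question in this thread; the sketch only shows what the desired result looks like once the data is handled as fields rather than as strings.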

you don't need to have your own template language to do this, let the sysadmin specify the format (go ahead and have a default format if the sysadmin doesn't specify one).


David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
