Re: [rsyslog] liblognorm

Rainer Gerhards Wed, 13 Oct 2010 06:29:11 -0700

I have put a copy of the liblognorm API doc (temporarily) up on:

http://www.gerhards.net/liblognorm/liblognorm_8h.html


Rainer

> -----Original Message-----
> From: [email protected] [mailto:rsyslog-
> [email protected]] On Behalf Of Rainer Gerhards
> Sent: Wednesday, October 13, 2010 2:45 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] liblognorm
> 
> Hi David,
> 
> as usual, many thanks for your great thoughts. I had a day of heavy
> hacking yesterday, thus the late response. See below...
> 
> > -----Original Message-----
> > From: [email protected] [mailto:rsyslog-
> > [email protected]] On Behalf Of [email protected]
> > Sent: Monday, October 11, 2010 10:33 PM
> > To: rsyslog-users
> > Subject: Re: [rsyslog] liblognorm
> >
> > On Mon, 11 Oct 2010, Rainer Gerhards wrote:
> >
> > > I have just written another post on the normalization library. It
> > > looks like the design tends to favor a split into two libraries:
> > >
> > > http://blog.gerhards.net/2010/10/splitting-up-normalization-library.html
> >
> > this seems like a good idea.
> >
> > there is a definite need for a good, efficient parsing tool that can be
> > used for high volume sites. There are a lot of tools that heavily use
> > regex matching, but those tend to collapse at high volumes.
> >
> > you can create your own parser with lex, yacc, bison, or flex, but the
> > work needed to create the input config file for these (with their
> > specific syntax) is daunting.
> >
> > a tool that could take its configuration in something that looks very
> > similar to log lines (with some sort of syntax to show the variable
> > part), and that would then compile into something very efficient like
> > the tools above, would be very useful for a lot of different tools.
> 
> That's the basic idea, except that I do not intend to create e.g. lex
> source but rather have the engine do that part itself. The main advantage
> is that this could be done dynamically. I think this will be possible in
> almost constant time, as long as all fields can be parsed via primitive
> types (which do not require too much effort to back off).
> 
> I will be working on the parse tree over the next few days, so you'll
> hopefully be able to get an idea of it by looking at the code. At its
> heart, it simply is a radix tree, with constants and field syntaxes
> defining how the tree is traversed.
> 
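The parse tree described above can be sketched roughly as follows. This is a
minimal Python illustration, not liblognorm's code or API: the `%name:type%`
rule notation, the field type names, the rule tags, and every function name
here are assumptions made up for the example. For brevity it uses an
uncompressed prefix tree with one character per edge (a real radix tree would
also merge single-child chains). Constants are followed first; field parsers
are tried only where a rule declares a field, and a failed branch is backed
out of.

```python
DIGITS = "0123456789"

# Hypothetical primitive field parsers: return (value, chars_used) or None.
def parse_number(text):
    n = 0
    while n < len(text) and text[n] in DIGITS:
        n += 1
    return (text[:n], n) if n else None

def parse_word(text):
    n = 0
    while n < len(text) and not text[n].isspace():
        n += 1
    return (text[:n], n) if n else None

def parse_ipv4(text):
    n = 0
    while n < len(text) and (text[n] in DIGITS or text[n] == "."):
        n += 1
    parts = text[:n].split(".")
    if len(parts) == 4 and all(p and int(p) <= 255 for p in parts):
        return text[:n], n
    return None

FIELD_PARSERS = {"number": parse_number, "word": parse_word, "ipv4": parse_ipv4}

class Node:
    def __init__(self):
        self.literals = {}  # one constant character -> child Node
        self.fields = []    # (field name, field type, child Node)
        self.rule = None    # tag of the rule that ends at this node, if any

def split_rule(rule):
    # "a %x:word% b" -> [("lit", "a "), ("field", "x", "word"), ("lit", " b")]
    segments = []
    for i, part in enumerate(rule.split("%")):
        if i % 2 == 0:
            if part:
                segments.append(("lit", part))
        else:
            name, ftype = part.split(":")
            segments.append(("field", name, ftype))
    return segments

def add_rule(root, rule, tag):
    node = root
    for seg in split_rule(rule):
        if seg[0] == "lit":
            for ch in seg[1]:
                node = node.literals.setdefault(ch, Node())
        else:
            _, name, ftype = seg
            for fname, ftyp, child in node.fields:
                if (fname, ftyp) == (name, ftype):
                    node = child  # reuse an identical field edge
                    break
            else:
                child = Node()
                node.fields.append((name, ftype, child))
                node = child
    node.rule = tag

def match(node, text, pos=0, fields=None):
    # Recursive for clarity; depth grows with line length.
    fields = {} if fields is None else fields
    if pos == len(text):
        return (node.rule, fields) if node.rule is not None else None
    child = node.literals.get(text[pos])
    if child is not None:  # constants are tried first
        result = match(child, text, pos + 1, fields)
        if result is not None:
            return result
    for name, ftype, child in node.fields:  # then fields, with back-off
        parsed = FIELD_PARSERS[ftype](text[pos:])
        if parsed is not None:
            value, used = parsed
            result = match(child, text, pos + used, {**fields, name: value})
            if result is not None:
                return result
    return None

root = Node()
add_rule(root, "Accepted password for %user:word% from %ip:ipv4% port %port:number%", "ssh_login")
add_rule(root, "Failed password for %user:word% from %ip:ipv4%", "ssh_fail")
print(match(root, "Accepted password for alice from 10.0.0.5 port 22"))
```

In this sketch the work per message depends on the length of the line and on
how often a field parser has to back off, not on how many rules were loaded,
which is the property the "almost constant time" remark is about.
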
> >
> > this may just need to be a configuration generator for the tools listed
> > above that can take the list of annotated lines and create the
> > appropriate config file to build the parser. If this can accept regex
> > lines and then compile them down to a parser tree it would be wonderful.
> 
> Regex is a different beast, because for it you need to create a full-blown
> DFA, which also explains the slowness of regexes. I'll not tackle that
> beast. For some fields, I will support regex matches, but when they are
> used, performance is affected. The overall idea is that you usually do not
> need any regex/DFA at all.
> 
> >
> > so once there is a high performance parser to pull the data apart, then
> > the question is what to do with it.
> >
> > some people will want to write it to various places, others will want
> > to make decisions based on what is matched.
> >
> > for those who are wanting to write the normalized output to various
> > places, a plugin structure like rsyslog has (with the ability to format
> > the messages based on the various properties that are discovered) is
> > very appealing, and it may make a lot of sense to see what can be done
> > to re-use that work. If so, there will need to be a 'format string' that
> > creates the output with all the properties that are known tagged, but
> > without including ones that didn't have any matches in this log
> > message.
> 
> One thing I definitely intend to do is utilize the library in rsyslog. I
> envision a parser module that works based on the library. That also means
> rsyslog's core engine must be extended to support the additional fields,
> but that is something that can definitely be done. With that approach, no
> complex output engine is needed - one can just use the rsyslog plugin. And
> with a near O(1) algorithm, we can probably expect that this happens in
> real-time even for very large traffic loads (but probably not for the
> largest ones).
> 
> It is important to know here that the current parsers also have some
> limited backout needs, for example for the date and tag/hostname fields.
> So this can be done quickly.
> 
> >
> > for those who are wanting to then implement logic based on what it
> > gets, things get much more interesting. I suspect that the thing to do
> > here will be to make the event normalization engine be something that
> > can be a library included in other programs (in various languages), so
> > that you can have the config file be something along the lines of
> 
> that's actually the idea. An initial sketch of the API is already in git
> and I hope to get a more readable, doxygen-generated interface spec up
> later today.
> >
> > documentation (hopefully including a sample raw line)
> > line-to-match
> > function to call when matched
> >
> > there are a lot of programs out there written to do good and
> > interesting stuff with the lines that they receive. If there was an
> > ability to replace their sequential 'does it match rule 1, does it match
> > rule 2' logic with a more efficient parser it would be a huge win.
> >
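The "documentation / line-to-match / function to call when matched" layout
can be illustrated with a small Python sketch. Everything in it is
hypothetical (the `Rule` record, the `PrefixIndex` class, the sample rules and
handlers); it is not an existing API of rsyslog or of the planned library. To
stay short it matches only a constant prefix and hands the rest of the line to
the callback. The point is the dispatch shape: the line is walked once through
a shared prefix index instead of being tested against rule 1, rule 2, and so
on in sequence.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    doc: str                       # documentation, ideally with a sample raw line
    prefix: str                    # constant text the line must start with
    handler: Callable[[str], str]  # called with the remainder of the line

class PrefixIndex:
    """Character-level prefix index shared by all rules (hypothetical)."""

    def __init__(self):
        self.children = {}
        self.rule = None

    def add(self, rule):
        node = self
        for ch in rule.prefix:
            node = node.children.setdefault(ch, PrefixIndex())
        node.rule = rule

    def dispatch(self, line):
        # Walk the line once, remembering the longest rule prefix seen.
        node, best, best_len = self, None, 0
        for i, ch in enumerate(line):
            node = node.children.get(ch)
            if node is None:
                break
            if node.rule is not None:
                best, best_len = node.rule, i + 1
        if best is None:
            return None
        return best.handler(line[best_len:])

index = PrefixIndex()
index.add(Rule(
    doc="sshd login failure, e.g. 'Failed password for bob'",
    prefix="Failed password for ",
    handler=lambda rest: "alert:" + rest,
))
index.add(Rule(
    doc="sshd login success, e.g. 'Accepted password for alice'",
    prefix="Accepted password for ",
    handler=lambda rest: "ok:" + rest,
))
print(index.dispatch("Failed password for bob"))
```

Adding more rules grows the index but does not add per-line comparisons
beyond the characters of the line itself, which is where the gain over
sequential rule checks would come from.
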
> > I don't think you are wanting to tackle that portion of the task.
> 
> The lib part yes, but that obviously requires a lot of changes for
> applications using it.
> 
> Rainer
> >
> > David Lang
> >
> >
> > > Rainer
> > >
> > >> -----Original Message-----
> > >> From: [email protected] [mailto:rsyslog-
> > >> [email protected]] On Behalf Of Rainer Gerhards
> > >> Sent: Monday, October 11, 2010 9:01 AM
> > >> To: rsyslog-users
> > >> Subject: Re: [rsyslog] liblognorm vs. libeventnorm
> > >>
> > >> I would like to add as an argument pro liblognorm, that many people
> > >> probably better understand what "log normalization" is whereas
> > >> "event normalization" may sound strange. In that sense, liblognorm
> > >> may be a better name. Feedback is appreciated.
> > >>
> > >> Rainer
> > >>
> > >>> -----Original Message-----
> > >>> From: [email protected] [mailto:rsyslog-
> > >>> [email protected]] On Behalf Of Rainer Gerhards
> > >>> Sent: Sunday, October 10, 2010 11:53 AM
> > >>> To: rsyslog-users
> > >>> Subject: [rsyslog] liblognorm vs. libeventnorm
> > >>>
> > >>> Hi all,
> > >>>
> > >>> I think I'll start with the libeventnorm name for the normalizing
> > >>> library instead of liblognorm. Reason here:
> > >>>
> > >>> http://blog.gerhards.net/2010/10/liblognorm-or-libeventnorm.html
> > >>>
> > >>> Further name suggestions or arguments are very welcome!
> > >>>
> > >>> Rainer
> > >>> _______________________________________________
> > >>> rsyslog mailing list
> > >>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> > >>> http://www.rsyslog.com
