I have put a copy of the liblognorm API doc (temporarily) up on: http://www.gerhards.net/liblognorm/liblognorm_8h.html
Rainer

> -----Original Message-----
> From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of Rainer Gerhards
> Sent: Wednesday, October 13, 2010 2:45 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] liblognorm
>
> Hi David,
>
> as usual, many thanks for your great thoughts. I had a day of heavy hacking
> yesterday, thus the late response. See below...
>
> > -----Original Message-----
> > From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of [email protected]
> > Sent: Monday, October 11, 2010 10:33 PM
> > To: rsyslog-users
> > Subject: Re: [rsyslog] liblognorm
> >
> > On Mon, 11 Oct 2010, Rainer Gerhards wrote:
> >
> > > I have just written another post on the normalization library. It looks like
> > > the design tends to favor a split into two libraries:
> > >
> > > http://blog.gerhards.net/2010/10/splitting-up-normalization-library.html
> >
> > this seems like a good idea.
> >
> > there is a definite need for a good, efficient parsing tool that can be
> > used for high volume sites. There are a lot of tools that heavily use
> > regex matching, but those tend to collapse at high volumes.
> >
> > you can create your own parser with lex, yacc, bison, or flex, but the work
> > needed to create the input config file for these (with their specific
> > syntax) is daunting.
> >
> > a tool that could take its configuration in something that looks very
> > similar to log lines (with some sort of syntax to show the variable part),
> > that would then compile into something very efficient like the tools above
> > would be very useful for a lot of different tools.
>
> That's the basic idea, except that I do not intend to create e.g. lex source
> but rather have the engine do that part itself. The main advantage is that
> this could be done dynamically.
> I think this will be possible in almost constant time, as long as all fields
> can be parsed via primitive types (which do not require too much effort to
> back off).
>
> I will be working on the parse tree in the next few days, so you'll hopefully
> be able to get an idea of it by looking at the code. At its heart, it simply
> is a radix tree, with constants and field syntaxes defining how the tree is
> traversed.
>
> > this may just need to be a configuration generator for the tools listed
> > above that can take the list of annotated lines and create the appropriate
> > config file to build the parser. If this can accept regex lines and
> > then compile them down to a parser tree it would be wonderful.
>
> Regex is a different beast, because for it you need to create a full-blown
> DFA, which also explains the slowness of regexes. I'll not tackle that beast.
> For some fields, I will support regex matches, but when they are used,
> performance is affected. The overall idea is that you usually do not need any
> regex/DFA at all.
>
> > so once there is a high performance parser to pull the data apart, then
> > the question is what to do with it.
> >
> > some people will want to write it to various places, others will want to
> > make decisions based on what is matched.
> >
> > for those who are wanting to write the normalized output to various
> > places, a plugin structure like rsyslog has (with the ability to format
> > the messages based on the various properties that are discovered) is very
> > appealing, and it may make a lot of sense to see what can be done to
> > re-use that work. If so, there will need to be a 'format string' that
> > creates the output with all the properties that are known tagged, but
> > without including ones that didn't have any matches in this log message.
>
> One thing I definitely intend to do is utilize the library in rsyslog. I
> envision a parser module that works based on the library.
> That also means rsyslog's core engine must be extended to support the
> additional fields, but that is something that can definitely be done. With
> that approach, no complex output engine is needed - one can just use the
> rsyslog plugin. And with a near O(1) algorithm, we can probably expect that
> this happens in real-time even for very large traffic loads (but probably
> not for the largest ones).
>
> It is important to know here that the current parsers also have some limited
> backout needs, for example for the date and tag/hostname fields. So this can
> be done quickly.
>
> > for those who are wanting to then implement logic based on what it gets,
> > things get much more interesting. I suspect that the thing to do here will
> > be to make the event normalization engine be something that can be a
> > library included in other programs (in various languages), something so
> > that you can have the config file be something along the lines of
>
> that's actually the idea. An initial sketch of the API is already in git and
> I hope to get a more readable doxygen-generated interface spec up later
> today.
>
> > documentation (hopefully including a sample raw line)
> > line-to-match
> > function to call when matched
> >
> > there are a lot of programs out there written to do good and interesting
> > stuff with lines that they receive, if there was an ability to replace
> > their sequential 'does it match rule 1, does it match rule 2' logic with a
> > more efficient parser it would be a huge win.
> >
> > I don't think you are wanting to tackle that portion of the task.
>
> The lib part yes, but that obviously requires a lot of changes for
> applications using it.
>
> Rainer
>
> > David Lang
> >
> > > Rainer
> > >
> > >> -----Original Message-----
> > >> From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of Rainer Gerhards
> > >> Sent: Monday, October 11, 2010 9:01 AM
> > >> To: rsyslog-users
> > >> Subject: Re: [rsyslog] liblognorm vs. libeventnorm
> > >>
> > >> I would like to add as an argument pro liblognorm, that many people probably
> > >> better understand what "log normalization" is whereas "event normalization"
> > >> may sound strange. In that sense, liblognorm may be a better name. Feedback
> > >> is appreciated.
> > >>
> > >> Rainer
> > >>
> > >>> -----Original Message-----
> > >>> From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of Rainer Gerhards
> > >>> Sent: Sunday, October 10, 2010 11:53 AM
> > >>> To: rsyslog-users
> > >>> Subject: [rsyslog] liblognorm vs. libeventnorm
> > >>>
> > >>> Hi all,
> > >>>
> > >>> I think I'll start with the libeventnorm name for the normalizing library
> > >>> instead of liblognorm. Reason here:
> > >>>
> > >>> http://blog.gerhards.net/2010/10/liblognorm-or-libeventnorm.html
> > >>>
> > >>> Further name suggestions or arguments are very welcome!
> > >>>
> > >>> Rainer

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
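
For readers of the archive, the parse-tree idea from the thread (samples that look like log lines, with constants and field syntaxes deciding how a tree is walked) can be illustrated roughly as follows. This is only a sketch under assumptions and is not liblognorm code or its API: the `%name:type%` sample syntax, the names `Node`, `add_rule`, `normalize` and `FIELD_PARSERS`, and the three field types are all made up for illustration. It also uses a plain character trie rather than a compressed radix tree, to keep the example short.

```python
import re


def parse_word(text, pos):
    """Hypothetical 'word' field: one or more non-whitespace characters."""
    end = pos
    while end < len(text) and not text[end].isspace():
        end += 1
    return end if end > pos else None


def parse_number(text, pos):
    """Hypothetical 'number' field: one or more decimal digits."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return end if end > pos else None


def parse_ipv4(text, pos):
    """Hypothetical 'ipv4' field: four dot-separated numbers in 0..255."""
    end = pos
    for i in range(4):
        if i > 0:
            if end >= len(text) or text[end] != ".":
                return None
            end += 1
        stop = parse_number(text, end)
        if stop is None or stop - end > 3 or int(text[end:stop]) > 255:
            return None
        end = stop
    return end


# Assumed field-type table; each parser returns the end offset or None.
FIELD_PARSERS = {"word": parse_word, "number": parse_number, "ipv4": parse_ipv4}


class Node:
    def __init__(self):
        self.literals = {}  # constant character -> child Node
        self.fields = []    # (field name, field type, child Node)
        self.rule = None    # label of the sample that ends at this node


def add_rule(root, sample, label):
    """Insert one annotated sample line into the tree."""
    # re.split with two groups yields: literal, name, type, literal, ...
    parts = re.split(r"%(\w+):(\w+)%", sample)
    node = root
    for i in range(0, len(parts), 3):
        for ch in parts[i]:
            node = node.literals.setdefault(ch, Node())
        if i + 2 < len(parts):
            name, ftype = parts[i + 1], parts[i + 2]
            for known_name, known_type, child in node.fields:
                if (known_name, known_type) == (name, ftype):
                    node = child
                    break
            else:
                child = Node()
                node.fields.append((name, ftype, child))
                node = child
    node.rule = label


def normalize(node, text, pos=0):
    """Walk the tree; try the constant edge first, then back off to fields."""
    if pos == len(text):
        return {"rule": node.rule} if node.rule is not None else None
    child = node.literals.get(text[pos])
    if child is not None:
        result = normalize(child, text, pos + 1)
        if result is not None:
            return result
    for name, ftype, child in node.fields:
        end = FIELD_PARSERS[ftype](text, pos)
        if end is not None:
            result = normalize(child, text, end)
            if result is not None:
                result[name] = text[pos:end]
                return result
    return None


root = Node()
add_rule(root, "Accepted password for %user:word% from %ip:ipv4% port %port:number%", "ssh-login")
add_rule(root, "Connection closed by %ip:ipv4%", "ssh-close")

print(normalize(root, "Accepted password for alice from 192.168.0.7 port 2222"))
print(normalize(root, "Connection closed by 10.0.0.1"))
print(normalize(root, "Failed password for bob"))
```

The point of the sketch is that the cost of a lookup depends on the length of the message and the amount of back-off at field edges, not on how many samples were loaded, which is the property the thread contrasts with sequential "does it match rule 1, does it match rule 2" checking.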

