[rsyslog] upcoming log normalization work

Rainer Gerhards Thu, 05 Feb 2015 08:41:09 -0800

Hi all,

finally, I can share some news on upcoming work for liblognorm.


The short story is that I will have ample time in the coming months to
seriously work on and improve liblognorm, including some new tooling to
make it easier to use and usable as a stand-alone tool. This will
become available to the rsyslog project via the mmnormalize module.

The full story is a bit longer ;) As some of you may already know, I have
decided to brush up my academic credentials a bit and I am working on my
MSc. I have gotten the opportunity to work on the topic of log
normalization for my thesis. This, of course, is not implementation work,
but I plan to use liblognorm as a working sample of whatever comes out of
the thesis and plan to implement and prove ideas as they come up (using
liblognorm as a testbed like I did with rsyslog during the IETF syslog
standardization process).

As such, I will try to develop liblognorm side-by-side with concept
development, but I may run into some subtle issue of original authorship:
the thesis of course must contain my own work, and any third-party
suggestions in regard to algorithms must be cited and cannot count
toward the thesis work. So in a strange way the more good suggestions I
get, even for things I already considered, the more I run into trouble
with the thesis. Pure feedback like "this does not work for my
environment" is no problem, but sketches of algorithms are. So this is a
bit complicated, especially with the regular open source development
model in one's mind. I'll still try to navigate that slippery slope, but
may switch to a private archive and "silence mode" if this turns out to
become a real problem. In any case, once the thesis is done I am more
than open to discussing any further suggestions.

What I have in mind for liblognorm is much more than tinkering with it a
bit. What we currently use is actually a proof of concept (a useful
one, obviously), but there are more than a couple of rough edges. I think
the core algorithm can be improved, if not replaced, and there is much more
work needed to aid in developing and maintaining sample bases. I have a
semi-automatic process for the creation of sample bases in mind, but
that's something that really must be investigated first. Also, I think we
need a different, better description language, ... and so on.

I have talked with Adiscon and I will work only part-time during the thesis
period and the prep work. That means I will be working less actively on
adding new features to rsyslog, but I am able to look at bug reports and
other important things. Actually, from an rsyslog PoV, I'll be working on a
big feature: even better log normalization capability.

I need to do some prep work before I can start with the actual thesis work.
Most importantly, I need a set, hopefully large and diverse, of actual log
messages. The better this set, the better the end result will most probably
be (some heuristics will be involved for sure). I hope to receive community
support in collecting the log set. But I'll detail that in another mail.

Finally, I need to say that I am super-excited about this opportunity to
combine thesis work with something that I had on my mind for quite a while
but that I probably would realistically never have been able to look at in
this depth. And the implementation hopefully will be useful for the
community as well. So it's a win-win-win situation from my PoV.

Rainer
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
