On 2/20/2018 6:39 PM, deoren wrote:
I've been attempting to use the re_extract() function quite a bit lately to write some simple "filters" for notification purposes, and have been struggling quite a bit with its regex support. According to the http://www.rsyslog.com/regex/ page (and the re_extract function doc), Rsyslog uses POSIX ERE and "optionally" BRE expressions.

* Does anyone have a good guide or reference for the syntax needed?
* How do you switch the regex type from ERE to BRE? At a glance it appears that the BRE format is more cumbersome, so I want to make sure that I don't unintentionally switch that mode on somehow.

I found the differences between the two briefly described on this page:

https://en.wikibooks.org/wiki/Regular_Expressions/POSIX-Extended_Regular_Expressions

Does Rsyslog have complete support for ERE expressions? In other words, if I find a guide which covers ERE thoroughly, is that sufficient or are there gaps in rsyslog's support for the ERE syntax that I should be aware of?

Thanks.

Addendum to my earlier questions (which are still valid and "open" for feedback):

Real world example of what I'm working with (single line, likely wrapped by my mail client):

123.123.123.123 - abc1234 [20/Feb/2018:10:36:01 -0600] "GET http://example.org:80/servlet/SPECIFIC_PATTERN_HERE HTTP/1.1" 200 2182 "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"

Here is a PCRE regex that _seems_ to do what I want:

^([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\s\-\s([A-Za-z0-9]+)\s\[([0-9A-Za-z:\/\s-]+)\]

and provides me with three match group results I can reference:

1. 123.123.123.123
2. abc1234
3. 20/Feb/2018:10:36:01 -0600

As I understand it, re_extract allows retrieving only a specific match at a time, so I grab the two I care about like so and save to local variables (this processing is done on the primary receiver):

set $.remote-ip = re_extract(
    $msg,
    "^([0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+)\\s.\\s([A-Za-z0-9]+)\\s",
    0, 1,
    'unknown remote ip');

set $.remote-user = re_extract(
    $msg,
    "^([0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+)\\s.\\s([A-Za-z0-9]+)\\s",
    0, 2,
    'unknown remote user');


This seems to work and only required escaping the escape character (how "meta").

I've read that mmnormalize is recommended over regexes for performance reasons, but I have little experience with liblognorm (other than knowing it exists). Am I better off writing a few regex matches like I'm doing above or crafting (and testing) liblognorm rulesets, using them with mmnormalize to generate a JSON structure and then pulling what I want from a JSON structure?

In this case, my specific goal is to look for log messages containing "SPECIFIC_PATTERN_HERE" (as shown in sample log message) and if a match is found parse the message to pull out specific values. Those values are then used to generate a notification for our ticketing system (e.g., specific URL patterns indicate abuse that we need to review further before our vendor contacts us and threatens to cut off service). In this case we're not matching a possible range of patterns, but a very specific string that is known to us.
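
A rough sketch of that flow in RainerScript, reusing the re_extract() calls from above. It assumes a plain "contains" comparison is enough for the pre-check, uses a variant of the IP pattern with every dot escaped, and leaves the notification step as a placeholder comment since that part depends on the local setup:

```
# Sketch only: run the regex work just for messages containing the known string.
if ($msg contains "SPECIFIC_PATTERN_HERE") then {
    # Submatch 1: the remote IP address at the start of the line.
    set $.remote-ip = re_extract(
        $msg,
        "^([0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+)\\s.\\s([A-Za-z0-9]+)\\s",
        0, 1,
        'unknown remote ip');

    # Submatch 2: the remote user that follows the IP address.
    set $.remote-user = re_extract(
        $msg,
        "^([0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+)\\s.\\s([A-Za-z0-9]+)\\s",
        0, 2,
        'unknown remote user');

    # Placeholder: the action that generates the ticket notification goes here.
}
```

Since "contains" is a simple substring test, the two regexes would only be evaluated for the messages that already include the known string, not for every message that reaches the receiver.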

I know there are dedicated tools for pattern matching and reporting (Graylog is something I'm kicking the tires on and I've heard that Riemann is designed for tasks like this), but I was hoping to get some basic monitoring in place now with a tool that I'm halfway familiar with before attempting to implement other tools for easier management of more complex patterns. I've already implemented 4-5 other notifications and it's worked well thus far, but I wanted to get input from the community to see if I'm going about this the wrong way (first using regexes over mmnormalize, then as a secondary issue using rsyslog for notifications vs Graylog or Riemann).