On 2/18/2014 1:26 PM, Marc Perkel wrote:
> On 2/18/2014 9:32 AM, John Hardin wrote:
>> On Tue, 18 Feb 2014, Marc Perkel wrote:
>>
>>> Trying to do something complex and not sure how it's done. What I'm
>>> looking for is to combine 2 conditions in a single regular expression
>>> so that both have to be true for a match. Yes - I know I can make 2
>>> SA rules and combine them but I bet there's a way to do it in one
>>> expression. For simplicity here's the challenge.
>>>
>>> A chunk of text has to include the word "cat" 5 times and the word
>>> "dog" 4 times to be a match. How do you do that?
>>
>> I assume there must be no restrictions on the order the occurrences
>> appear? That makes it rather difficult, and thus expensive.
>>
>> Two individual simple "tflags multiple maxhits=N" REs and a meta to
>> combine them would be much more efficient than a single RE. Is this
>> just an intellectual exercise, and/or something not limited to the SA
>> environment?
>
> Yes - no order - it is expensive - don't care. Need to be a single regex.


Try this:

(?=(?:.*?\bcat\b){5}).*?(?:.*?\bdog\b){4}

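It will match a string with at least 5 instances of "cat" and at least 4 of "dog", in any order. The lookahead checks for the cats without consuming any text, and the rest of the expression then checks for the dogs from the same starting point. The "\b" word-boundary assertions ensure that it will not count words like "catapult" or "dogma"; they also mean that strings like "catcat" or "dogcat" are ignored. Remove them if you don't care about partial word matches. Note that "." does not match a newline by default, so add the /s modifier if the text can span lines.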
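
For a quick sanity check outside SpamAssassin, here is a minimal Python sketch of the same expression. Python's "re" module accepts these constructs unchanged. The helper name has_cats_and_dogs and the sample strings are made up for illustration, and re.DOTALL is added on the assumption that the text may contain newlines.

```python
import re

# Same expression as above. re.DOTALL lets "." cross newlines; that flag is
# an assumption about the input, not part of the original pattern.
PATTERN = re.compile(r"(?=(?:.*?\bcat\b){5}).*?(?:.*?\bdog\b){4}", re.DOTALL)


def has_cats_and_dogs(text):
    """Return True if text has at least 5 "cat" and at least 4 "dog" words."""
    return PATTERN.search(text) is not None


# Illustrative inputs (hypothetical, not from the thread).
enough = "cat dog cat dog cat dog cat dog cat"        # 5 cats, 4 dogs
dogs_first = "dog dog dog dog cat cat cat cat cat"    # order does not matter
few_dogs = "cat cat cat cat cat dog dog dog"          # only 3 dogs
partial = "catapult cat cat cat cat dog dog dog dog"  # "catapult" is not counted

for sample in (enough, dogs_first, few_dogs, partial):
    print(has_cats_and_dogs(sample), "<-", sample)
```

The first two samples should match and the last two should not. On long inputs that almost match, expect the backtracking cost John mentioned.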

If you want to match only when the string contains exactly those counts, that will be more complicated.
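
One possible way to get exact counts, sketched in Python and not tested inside SA: anchor at the start of the string and use one lookahead per word, where each lookahead walks the whole string and refuses to step over an occurrence of the word except when counting it. The helper names exactly and has_exact_counts are hypothetical, and this assumes the whole text is checked as a single string.

```python
import re


def exactly(word, n):
    """Build a lookahead asserting exactly n whole-word hits up to end of string."""
    w = r"\b" + re.escape(word) + r"\b"
    # Any run of characters in which the word never starts.
    skip = r"(?:(?!" + w + r").)*"
    # n times "skip, then the word", then skip to the very end.
    return r"(?=(?:" + skip + w + r"){" + str(n) + r"}" + skip + r"\Z)"


# \A pins the lookaheads to the start so each one scans the whole string.
# Python's \Z is the absolute end of the string (Perl spells that \z).
EXACT = re.compile(r"\A" + exactly("cat", 5) + exactly("dog", 4), re.DOTALL)


def has_exact_counts(text):
    """Return True only for exactly 5 "cat" and exactly 4 "dog" words."""
    return EXACT.search(text) is not None


print(has_exact_counts("cat dog cat dog cat dog cat dog cat"))      # 5 cats, 4 dogs
print(has_exact_counts("cat dog cat dog cat dog cat dog cat cat"))  # 6 cats
```

Because the skip part can never pass over the start of the word, every occurrence has to be consumed by the counted group, which is what pins the count to exactly n.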

--
Bowie
