Loren, Bob, Mike

Awesome explanations! Mike hit the nail on the head for the bit that I was 
uncertain about, but the explanations cleared up a lot of extra uncertainty 
surrounding the whole thing.

Thanks for your help,

Richard

-----Original Message-----
From: Matt Kettler [mailto:[EMAIL PROTECTED] 
Sent: 28 January 2005 02:51
To: Gray, Richard; users@spamassassin.apache.org
Subject: Re: Regular expression expanding

At 09:23 AM 1/27/2005, Gray, Richard wrote:
>body
>MANGLED_CASH/(?!cash)\b[cǩ\(][_\W]{0,[EMAIL PROTECTED],5}[sz
>5\$][_\W]{0,5}h\b/i
>
>My understanding of rule matching was that the '(?!cash)' bit required an |
>(or) in order to work. Can anyone break down the logic of how SA tests
>this line?

Heh.. I think you're used to seeing things like (?:a|b), which is an "or" 
operation with backreferencing disabled.

However, you can also have (?:a) without the | and you can have (a|b).

The deal is that (?: makes the group non-capturing, which disables the ability 
to later use backreferencing, i.e. the ability to use \1 later in an expression 
to require a duplicate of a previous match.

| is just an or.

Put the two together and you have an or without backreferencing. Disabling 
backreferencing saves memory if you're not going to use it, so it's commonly 
done in SA rules.
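
To make the difference concrete, here is a small sketch in Python rather than Perl (Python's re module accepts the same syntax for these constructs). The patterns and sample strings are made up for illustration and are not taken from any SA rule.

```python
import re

# Capturing group: \1 must repeat exactly what the group matched.
doubled = re.compile(r"\b(\w+) \1\b")

# Non-capturing group: groups the alternation but stores nothing,
# so there is no \1 to refer back to.
either = re.compile(r"(?:cash|money) now")

m = doubled.search("send the the cash")
print(m.group(1))                            # the repeated word
print(either.search("money now").groups())   # empty tuple: nothing captured
```

The second pattern still needs the parentheses to limit the scope of the |, it just skips the bookkeeping for the capture.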

The bit used in the MANGLED_CASH rule is a completely different syntax, despite 
its similar appearance. (?!a) is a negative look-ahead assertion, 
i.e. the rest of the regex is only allowed to match at a position where the 
text ahead does not match this. It consumes no characters itself. 
Here it's used to exclude "cash" from being considered a match for the mangled 
string.
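
A rough sketch of that behaviour, again in Python. The pattern below is a deliberately simplified stand-in that I made up for the example; it is not the real MANGLED_CASH rule, only the same idea of "c, a, s, h with optional junk in between, but not plain cash".

```python
import re

# Hypothetical, simplified pattern (NOT the actual SA rule):
# the (?!cash) look-ahead rejects the unmangled word up front.
mangled = re.compile(
    r"(?!cash)\bc[_\W]{0,5}a[_\W]{0,5}s[_\W]{0,5}h\b",
    re.IGNORECASE,
)

for text in ["cash", "c.a.s.h", "get C-A-S-H fast", "c_a_s_h"]:
    hit = mangled.search(text)
    print(repr(text), "->", hit.group(0) if hit else None)
```

Without the look-ahead, plain "cash" would also match, because every {0,5} gap is allowed to be empty.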

There are lots of different constructs that start with (?, and (?: is very 
different from (?!, (?=, or (?<!
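
For a quick feel of the other three, a minimal Python sketch with invented sample strings (the variable names are only for this example):

```python
import re

# (?=...)  positive look-ahead: match only if followed by " USD"
followed = re.findall(r"\d+(?= USD)", "10 USD, 20 EUR")

# (?!...)  negative look-ahead: match only if NOT followed by " USD"
not_followed = re.findall(r"\b\d+\b(?! USD)", "10 USD, 20 EUR")

# (?<!...) negative look-behind: match only if NOT preceded by "$"
not_preceded = re.findall(r"(?<!\$)\b\d+\b", "$10 and 20")

print(followed, not_followed, not_preceded)
```

None of the assertions consume text, so the look-around part never shows up in the matched string.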

This really is getting into advanced Perl regex syntax, but if you really want 
to know about them, look up:

http://perlmonks.thepen.com/236866.html

In the context of SA rules, you usually only see (?: and (?! 




