On 8/24/06, Bowie Bailey <[EMAIL PROTECTED]> wrote:
D.J. wrote:
> On 8/24/06, Bowie Bailey <[EMAIL PROTECTED]> wrote:
> > D.J. wrote:
> > > On 8/24/06, Bart Schaefer < [EMAIL PROTECTED]> wrote:
> > > > On 8/24/06, D. J. <[EMAIL PROTECTED] > wrote:
> > > > >
> > > > > I'm expecting these types of strings for sure:
> > > > >
> > > > > cat
> > > > > dog
> > > > > cat dog
> > > > > dog cat
> > > > >
> > > > > But I may get something like this too:
> > > > >
> > > > > cat cat dog
> > > > > dog dog
> > > > >
> > > > > Essentially I want it to match if anything other than cat or
> > > > > dog is in the string.
> > > >
> > > > That constraint means you have to construct a regex that can be
> > > > anchored at both beginning and end of string, e.g.
> > > > /\A(\s*(cat|dog)\s*)+\Z/. I'm not sure that ever makes sense in
> > > > the context of a spamassassin rule, except maybe one matching
> > > > against a specific header.
> > >
> > > That's the idea... I'm using the RELAY_COUNTRIES plugin and want a
> > > rule to add a small score if the relay server is not in the US or
> > > Canada. However, I'm not sure if the plugin will list the same
> > > country multiple times, which is where my uncertainty in the "cat
> > > cat dog" scenario came in. So far my original rule ( !~ /cat|dog/)
> > > seems to be working well, but if I have a spammer smart enough to
> > > manage to bounce his spam originating in China off of somewhere in
> > > the US before it hits my MX, then that rule will fail. Am I
> > > possibly too paranoid?
> >
> > Ok. Try this one:
> >
> > $value =~ /\b(?!cat\b|dog\b)\w+\b/i
> >
> > This will match any word in the string as long as that word is not
> > "cat" or "dog".
>
> OK, we're actually really close. That actually matched everything I
> didn't want to match... we just have to get it to do the opposite of
> that. I have 6 test strings I tested against in a test script:
>
> cat
> dog
> cat dog
> dog cat
> bird
> cat bird
>
> It matched the top four (incorrectly).

Are you sure you used it correctly? This is a positive match (=~), not a
negative match (!~).

Test program:

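use strict;
use warnings;

my @strings = ( "cat", "dog", "cat dog", "dog cat", "bird",
                "cat bird", "caterwaul" );

for my $str (@strings) {
    # Positive match: true if any word is something other than "cat" or "dog".
    if ($str =~ /\b(?!cat\b|dog\b)\w+\b/i) {
        print "$str -- MATCHED\n";
    }
    else {
        print "$str -- no match\n";
    }
}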

Output:

cat -- no match
dog -- no match
cat dog -- no match
dog cat -- no match
bird -- MATCHED
cat bird -- MATCHED
caterwaul -- MATCHED
--
Bowie

BINGO! I still had my negative match (!~) in there; I had copied only the part of the regex between the slashes. You, sir, are the man!
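
For anyone who hits the same mix-up, here is a minimal sketch of the difference between binding the lookahead pattern with =~ and with !~. The helper names has_other_word and double_negative are made up for illustration and do not come from the thread; only the regex itself does.

```perl
use strict;
use warnings;

# Regex from the thread: matches any word that is not exactly "cat" or "dog".
my $other_word = qr/\b(?!cat\b|dog\b)\w+\b/i;

# Intended test (hypothetical helper name): true when the string contains
# at least one word other than "cat" or "dog".
sub has_other_word {
    my ($str) = @_;
    return $str =~ $other_word ? 1 : 0;
}

# Accidental test (hypothetical helper name): putting !~ on top of the
# negative lookahead flips the result, so cat/dog-only strings come back true.
sub double_negative {
    my ($str) = @_;
    return $str !~ $other_word ? 1 : 0;
}

for my $str ("cat", "dog", "cat dog", "dog cat", "bird", "cat bird") {
    printf "%-8s  =~ gives %d   !~ gives %d\n",
        $str, has_other_word($str), double_negative($str);
}
```

With !~ the four cat/dog-only strings come back true and "bird" and "cat bird" come back false, which is the inverted behaviour reported earlier in the thread. Since the lookahead already expresses "not cat or dog", the pattern should be bound with =~.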