On Sun, 11 Aug 2013, Amir 'CG' Caspi wrote:
> At 2:22 AM -0600 08/11/2013, Amir 'CG' Caspi wrote:
>> My regex is valid and appropriate for those comments... I tested it at
>> regexpal.com, which shows that all three comments match just fine (all
>> three get highlighted).
>> So... why is SA hitting only on the final comment, and ignoring the first
>> two?
>
> Further confusion. Received another of these types of spam today:
> http://pastebin.com/YywcFkui
> My new HTML_COMMENT_GIBBERISH rule didn't hit on this one at all.
Thanks for the samples, and apologies for the tardy reply.
A COMMENT_GIBBERISH rule has been in my sandbox for a while now, but it is
not performing well in masscheck.
I broadened it a bit per your samples and it hits all of them now. We'll
see if this change improves the masscheck performance. I'm also going to
make FP-avoidance changes that should also help.
> Running the email through regexpal.com shows that the regex _DOES_ hit
> the comment. Why is this failing in SA even though it works in other
> environments? Is there something that Perl doesn't like about my regex
> syntax but that works fine in JavaScript?
I haven't tested your rule yet, but I have a comment: you are trying a bit
too hard. Don't worry about matching all the way to the end of the
comment. You don't care about gibberish past the first 100 "words". Just
make sure that the rule does not match the --> comment-end token, and stop
at 100 matched words. Past that it doesn't matter.
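
To illustrate the idea, here is a minimal sketch in Python rather than SA rule syntax. It is not the rule from my sandbox. The names build_rule and hits are made up for this example, and it assumes a "word" is any run of non-whitespace characters followed by whitespace. The pattern anchors on the comment opener, counts exactly N words, never lets a word run into the --> token, and does not look for the end of the comment at all.

```python
import re


def build_rule(min_words: int = 100) -> re.Pattern:
    """Build a pattern for an HTML comment that opens with min_words words.

    Hypothetical helper for illustration only; the word definition
    (non-whitespace run followed by whitespace) is an assumption.
    """
    # One "word": non-whitespace characters, checked one at a time so that
    # the comment-end token "-->" can never be swallowed as part of a word.
    word = r"(?:(?!-->)\S)+"
    # Comment opener, optional whitespace, then exactly min_words words.
    # Nothing after the last counted word is examined.
    return re.compile(r"<!--\s*(?:%s\s+){%d}" % (word, min_words))


def hits(html: str, min_words: int = 100) -> bool:
    """Return True if some comment in html starts with min_words words."""
    return build_rule(min_words).search(html) is not None


if __name__ == "__main__":
    gibberish = "<!-- " + " ".join(["qwxz"] * 150) + " -->"
    short = "<!-- " + " ".join(["note"] * 20) + " -->"
    print(hits(gibberish))  # long comment
    print(hits(short))      # short comment
```

Because the match stops once the word count is reached, the length of the rest of the comment has no effect on whether the rule hits.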
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79