"Jesse Houwing" <[EMAIL PROTECTED]> writes:
> I've recently started building my own custom ruleset to catch some of the
> spam that has eluded the spamassassin filter of the university. Recently a
> lot of the messages I get have unbalanced html and head tags. Some even
> start with </html> and </head>.
>
> I tried to use eval:html_tag_balance('html' '<0') in a rule to test against
> this, but somehow this never triggers. The same is true for the head tag.
>
> I've included a test message on the following wiki page:
> http://www.exit0.us/index.php/UnBalancedHTMLorHEADtags
>
> I was wondering if it is a problem with Spamassassin, or if I'm just on the
> wrong track here.
(Unfortunately?) html_tag_balance only checks for tags that were opened
and never closed, not for tags closed without being opened, so a test
like '<0' has nothing to match. It relies on the same code that tries
to determine when it is between two tags (for example, between <title>
and </title>).
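
To illustrate the difference, here is a rough sketch in Python (not
SpamAssassin's Perl, and not the code from HTML.pm). The name
tag_balance, the clamp flag, and the simplified tag regex are all made
up for this example. It assumes the counter is kept from going below
zero, which matches the behaviour described above but is only a guess
at how the real code does it.

```python
import re
from collections import defaultdict

# Simplified tag matcher (an assumption for this sketch, not a real HTML parser).
TAG_RE = re.compile(r"<\s*(/?)\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>")

def tag_balance(html, clamp=True):
    """Return {tag: opens minus closes} for each tag seen in html.

    With clamp=True a close without a matching open is ignored, so the
    count only ever reports tags that were opened and never closed.
    With clamp=False the count may go negative for stray closing tags.
    """
    inside = defaultdict(int)
    for match in TAG_RE.finditer(html):
        closing, name = match.group(1), match.group(2).lower()
        if closing:
            inside[name] -= 1
            if clamp and inside[name] < 0:
                inside[name] = 0  # stray close: forget it
        else:
            inside[name] += 1
    return dict(inside)

message = "</html></head><body>buy now"
print(tag_balance(message))               # clamped counts
print(tag_balance(message, clamp=False))  # signed counts
```

With clamping, the stray </html> and </head> leave both counts at 0, so
a '<0' comparison can never be true; only the unclamped variant reports
-1 for them.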
The code should not be too complicated to follow. Look at HTML.pm and
maybe EvalTests.pm.
Daniel
--
Daniel Quinlan                     anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/  and open source consulting