https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7952

            Bug ID: 7952
           Summary: Add rule types for text body and HTML body separately
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Hardware: PC
                OS: Windows XP
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Rules
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: Undefined

At times it would be useful to have rules that scan only the text/plain parts
or only the text/html parts of a message, rather than both combined, as I
believe 'body' and 'rawbody' rules currently do.

It would be desirable to have both cooked and raw forms of these rule types,
corresponding to 'body' and 'rawbody'. In particular it would be desirable to
have a rule type that can look at the raw HTML formatting/text as it exists in
the mail message.
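
To illustrate the kind of separation being requested, here is a minimal
sketch in Python using the standard library 'email' package. It is not
SpamAssassin code; the sample message and the helper name parts_by_type are
hypothetical. It only shows picking out the decoded text of each part by
content type, which is roughly the "raw" form described above. Rendering the
HTML into a "cooked" form is not shown.

```python
from email import message_from_string

# Hypothetical two-part message used only for illustration.
SAMPLE = """\
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="b1"

--b1
Content-Type: text/plain; charset="utf-8"

Hello in plain text.
--b1
Content-Type: text/html; charset="utf-8"

<html><body><p>Hello in <b>HTML</b>.</p></body></html>
--b1--
"""


def parts_by_type(raw_message, content_type):
    """Return the decoded text of every MIME part with the given content type."""
    msg = message_from_string(raw_message)
    found = []
    for part in msg.walk():
        if part.get_content_type() == content_type:
            # decode=True undoes any Content-Transfer-Encoding and yields bytes.
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            found.append(payload.decode(charset, errors="replace"))
    return found


# What a text-only rule and an HTML-only rule would each get to look at.
text_parts = parts_by_type(SAMPLE, "text/plain")
html_parts = parts_by_type(SAMPLE, "text/html")
```

A rule limited to text_parts would never see the markup, while a rule limited
to html_parts would see the HTML exactly as it exists in the message.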

This could be done either as new entity names like 'textbody', 'rawtextbody',
'htmlbody', etc., or could be done by adding modifiers like 'body:text' and
'rawbody:html'. Personally I'd prefer the explicitness of the first option, but
I would expect either to provide the same functionality.

One possible use of such a rule would be looking for anything after </html> in
the HTML part of the message. Note that the current chunking implementation of
the body rules could get in the way of this working reliably in all cases.
While technically a separate subject, it might be worth reviewing and possibly
eliminating the chunking done on the body rules in conjunction with adding the
new rule types. Chunking was introduced when processors were a good deal slower
than they are now, and it has been noted many times in the past that it can
make otherwise good rules less effective than they could be.
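
The chunking concern can be shown with a small Python sketch. The fixed chunk
size, the pattern, and the function names are assumptions made for
illustration only; SpamAssassin's actual chunking logic is different. The
point is just that a pattern spanning a chunk boundary is missed when each
chunk is matched on its own.

```python
import re

# Hypothetical rule: any non-whitespace content following the closing </html> tag.
AFTER_HTML = re.compile(r"</html>\s*\S", re.IGNORECASE)


def match_whole(text):
    """Apply the pattern to the complete text in one pass."""
    return AFTER_HTML.search(text) is not None


def match_chunked(text, chunk_size):
    """Apply the pattern to each fixed-size chunk independently."""
    for start in range(0, len(text), chunk_size):
        if AFTER_HTML.search(text[start:start + chunk_size]) is not None:
            return True
    return False


html = "<html><body>hi</body></html>\nhidden payload"

whole_hit = match_whole(html)             # the pattern sees the full text
split_hit = match_chunked(html, 24)       # a boundary falls inside "</html>"
large_chunk_hit = match_chunked(html, 64) # the chunk is larger than the text
```

With a chunk size of 24 the boundary lands in the middle of the closing tag, so
neither chunk matches even though the whole text does.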
