http://bugzilla.spamassassin.org/show_bug.cgi?id=3140

           Summary: HTML renderer should "remember" certain attributes of
                    rendered text
           Product: Spamassassin
           Version: unspecified
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P3
         Component: Rules (Eval Tests)
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


Currently there are a number of effective regular expression (RE) rules that 
scan either rawbody or, sometimes, 'full' for certain HTML attributes.  One of 
the most common targets is very small fonts, such as a size of 0 or 1 point or 
pixel.  Writing an RE test for this that works correctly in all cases is almost 
impossible, because a font size can be specified in so many ways.  Even an RE 
that works in most cases will be moderately complex and will take some time to 
scan the body.
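
To illustrate the problem, here is a minimal Python sketch.  The pattern 
NAIVE_TINY_FONT and the sample strings are made up for this example and are 
not taken from any actual SpamAssassin rule; they only show how a 
straightforward RE catches the textbook form and misses common variants.

```python
import re

# Hypothetical rule, for illustration only: <font size=0> or <font size=1>,
# with optional quotes around the value.
NAIVE_TINY_FONT = re.compile(r'<font\s+size=["\']?[01]["\']?\s*>', re.IGNORECASE)

# Made-up samples that all ask for a tiny font in different ways.
SAMPLES = [
    '<font size=1>hidden</font>',                   # textbook form
    '<FONT color="white" size="1">hidden</FONT>',   # another attribute first
    '<font\n size = 1 >hidden</font>',              # whitespace around "="
    '<span style="font-size: 1px">hidden</span>',   # CSS on a different tag
    '<font style="font-size:0pt">hidden</font>',    # CSS on the font tag
]


def naive_hits(samples):
    """Return one boolean per sample: did the naive RE match it?"""
    return [bool(NAIVE_TINY_FONT.search(s)) for s in samples]


if __name__ == "__main__":
    for sample, hit in zip(SAMPLES, naive_hits(SAMPLES)):
        print("hit " if hit else "miss", repr(sample))
```

Only the first sample matches.  Each missed variant could be covered by 
extending the RE, but every extension makes the rule longer and slower.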

The HTML renderer presumably parses tags correctly, so at some level it 
probably already knows whether a small font has been used in the message, even 
though it has no use for that information when rendering to text.  If it 
recorded this information somewhere accessible, a simple eval test could check 
the recorded value to detect small fonts and assign an appropriate score.
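
The idea can be sketched roughly as follows.  This is Python using the 
standard html.parser module, not SpamAssassin's actual renderer, and the names 
RenderFlags, render and eval_tiny_font are hypothetical.  The sketch assumes 
"small" means an absolute size of 0 or 1 on a font tag, or a CSS font-size of 
at most 1px or 1pt in a style attribute.

```python
import re
from html.parser import HTMLParser

# Assumption: only px/pt units in inline style attributes are considered.
FONT_SIZE_CSS = re.compile(r'font-size\s*:\s*(\d+(?:\.\d+)?)\s*(?:px|pt)',
                           re.IGNORECASE)


class RenderFlags(HTMLParser):
    """Renders HTML to text and remembers whether a tiny font was seen."""

    def __init__(self):
        super().__init__()
        self.text = []          # rendered text fragments
        self.tiny_font = False  # flag left behind for eval tests

    def handle_starttag(self, tag, attrs):
        # The parser has already lowercased names and unquoted values.
        attrs = dict(attrs)
        size = (attrs.get('size') or '').strip()
        if tag == 'font' and size in ('0', '1'):
            self.tiny_font = True
        match = FONT_SIZE_CSS.search(attrs.get('style') or '')
        if match and float(match.group(1)) <= 1:
            self.tiny_font = True

    def handle_data(self, data):
        self.text.append(data)


def render(html):
    """Return (rendered_text, tiny_font_flag) for an HTML fragment."""
    parser = RenderFlags()
    parser.feed(html)
    parser.close()
    return ''.join(parser.text), parser.tiny_font


def eval_tiny_font(html):
    """Hypothetical eval test: true if the renderer saw a tiny font."""
    return render(html)[1]
```

Because the parser normalizes case, quoting and attribute order before the 
check runs, the test itself stays a simple comparison rather than a long RE.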

There are doubtless other HTML attributes that are currently detected with RE 
rules even though the HTML renderer has already found and ignored them.  Bogus 
tags and ending tags might be a couple of possibilities.  If the renderer left 
flags for these things available to eval tests, it would reduce the number of 
REs that have to be run on the body of the message, and thus increase 
efficiency.
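
A rough Python sketch of such flags follows.  It makes two assumptions: that 
a "bogus" tag is one whose name is not in a known list, and that the ending 
tags of interest are those with no matching start tag.  The TagAudit class, 
the audit function and the deliberately tiny KNOWN_TAGS set are hypothetical 
and exist only for illustration.

```python
from html.parser import HTMLParser

# Deliberately tiny, illustrative list; a real one would be much longer.
KNOWN_TAGS = {'html', 'head', 'body', 'p', 'b', 'i', 'a', 'font', 'span',
              'div', 'br', 'img', 'hr'}
VOID_TAGS = {'br', 'img', 'hr'}  # tags that never take an end tag


class TagAudit(HTMLParser):
    """Counts unknown tags and end tags that close nothing."""

    def __init__(self):
        super().__init__()
        self.open_tags = []       # stack of currently open known tags
        self.bogus_tags = 0       # start tags with an unknown name
        self.stray_end_tags = 0   # known end tags with no open match

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN_TAGS:
            self.bogus_tags += 1
        elif tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Unknown names were counted when opened; void tags are never pushed.
        if tag not in KNOWN_TAGS or tag in VOID_TAGS:
            return
        if tag in self.open_tags:
            # Pop up to and including the nearest matching open tag.
            while self.open_tags.pop() != tag:
                pass
        else:
            self.stray_end_tags += 1


def audit(html):
    """Return the flags an eval test could read after rendering."""
    parser = TagAudit()
    parser.feed(html)
    parser.close()
    return {'bogus_tags': parser.bogus_tags,
            'stray_end_tags': parser.stray_end_tags}
```

An eval test would then compare these counters against a threshold instead of 
running another RE over the message body.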



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
