[Bug 3191] New: Word boundaries are lost after HTML processing

bugzilla-daemon 18 Mar 2004 19:17:48 -0000

http://bugzilla.spamassassin.org/show_bug.cgi?id=3191


           Summary: Word boundaries are lost after HTML processing
           Product: Spamassassin
           Version: 2.63
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P5
         Component: Rules
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


This one's tricky. :)

HTML-escaped characters cause the word boundaries to be lost after the HTML
processor unescapes them. Hard to describe, easy to show. :)

The attached sample message matches:
body ACCZZAGRA /\bZZagr�/i
but not
body ACCZZAGRA /\bZZagr�\b/i

Seems as though the word boundary (\b) is lost after translation.
Putting the accented character in the middle of the word avoids the problem:
body ACCZZALIS /\bZZ�lis/i
and
body ACCZZALIS /\bZZ�lis\b/i
both work properly.
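
One possible explanation, offered as a guess and not a diagnosis: if the
unescaped body is matched byte-wise with ASCII-only word semantics, the
accented character is not a "word" character, so there is no \b between it and
the following space. The Python sketch below reproduces that pattern of
results. It is not SpamAssassin code; the sample text, the matches() helper,
and the choice of byte 0xE1 (a-acute in Latin-1) for the accented character are
all assumptions made for illustration.

```python
import re

# Assumption: the decoded body is matched as Latin-1 bytes, and byte 0xE1
# stands in for the accented character (the exact character is not known).
BODY = b"buy ZZagr\xe1 now, or ZZ\xe1lis today"


def matches(pattern: bytes, text: bytes = BODY) -> bool:
    """Return True if the case-insensitive byte pattern is found in text."""
    return re.search(pattern, text, re.IGNORECASE) is not None


# With bytes patterns, \w covers only [a-zA-Z0-9_], so 0xE1 is a non-word byte.
print(matches(rb"\bZZagr\xe1"))     # no trailing \b
print(matches(rb"\bZZagr\xe1\b"))   # trailing \b after the accented byte
print(matches(rb"\bZZ\xe1lis"))     # accented byte in the middle
print(matches(rb"\bZZ\xe1lis\b"))   # trailing \b after ASCII "s"
```

Under these assumptions only the second pattern fails, which is the same
behaviour as the two ACCZZAGRA rules above. If this is the cause, the word
boundary is not so much lost as never present after a trailing accented
character.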



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
