Chris Santerre <[EMAIL PROTECTED]> wrote:

> > -----Original Message-----
> > From: Bill Larson [mailto:[EMAIL PROTECTED]

> > 1. SpamAssassin reads in the message.
> > 2. It then stores the original message in two variables.
> > 3. In the second variable, remove all punctuation, spaces,
> > specially encoded characters, foreign-language characters, HTML
> > (including HTML comments), and other means of
> > obfuscation.
>
> This will cause other problems.like if people don't space
> properly.Have you seen my pen.Is it on my desk? I cu!NT server
> died today.

Not only that, but some characters should be stripped out, as
in "pe-nis", while others should be converted to other
characters, as in "p3nis" or "penís".  Sometimes "0" is
inserted into the middle of a word, but sometimes it
substitutes for an "o".  Sometimes "." is used to separate
words, and sometimes it interrupts a word -- should it be
stripped out or changed into a space?  Sometimes multiple
characters are substituted for a single letter ("se><").
Sometimes one letter is substituted for another ("PayPaI").
Sometimes the same character is used in different words to
represent different letters ("[EMAIL PROTECTED]", "[EMAIL PROTECTED]").
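
To make the ambiguity concrete, here is a rough Python sketch (not
anything SpamAssassin actually does). The CANDIDATES table and both
function names are made up for illustration. It compares blind
stripping with enumerating every reading a token could have:

```python
import re
from itertools import product

# Hypothetical substitution table, for illustration only: each character
# maps to the strings it might stand for. "" means "strip it out".
CANDIDATES = {
    "0": ["o", ""],        # stands in for "o", or inserted as filler
    "3": ["e"],
    "1": ["i", "l"],
    "!": ["i", "l", ""],
    "@": ["a"],
    "-": [""],
    ".": ["", " "],        # interrupts a word, or separates two words
}


def naive_strip(text):
    """Lowercase the text and drop everything except the letters a-z."""
    return re.sub(r"[^a-z]", "", text.lower())


def readings(token):
    """Return every reading of token that the CANDIDATES table allows."""
    options = [CANDIDATES.get(ch, [ch]) for ch in token.lower()]
    return {"".join(choice) for choice in product(*options)}


if __name__ == "__main__":
    # Blind stripping glues innocent neighbouring words together.
    print(naive_strip("Have you seen my pen.Is it on my desk?"))
    # One token, two plausible readings: stripped or split.
    print(sorted(readings("pen.Is")))
    # "0" as a substitute for "o" versus "0" as inserted filler.
    print(sorted(readings("l0an")))
```

Even with this tiny table, the number of readings multiplies with
every ambiguous character, and a real table would also have to cover
multi-character substitutions such as "><" for "x".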

The solution is anything but simple.

--
Keith C. Ivey <[EMAIL PROTECTED]>
Washington, DC



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
