Re: Scoring for rule SUBJ_ILLEGAL_CHARS

2006-05-13 Thread Kai Schaetzl
Kelson wrote on Fri, 12 May 2006 14:23:55 -0700:

> I count two: The ü in für and the ´ in MODEL´S, which is different from
> the ASCII single quote/apostrophe: '

Ah, you are right, I missed the "ü", it's too "natural" for me. Nevertheless "too many" implies a bit more than *two* for me. I can't

Re: Scoring for rule SUBJ_ILLEGAL_CHARS

2006-05-12 Thread jdow
From: "Kai Schaetzl" <[EMAIL PROTECTED]>

> Theo Van Dinter wrote on Thu, 11 May 2006 13:49:11 -0400:
>
> > fwiw, the 8-bit characters ought to be encoded in base64 or quoted-printable.
> > then the rule wouldn't hit.
>
> I just found the same problem here with a whole bunch of messages coming from the same

Re: Scoring for rule SUBJ_ILLEGAL_CHARS

2006-05-12 Thread Kelson
Kai Schaetzl wrote:

> The subject line hitting in the case of our customer was:
> Bewerbung für INS-2006-05-4, "MODEL´S GESUCHT!!!"
> I can identify only one character that is outside the ASCII range.

I count two: The ü in für and the ´ in MODEL´S, which is different from the ASCII single quote
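The count discussed above can be checked mechanically. The following is a minimal sketch, assuming the subject line is available as a decoded Unicode string; the helper name `non_ascii_chars` is hypothetical, and this is not how SpamAssassin itself evaluates SUBJ_ILLEGAL_CHARS.

```python
# Subject line quoted in the thread, as a Unicode string.
subject = 'Bewerbung für INS-2006-05-4, "MODEL´S GESUCHT!!!"'


def non_ascii_chars(text):
    """Return the characters in text that fall outside the 7-bit ASCII range."""
    return [ch for ch in text if ord(ch) > 127]


found = non_ascii_chars(subject)
print(found)       # ['ü', '´']
print(len(found))  # 2
```

The ´ (U+00B4, acute accent) is reported alongside the ü because it is a different character from the ASCII apostrophe (U+0027).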

Re: Scoring for rule SUBJ_ILLEGAL_CHARS

2006-05-12 Thread Kai Schaetzl
Theo Van Dinter wrote on Thu, 11 May 2006 13:49:11 -0400:

> fwiw, the 8-bit characters ought to be encoded in base64 or quoted-printable.
> then the rule wouldn't hit.

I just found the same problem here with a whole bunch of messages coming from the same source. It seems the rule hits on *one*
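To illustrate the encoding Theo refers to, here is a minimal sketch of what a sending program could do, using the `email.header` module from Python's standard library. The choice of ISO-8859-1 as the charset is an assumption; the thread does not say which charset the original messages used.

```python
from email.header import Header

# Subject line quoted in the thread, containing two 8-bit characters.
subject = 'Bewerbung für INS-2006-05-4, "MODEL´S GESUCHT!!!"'

# Encode as RFC 2047 "encoded words". For ISO-8859-1 (assumed here),
# Python's email package uses quoted-printable ("q") header encoding.
encoded = Header(subject, charset='iso-8859-1').encode()

# The result contains only 7-bit ASCII, so no raw 8-bit bytes
# remain in the Subject header.
print(encoded)
print(all(ord(ch) < 128 for ch in encoded))  # True
```

With the header encoded this way, the raw 8-bit characters no longer appear in the subject, which is the condition under which, per the remark above, the rule would not hit.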

Re: Scoring for rule SUBJ_ILLEGAL_CHARS

2006-05-11 Thread Theo Van Dinter
On Thu, May 11, 2006 at 07:47:15PM +0200, Keith Dunnett wrote:

> I've recently had a couple of false positives caused by this rule, and think
> it may be scored too highly for a single check. The e-mails in question were
> in Spanish, and the Spanish word for linguistics has two accented characters

Scoring for rule SUBJ_ILLEGAL_CHARS

2006-05-11 Thread Keith Dunnett
I've recently had a couple of false positives caused by this rule, and think it may be scored too highly for a single check. The e-mails in question were in Spanish, and the Spanish word for linguistics has two accented characters which is enough to trigger this rule. Admittedly, the blacklists a