On Wed, 2010-05-26 at 10:35 +1200, Jason Haar wrote:
> On 05/26/2010 05:24 AM, Karsten Bräckelmann wrote:
> > Unfortunately, in this case, the fact that it isn't a proper, raw
> > message is not irrelevant. The ok_locales setting, which is part of your
> > original question, depends on the char-set used. Which is missing from
> > the sample. We can only assume it was a UTF-8 encoded HTML document.
> 
> Even that is a legitimate corner case. What does SA do with a UTF-8
> email where that charset isn't explicitly mentioned, but the

Not as far as ok_locales and the respective CHARSET_FARAWAY rules are
concerned, IIRC. They were written long ago to trigger on the char-sets
declared in the message. They don't detect the char-set based on the
actual payload.
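
To illustrate the distinction, here is a minimal sketch in Python (not
SA's Perl, and not how SpamAssassin actually implements these rules). The
OK_CHARSETS allow-list and both function names are hypothetical. It only
shows that a check keyed on the declared char-set has nothing to match
when no charset parameter is present, whatever the bytes in the body are.

```python
from email import message_from_bytes

# Hypothetical allow-list for illustration; NOT SpamAssassin's real table.
OK_CHARSETS = {"us-ascii", "iso-8859-1", "windows-1252"}


def declared_charset(raw: bytes):
    """Return the charset named in the Content-Type header, or None."""
    return message_from_bytes(raw).get_content_charset()


def faraway_by_declaration(raw: bytes) -> bool:
    """True only if a charset is declared and it is outside OK_CHARSETS."""
    charset = declared_charset(raw)
    return charset is not None and charset not in OK_CHARSETS


# Greek sample text ("Kalimera"), written with escapes to stay ASCII-safe.
greek = "\u039a\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1"

# UTF-8 Greek body, 8bit transfer encoding, no charset parameter at all.
undeclared = (
    b"Content-Type: text/html\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n" + greek.encode("utf-8")
)

# Same text, but with the char-set explicitly declared in the header.
declared = (
    b"Content-Type: text/html; charset=ISO-8859-7\r\n"
    b"\r\n" + greek.encode("iso-8859-7")
)

print(declared_charset(undeclared), faraway_by_declaration(undeclared))
print(declared_charset(declared), faraway_by_declaration(declared))
```

The first message never trips the declaration-based check, although its
payload is clearly Greek. Catching it would need a check that inspects
the body bytes themselves.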

> Content-Transfer-Encoding: is set to "8bit"? I think that is non-RFC
> compliant, but I also know that Thunderbird resolves it just fine (not
> that it should have) - so it's a "legitimate" way for a spammer to send spam.
> 
> Here's a link to the Greek one I got recently. UTF8, Greek and yet
> FARAWAY didn't trigger (I have "ok_locales en"). I even have TextCat
> enabled (didn't work for this email) - but I don't think it's used by
> the charset stuff anyway?

Yup, these are entirely unrelated.


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
