On Tue, 2010-05-25 at 12:51 -0400, Jason Bertoch wrote:
> On 2010/05/25 10:48 AM, Karsten Bräckelmann wrote:
> > It is not a FN. It isn't even a proper message.
> >
> > That's some headers, plus a screen-scraped, rendered version of the
> > message, including the most common headers displayed to the user.
> >
> > Without a RAW sample, there isn't much we could do. And no, the results
> > posted by some helpful folks are irrelevant, cause they fed SA that
> > non-sample.
>
> Wow, Karsten, you've been fiery lately, both here and the Clam list!

Yeah, I guess I was. Sorry, Jason, I should cool down a bit.

But on the clamav list? Oh, come on...

> The message posted to pastebin was the best I could manage based on how
> it came to me. The fact that it isn't a raw message is irrelevant to my
> initial question; it was posted for reference only. I'm more concerned
> with language detection, or the lack of it, really.

Unfortunately, in this case, the fact that it isn't a proper, raw message
is not irrelevant. The ok_locales setting, which is part of your original
question, depends on the charset used, and that is missing from the
sample. We can only assume it was a UTF-8 encoded HTML document.

The most important reason for my post is in its last paragraph. At first
glance, the pastebin *appears* to be a message, so a couple of guys ran it
through SA locally, in order to help and show which rules triggered on
their end. That, of course, is wasted effort and yields confusing results.
Did you notice how Charles discusses the HTML rules triggered, and wonders
how SA decided a large font was used in the spam?

I definitely could have phrased it better, but the point is valid. The
parts of this thread regarding local rules, how to improve your result,
and catching this FN just do not apply to the real spam.

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}