Re[4]: [sniffer] Charset

2004-08-20 Thread Pete McNeil
On Friday, August 20, 2004, 2:35:35 AM, Michiel wrote: MP> Pete, even your message had a charset header: MP> Content-Type: text/plain; charset=us-ascii Yes, a tricky gadget indeed. MP> I think you'll generate more FP's if you do something like that than FN's MP> you might have now. Aren't there

Re: [sniffer] Charset

2004-08-20 Thread Jorge Asch
Just to be clear - we're not precisely talking about spam per se. Rather we're talking about stating that all traffic on a particular system should be only in one language as a matter of policy... Well, since 100% of my users speak English/Spanish I can safely bet that NONE of my mail should h
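
To make the policy idea above concrete, here is a minimal Python sketch of a charset allow-list check. It is not how Message Sniffer works; the ALLOWED_CHARSETS set and the function name charset_violates_policy are assumptions made up for illustration, and the replies in this thread explain why a rule like this can cause false positives.

```python
from email import message_from_string

# Hypothetical allow-list for an English/Spanish-only site (an assumption,
# not a recommendation from the thread).
ALLOWED_CHARSETS = {"us-ascii", "iso-8859-1", "iso-8859-15", "windows-1252", "utf-8"}


def charset_violates_policy(raw_message: str) -> bool:
    """Return True if any text part declares a charset outside the allow-list."""
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # get_content_charset() returns the declared charset in lower case,
        # or None when the header has no charset parameter.
        charset = part.get_content_charset()
        if charset is not None and charset not in ALLOWED_CHARSETS:
            return True
    return False
```

A result of True would probably be better used as one weight among several than as a hard reject.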

Re: [sniffer] Charset

2004-08-20 Thread Vivek Khera
On Aug 20, 2004, at 10:36 AM, Jorge Asch wrote: Well, since 100% of my users speak English/Spanish I can safely bet that NONE of my mail should have strange character sets. So I can assume if they do, they must be spam. Be careful about that. I've gotten pure English email from folks in various

Re: [sniffer] Charset

2004-08-20 Thread Scott Fisher
A troublesome one for me was Chinese, the GB2312 character set. I started weighting based on charset=GB2312 and started noticing legitimate e-mail in English from users/computers in China using the GB2312 character set. The characters a-z,A-Z are the same in the GB2312 character set. So just becaus
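
The GB2312 point can be checked directly: a message may declare charset=GB2312 while its body is plain ASCII. Below is a rough Python sketch, assuming a single-part message with no Content-Transfer-Encoding; the function name declared_charset_and_ascii_only is hypothetical. It reports the declared charset alongside whether the body bytes are actually ASCII-only, so a filter could weight on the bytes rather than on the header alone.

```python
from email import message_from_bytes


def declared_charset_and_ascii_only(raw: bytes):
    """Return (declared charset or None, True if the body is pure ASCII)."""
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset()
    # For a multipart message get_payload(decode=True) returns None;
    # this sketch only handles the single-part case.
    payload = msg.get_payload(decode=True) or b""
    return charset, payload.isascii()


# ASCII letters encode to the same bytes under Python's gb2312 codec.
english = b"Content-Type: text/plain; charset=GB2312\n\n" + "Hello".encode("gb2312")
chinese = b"Content-Type: text/plain; charset=GB2312\n\n" + "\u4f60\u597d".encode("gb2312")

print(declared_charset_and_ascii_only(english))
print(declared_charset_and_ascii_only(chinese))
```

Both messages carry the same header; only the second actually contains non-ASCII bytes.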

Re: Re[4]: [sniffer] Charset

2004-08-20 Thread Scott Fisher
-Mad, How well set up is Message Sniffer to determine if an e-mail in a foreign language is spam and then code for it? I dutifully submit my Spanish spam to the spam at sortmonster.com address. It's a very, very small percentage of my overall spam, but it consistently lands in my battleground grey-weig

Re: [sniffer] Charset

2004-08-20 Thread Vivek Khera
On Aug 20, 2004, at 11:53 AM, Scott Fisher wrote: Language based spam - filtering is a tough nut. There are some very good language classifiers out there. SpamAssassin uses one which seems to be incredibly accurate given enough text.

Re[2]: [sniffer] Charset

2004-08-20 Thread Pete McNeil
On Friday, August 20, 2004, 11:20:44 AM, Vivek wrote: VK> On Aug 20, 2004, at 10:36 AM, Jorge Asch wrote: >> Well, since 100% of my users speak english/spanish I can safely bet >> that NONE of my mail should have strange character sets. So I can >> assume if they do, they must be spam. VK> Be

Re[6]: [sniffer] Charset

2004-08-20 Thread Pete McNeil
On Friday, August 20, 2004, 12:01:31 PM, Scott wrote: SF> -Mad, SF> How set up is Message Sniffer to determine if an e-mail in a foreign SF> language is spam and then code for it. SF> I dutifully submit my Spanish spam to the spam at sortmonster.com address. SF> It's a very, very small percentage

Re: Re[2]: [sniffer] Charset

2004-08-20 Thread Scott Fisher
We don't want any violent Mad Scientists! <<< [EMAIL PROTECTED] 8/20 11:59a >>> On Friday, August 20, 2004, 11:20:44 AM, Vivek wrote: VK> On Aug 20, 2004, at 10:36 AM, Jorge Asch wrote: >> Well, since 100% of my users speak english/spanish I can safely bet >> that NONE of my mail should have s