RE: Re[2]: [sniffer] Charset
Pete, even your message had a charset header:

Content-Type: text/plain; charset=us-ascii

I think you'll generate more FPs if you do something like that than the FNs you might have now. Aren't there SpamAssassin config files that detect this spam?

Kind regards,

ing. Michiel Prins
SOS Small Office Solutions / REJECT
Wannepad 27
1066 HW Amsterdam
tel. 020-4082627
fax. 020-4082628
[EMAIL PROTECTED]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pete McNeil
Sent: Friday, August 20, 2004 4:58
To: Jorge Asch
Subject: Re[2]: [sniffer] Charset

On Thursday, August 19, 2004, 10:45:37 PM, Jorge wrote:

JA> Could a filter be created that will tag as spam any messages
JA> containing NON-ascii characters? I mean allow only CHRS 1 through 255.
JA> I believe this will filter out all these foreign character sets, and
JA> let regular old and plain messages through...
JA> Of course such a rule will only apply for most of us on the western
JA> hemisphere...

In theory this could be done, but it would be a tricky gadget - probably best done as something programmatic... There are a lot of opportunities for false positives. I will think about this...

Then again - why not simply block on anything that says charset= ? If it's plain old ascii, then there's no need for charset. (Lots of FPs with this, but then I would never use a filter like that...) It might be very close to what you are looking for.

The other way to do it would be to build patterns that match all of the known character sets -- or at least the majority. That would be a chunk of work but doable - especially with a few well placed wildcards and a good comprehensive list.

_M

This E-Mail came from the Message Sniffer mailing list. For information and (un)subscription instructions go to http://www.sortmonster.com/MessageSniffer/Help/Help.html
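As a rough illustration of the two checks discussed in this message (looking for a declared charset and looking for bytes outside plain ASCII), here is a minimal Python sketch. It is not how Message Sniffer works; the function names and the sample message are made up for the example, and it relies only on the standard library `email` package. Note that 7-bit ASCII covers byte values 0 through 127, and that a raw-byte check will miss non-ASCII text hidden inside base64 or quoted-printable encoding.

```python
from email import message_from_bytes


def declared_charsets(raw: bytes) -> set:
    """Return the lowercase charset names declared in any MIME part."""
    msg = message_from_bytes(raw)
    found = set()
    for part in msg.walk():
        charset = part.get_content_charset()  # None when no charset= is present
        if charset:
            found.add(charset)
    return found


def has_8bit_bytes(raw: bytes) -> bool:
    """True if any byte falls outside 7-bit ASCII (0 through 127)."""
    return any(byte > 127 for byte in raw)


# Hypothetical sample message, modelled on the header quoted above.
SAMPLE = (
    b"From: sender@example.com\r\n"
    b"Subject: hello\r\n"
    b"Content-Type: text/plain; charset=us-ascii\r\n"
    b"\r\n"
    b"Plain old English text.\r\n"
)

print(declared_charsets(SAMPLE))  # {'us-ascii'}
print(has_8bit_bytes(SAMPLE))     # False
```

The sample shows the false-positive risk raised above: a perfectly ordinary ASCII message still carries a charset= declaration, so a rule that fires on the mere presence of charset= would catch it.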
Re[4]: [sniffer] Charset
On Friday, August 20, 2004, 2:35:35 AM, Michiel wrote:

MP> Pete, even your message had a charset header:
MP> Content-Type: text/plain; charset=us-ascii

Yes, a tricky gadget indeed.

MP> I think you'll generate more FP's if you do something like that than FN's
MP> you might have now. Aren't there spamassassin config files that detect this
MP> spam?

Just to be clear - we're not precisely talking about spam per se. Rather we're talking about stating that all traffic on a particular system should be only in one language as a matter of policy... The distinction is small I suppose, but in my mind important.

In filtering spam we're usually trying to target only messages that are unsolicited commercial email, pornography, or somehow harmful... With this other approach, instead of trying to defeat what we don't want, we are trying to only accept what we do want... Not so much putting up blocks, more like putting up a huge block and punching holes.

There are some SA filters that do this kind of thing...

Ultimately I think it boils down to filtering out anything with a charset that is not wanted. If we achieve this by attrition (rather than attempting to capture all of the charsets at once) then we will achieve a strong result quickly at a relatively low cost and we might avoid potential false positives that are out there.

MHO,
_M
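The difference between "a huge block with holes punched in it" and blocking unwanted charsets by attrition can be sketched roughly as follows in Python. Both charset lists are hypothetical placeholders chosen for illustration, not rules from Message Sniffer or SpamAssassin, and treating a missing charset as acceptable is an assumption of the sketch.

```python
# Hypothetical policy lists; a real site would choose its own.
ALLOWED = frozenset({"us-ascii", "iso-8859-1", "windows-1252", "utf-8"})
BLOCKED = {"koi8-r", "gb2312", "euc-kr"}  # grown one entry at a time


def allowlist_verdict(charset):
    """Huge block, punched holes: reject anything not explicitly allowed."""
    if charset is None:  # no charset declared; assumed to be plain ASCII
        return "accept"
    return "accept" if charset.lower() in ALLOWED else "reject"


def blocklist_verdict(charset):
    """Attrition: reject only charsets that have been added to the list."""
    if charset is None:
        return "accept"
    return "reject" if charset.lower() in BLOCKED else "accept"


for name in ("US-ASCII", "KOI8-R", "big5"):
    print(name, allowlist_verdict(name), blocklist_verdict(name))
```

A charset that nobody has listed yet (big5 in this sketch) is rejected by the allowlist but passes the blocklist, which is why the attrition approach starts out with fewer false positives and more misses.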
Re: [sniffer] Charset
On Aug 20, 2004, at 10:36 AM, Jorge Asch wrote:

> Well, since 100% of my users speak English/Spanish I can safely bet
> that NONE of my mail should have strange character sets. So I can
> assume if they do, they must be spam.

Be careful about that. I've gotten pure English email from folks in various parts of the world whose default character set was other than one I'd expect. Charset != Language.
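A small Python sketch of why "Charset != Language" holds: most common charsets share the ASCII range, so pure English text produces exactly the same bytes whichever of them the sender's mail client happens to declare. The sample sentence, the charset list, and the helper name are assumptions made for the example.

```python
ENGLISH = "Please find the quarterly report attached."
CHARSETS = ("us-ascii", "iso-8859-1", "koi8-r", "gb2312", "euc-kr", "utf-8")


def same_bytes_in_all(text, charsets):
    """True if the text encodes to identical bytes in every listed charset."""
    encodings = {text.encode(name) for name in charsets}
    return len(encodings) == 1


# English-only text is byte-for-byte identical under all of these labels,
# so the declared charset says nothing about the language of the body.
print(same_bytes_in_all(ENGLISH, CHARSETS))  # True

# Text with accented characters does depend on the charset.
print(same_bytes_in_all("señor", ("iso-8859-1", "utf-8")))  # False
```

So a message labelled koi8-r or gb2312 may still be plain English, and a rule keyed on the label alone would reject it.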
Re: Re[4]: [sniffer] Charset
-Mad,

How set up is Message Sniffer to determine if an e-mail in a foreign language is spam and then code for it?

I dutifully submit my Spanish spam to the spam at sortmonster.com address. It's a very, very small percentage of my overall spam, but it consistently lands in my battleground grey-weight ranges.

I only ask because I have seen the amount of non-English spam trending upwards. I've noticed spam here in Russian, German, Spanish, Korean, Portuguese and Chinese.

----- Original Message -----
From: Pete McNeil [EMAIL PROTECTED]
To: Michiel Prins [EMAIL PROTECTED]
Sent: Friday, August 20, 2004 7:04 AM
Subject: Re[4]: [sniffer] Charset

> On Friday, August 20, 2004, 2:35:35 AM, Michiel wrote:
>
> MP> Pete, even your message had a charset header:
> MP> Content-Type: text/plain; charset=us-ascii
>
> Yes, a tricky gadget indeed.
>
> MP> I think you'll generate more FP's if you do something like that than FN's
> MP> you might have now. Aren't there spamassassin config files that detect this
> MP> spam?
>
> Just to be clear - we're not precisely talking about spam per se. Rather
> we're talking about stating that all traffic on a particular system should
> be only in one language as a matter of policy... The distinction is small I
> suppose, but in my mind important.
>
> In filtering spam we're usually trying to target only messages that are
> unsolicited commercial email, pornography, or somehow harmful... With this
> other approach, instead of trying to defeat what we don't want, we are
> trying to only accept what we do want... Not so much putting up blocks,
> more like putting up a huge block and punching holes.
>
> There are some SA filters that do this kind of thing...
>
> Ultimately I think it boils down to filtering out anything with a charset
> that is not wanted. If we achieve this by attrition (rather than attempting
> to capture all of the charsets at once) then we will achieve a strong
> result quickly at a relatively low cost and we might avoid potential false
> positives that are out there.
>
> MHO,
> _M
Re[6]: [sniffer] Charset
On Friday, August 20, 2004, 12:01:31 PM, Scott wrote:

SF> -Mad,
SF> How set up is Message Sniffer to determine if an e-mail in a foreign
SF> language is spam and then code for it?
SF> I dutifully submit my Spanish spam to the spam at sortmonster.com address.
SF> It's a very, very small percentage of my overall spam, but it consistently
SF> lands in my battleground grey-weight ranges.
SF> I only ask because I have seen the amount of non-English spam trending
SF> upwards. I've noticed spam here in Russian, German, Spanish, Korean,
SF> Portuguese and Chinese.

So far, so good. Most of the time we are able to recognize and tag appropriate elements in these messages and create appropriate rules.

Sometimes this requires a bit of interpretation... when we feel we have a problem with something we reach for Babel Fish or use some of our internal abilities (Gonzo does pretty well with German, we all can take a stab at Spanish from time to time...)

Most spam takes a similar form no matter what language - so most of the time we can get by with architectural features and the other research tools we have. (Our robots add a lot of data to SPHUD and often grab critical elements of spam on their way in...)

Hope this helps,
_M
Re: Re[2]: [sniffer] Charset
We don't want any violent Mad Scientists!

[EMAIL PROTECTED] 8/20 11:59a

On Friday, August 20, 2004, 11:20:44 AM, Vivek wrote:

VK> On Aug 20, 2004, at 10:36 AM, Jorge Asch wrote:
VK>> Well, since 100% of my users speak English/Spanish I can safely bet
VK>> that NONE of my mail should have strange character sets. So I can
VK>> assume if they do, they must be spam.

VK> Be careful about that. I've gotten pure English email from folks in
VK> various parts of the world whose default character set was other than
VK> one I'd expect. Charset != Language.

Along these lines, I saw spam today that was in English but used one of the character sets that were recently blocked by request (only locally - no such thing will happen in the core system, so nobody has to worry).

I violently agree - blocking on character sets can be dangerous, so if you request these rules to be added, be sure you watch for unexpected false positives afterward.

;-)
_M