Re: filtering on language
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hello Thomas! On Mittwoch, 21. November 2001 at 04:56:39 you wrote: No, I just received a spam message about a diet. No sex was required, so I deleted it. You lost the bet. You must have overlooked something. Any diet without sex isn't worth a mention ... You are right. 50% of my private messages are about either sex or money, or the combination of both. Bad count, but it shows that most of us don't have a life. Well, none worth living. The correct ratio for sex-or-money messages to non-sex-or-money* should be at least 9:1. Just a hint: sex in Turkish is sex, like in most other languages I know. I just took the examples given. *This is a non-exclusive or opposed to exclusive either-or. - -- Dierk Haasis http://www.Write4U.de PGP keys available: mailto:[EMAIL PROTECTED]?Subject=SendMyPGPkeys The Bat 1.54/10e on Windows 95 4.0 67306684 C The whole problem with the world is that fools and fanatics are always so certain of themselves, but wiser people so full of doubts. (Bertrand Russell) -BEGIN PGP SIGNATURE- Version: PGP 6.5.8ckt Comment: Privacy is the core element to Freedom! iQA/AwUBO/tYK/To1oA8g8dLEQJJGACgwZxYjIOJnfWUOWrqYu8BgDeg6TgAoK79 kw4h6P3+AcP2l9gX4KIeQ2QM =vQOl -END PGP SIGNATURE- -- Archives : http://tbudl.thebat.dutaint.com Moderators : mailto:[EMAIL PROTECTED] TBTech List: mailto:[EMAIL PROTECTED] Unsubscribe: mailto:[EMAIL PROTECTED] Latest Vers: 1.53d FAQ: http://faq.thebat.dutaint.com
filtering on language
Hello TB! Listers.

I've been getting a lot of msgs like the one below; it's really annoying.

-- Original Message Starts --
www.internethaber.com yilin en iyi ancorhman_n_ belirliyor. Siz de oy kullanarak tercihinizi koyun. www.internethaber.com ve www.gazeteoku.com ile biz haberin tekelini kirdik; gelin siz de bu siteleri ziyaret ederek, bu tekeli kirin.
-- Original Message Ends --

Can anyone imagine a built in macro that identifies language used, i.e. %IF not %English? Does anyone think this is even possible?

--
Jan Rifkinson Ridgefield, CT USA
TB! V1.54/10/W2K_SP2/PGP Key ID: 0x3F14A060 ICQ 41116329
Re: filtering on language
Hello Jan,

On Tuesday, 20 November 2001 at 14:02:14 you wrote (at least in part):

JR Can anyone imagine a built in macro that identifies
JR language used, i.e. %IF not %English? Does anyone think
JR this is even possible?

1.) No
2.) Define 'English', and on what basis '%English' should be set :-)

Now you should know why: 'No' :-)

--
Regards
Peter Palmreuther mailto:[EMAIL PROTECTED]
(The Bat! v1.54/10e on Windows NT 5.0 Build 2195 Service Pack 2)
Let's start a new country up
Re: filtering on language
Hello Jan,

On Tue, 20 Nov 2001 08:02:14 -0500 GMT (20/11/2001, 21:02 +0800 GMT), Jan Rifkinson wrote:

JR Can anyone imagine a built in macro that identifies
JR language used, i.e. %IF not %English? Does anyone think
JR this is even possible?

You can build a RegEx that looks for the encoding.

--
Cheers, Thomas.
Moderator of the German The Bat! Beginner list.
Outside a disco: SMARTS IS THE MOST EXCLUSIVE DISCO IN TOWN. EVERYONE WELCOME.
Message reply created with The Bat! 1.54/10 under Chinese Windows 98 4.10 Build A using an AMD Athlon K7 1.2GHz, 128MB RAM
Re: filtering on language
Hello TB! List.

On Tue, 20 Nov 2001 at 14:29 GMT +0100 (11/20/2001 8:29 AM where I live) [EMAIL PROTECTED] [Peter] wrote to [EMAIL PROTECTED] re: 'filtering on language':

Peter 2.) Define 'English' and on what basis setting
Peter '%English' should be [...]

Well, I'm not as technically oriented as you are, but I would think it could be related to the %language macro that already exists.

--
Jan Rifkinson Ridgefield, CT USA
TB! V1.54/10/W2K_SP2/PGP Key ID: 0x3F14A060 ICQ 41116329
Re: filtering on language
Hello Jan,

On Tue, 20 Nov 2001 09:04:32 -0500 GMT (20/11/2001, 22:04 +0800 GMT), Jan Rifkinson wrote:

Peter 2.) Define 'English' and on what basis setting
Peter '%English' should be [...]

JR Well I'm not as technically oriented as you are but I
JR would think it could be related to the %language macro
JR that already exists.

That macro chooses the dictionary you use for writing the message or reply.

--
Cheers, Thomas.
Moderator of the German The Bat! Beginner list.
One does not speak at full moon.
Message reply created with The Bat! 1.54/10 under Chinese Windows 98 4.10 Build A using an AMD Athlon K7 1.2GHz, 128MB RAM
Re: filtering on language
Hello Thomas.

At 8:59 AM on Tuesday, November 20, 2001 you wrote the following on the posted subject 'filtering on language':

JR Can anyone imagine a built in macro that identifies
JR language used, i.e. %IF not %English? Does anyone think
JR this is even possible?

Thomas You can build a RegEx that looks for the encoding.

I'm not sure what this means but since it deals with RegExp, I should move it to TBTech. Thanks, Thomas.

--
Jan Rifkinson Ridgefield, CT USA
TB! V1.54/10/W2K_SP2/PGP Key ID: 0x3F14A060 ICQ 41116329
Re: filtering on language
Hello Thomas.

At 9:10 AM on Tuesday, November 20, 2001 you wrote the following on the posted subject 'filtering on language':

Thomas That macro will choose the dictionary you use for writing the message
Thomas or reply.

Yes, I understand, which is why I used the word 'related'. I guess I wasn't clear because I was thinking along the lines of:

%IF %TEXT does not = %language=English (it looked thru the dictionary), then Take an action. ???

--
Jan Rifkinson Ridgefield, CT USA
TB! V1.54/10/W2K_SP2/PGP Key ID: 0x3F14A060 ICQ 41116329
Re: filtering on language
Hello Jan,

On Tuesday, 20 November 2001 at 15:19:38 you wrote (at least in part):

JR I guess I wasn't clear because I was thinking along the lines of:
JR %IF %TEXT does not = %language=English (it looked thru the
JR dictionary), then Take an action. ???

I think this ain't as clear as you wanted it. What action (e.g.) should be taken? Do you want the macro in filters, to scan incoming mail? In a template? Something else?

--
Regards
Peter Palmreuther mailto:[EMAIL PROTECTED]
(The Bat! v1.54/10e on Windows NT 5.0 Build 2195 Service Pack 2)
I'm not sure if life is trying to pass me by or run me over!
Re: filtering on language
Hello Jan,

On Tue, 20 Nov 2001 09:06:53 -0500 GMT (20/11/2001, 22:06 +0800 GMT), Jan Rifkinson wrote:

Thomas You can build a RegEx that looks for the encoding.

JR I'm not sure what this means but since it deals with
JR RegExp, I should move it to TBTech.

Yes, but only when it gets technical. <g>

The way you do it is that the filter looks for the Content-encoding header or any header line that contains 8859-x (substitute the Turkish encoding suffix for the x) or whatever else you can identify in these messages. How you do that with RegEx, now *that* is for TBTECH.

Be careful though: it will catch all messages with that encoding, including legitimate ones.

JR Thanks, Thomas.

Don't mention it. FWIW I don't use any spam filters at all, because I get a lot more carbon-based spam in my snail mail inbox than e-spam. ;-)

--
Cheers, Thomas.
Moderator of the German The Bat! Beginner list.
Furthermore, my nerves are shot and I am dealing with a severe case of castritis.
Message reply created with The Bat! 1.54/10 under Chinese Windows 98 4.10 Build A using an AMD Athlon K7 1.2GHz, 128MB RAM
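For illustration only, here is roughly what such a header check could look like, written in Python rather than in The Bat!'s own filter syntax. The charset normally shows up as a charset= parameter on the Content-Type header line. The sketch assumes the Turkish charsets of interest are ISO-8859-9 and windows-1254; the pattern and the function name are made up for the example.

```python
import re

# Assumption: ISO-8859-9 (Latin-5) and windows-1254 are the charsets
# most likely to be declared by Turkish-language mail.
TURKISH_CHARSET = re.compile(
    r'charset\s*=\s*"?(iso-8859-9|windows-1254)\b',
    re.IGNORECASE,
)


def declares_turkish_charset(raw_headers: str) -> bool:
    """Return True if any header line declares one of the charsets above."""
    return TURKISH_CHARSET.search(raw_headers) is not None
```

As the warning above says, a match only tells you which character set the sender's mailer declared, not which language the text is in, so legitimate mail can match too.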
Re: filtering on language
Hello Jan,

On Tue, 20 Nov 2001 09:19:38 -0500 GMT (20/11/2001, 22:19 +0800 GMT), Jan Rifkinson wrote:

JR %IF %TEXT does not = %language=English (it looked
JR thru the dictionary), then Take an action. ???

Let me assume you are not a programmer. ;-) What you suggest is technically possible but would mean such an overhead that it is impractical for our use. I am sure that intelligence agencies use this approach, though.

--
Cheers, Thomas.
Moderator of the German The Bat! Beginner list.
Salmon day: Swimming upstream all day to get screwed in the end.
Message reply created with The Bat! 1.54/10 under Chinese Windows 98 4.10 Build A using an AMD Athlon K7 1.2GHz, 128MB RAM
Re: filtering on language
- Original Message -
From: Jan Rifkinson [EMAIL PROTECTED]
To: TB! UDL [EMAIL PROTECTED]
Sent: 20 November 2001 1:02 pm
Subject: filtering on language

Hello TB! Listers. I've been getting a lot of msgs like the one below; it's really annoying.

-- Original Message Starts --
www.internethaber.com yilin en iyi ancorhman_n_ belirliyor. Siz de oy kullanarak tercihinizi koyun. www.internethaber.com ve www.gazeteoku.com ile biz haberin tekelini kirdik; gelin siz de bu siteleri ziyaret ederek, bu tekeli kirin.
-- Original Message Ends --

Can anyone imagine a built in macro that identifies language used, i.e. %IF not %English? Does anyone think this is even possible?

It's certainly possible through some sort of dictionary comparison, but that would probably be fatally slow on modest PCs (imagine checking a 2,000-word email :)

There may be more clever statistical methods - the above is Turkish, and it's pretty obvious that the relative frequency of various letters (e.g. z and i) is entirely different from that of English - but more similar languages (e.g. two Indo-European ones ;) might not be so simple to differentiate. I am not a professional linguist so I can't comment definitively.

Alastair
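To make the letter-frequency idea a bit more concrete, here is a minimal sketch in Python. It is not anything The Bat! offers; the function names are invented for the example, and a real language guesser would need reference profiles built from far more text than a single message.

```python
from collections import Counter
from typing import Dict


def letter_frequencies(text: str) -> Dict[str, float]:
    """Relative frequency of each letter in text, ignoring case and non-letters."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return {}
    counts = Counter(letters)
    return {ch: n / len(letters) for ch, n in counts.items()}


def frequency_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Sum of absolute differences between two profiles (0.0 means identical)."""
    return sum(abs(a.get(ch, 0.0) - b.get(ch, 0.0)) for ch in set(a) | set(b))
```

The idea would be to compare the profile of an incoming message against a stored English profile and treat a large distance as "probably not English". How large is large enough would have to be found by experiment.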
Re: filtering on language
Hello Alastair,

On Tue, 20 Nov 2001 14:29:05 - GMT (20/11/2001, 22:29 +0800 GMT), Alastair Scott wrote:

AS There may be more clever statistical methods - the above is Turkish, and
AS it's pretty obvious the relative frequency of various letters (eg z and
AS i) is entirely different from that of English -

This would be difficult to implement in a TB filter. But I just had an idea: you can filter for certain words that are likely to occur in most Turkish-language spam, such as siteler (web sites). You can also use other simple words from the Turkish language. Without a scoring mechanism (i.e. if just one of those five or ten words is found, it's a hit) you can make your own very simple language parser in the form of a TB filter.

--
Cheers, Thomas.
Moderator of the German The Bat! Beginner list.
Analogies in writing are like feathers on a snake.
Message reply created with The Bat! 1.54/10 under Chinese Windows 98 4.10 Build A using an AMD Athlon K7 1.2GHz, 128MB RAM
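A rough sketch of that "any one word is a hit" filter, in Python rather than as an actual TB filter. Apart from siteler, the trigger words are simply lifted from the sample spam quoted at the start of this thread, so treat the list as a placeholder; the function name is also just for illustration.

```python
import re

# Placeholder list: "siteler" comes from the suggestion above, the rest
# are words that appear in the sample message quoted in this thread.
TRIGGER_WORDS = {"siteler", "siteleri", "haberin", "ederek", "kullanarak"}


def is_hit(body: str) -> bool:
    """True if the body contains at least one trigger word as a whole word."""
    words = set(re.findall(r"[^\W\d_]+", body.lower()))
    return not TRIGGER_WORDS.isdisjoint(words)
```

Matching on whole words rather than substrings keeps very short triggers from firing inside unrelated English words, but with no scoring a single unlucky word is still enough to trash a message.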
Re: filtering on language
- Original Message -
From: Thomas F [EMAIL PROTECTED]
To: Alastair Scott on TBUDL [EMAIL PROTECTED]
Sent: 20 November 2001 2:49 pm
Subject: Re: filtering on language

Hello Alastair, On Tue, 20 Nov 2001 14:29:05 - GMT (20/11/2001, 22:29 +0800 GMT), Alastair Scott wrote:

AS There may be more clever statistical methods - the above is Turkish, and
AS it's pretty obvious the relative frequency of various letters (eg z and
AS i) is entirely different from that of English -

This would be difficult to implement in a TB filter. But I just had an idea: You can actually filter for certain words that are likely to occur in most Turkish-language spams, such as siteler (web sites), for example. You can also use other simple words from the Turkish language. Without a scoring mechanism - i.e. just if one of those five or ten words is found, it's a hit - make your own, very simple, language parser in the form of a TB filter.

That would work - translations of sex and money would probably catch 95 per cent of spam ;)

The frequency analysis is actually very subtle - two other languages which have lots of zs that come to mind are German and Polish. The huge mass of rules needed to differentiate one language from another would probably be just as slow as the dictionary lookup.

Alastair
Re: filtering on language
Hello Alastair!

On Tuesday, 20 November 2001 at 15:29:05 you wrote:

There may be more clever statistical methods - the above is Turkish, and it's pretty obvious the relative frequency of various letters (eg z and i) is entirely different from that of English - but more similar languages (eg two Indo-European ones ;) might not be so simple to differentiate. I am not a professional linguist so I can't comment definitively.

You mean, for instance, Faeroese and Pashto?

--
Dierk Haasis
http://www.Write4U.de
PGP keys available: mailto:[EMAIL PROTECTED]?Subject=SendMyPGPkeys
The Bat 1.54/10e on Windows 95 4.0 67306684 C
Talk slowly, but think quickly.
Re[2]: filtering on language
Tuesday, November 20, 2001, 4:12:25 PM, you wrote:

AS - Original Message -
AS From: Thomas F [EMAIL PROTECTED]
AS To: Alastair Scott on TBUDL [EMAIL PROTECTED]
AS Sent: 20 November 2001 2:49 pm
AS Subject: Re: filtering on language

On Tue, 20 Nov 2001 14:29:05 - GMT (20/11/2001, 22:29 +0800 GMT), Alastair Scott wrote:

AS That would work - translations of sex and money would probably catch 95
AS per cent of spam ;)

Too bad sex is sex in almost any language ;-)

Best regards,
Gerard

Real men don't ask directions
OT: filtering on language
Hello Alastair,

On Tue, 20 Nov 2001 15:12:25 - GMT (20/11/2001, 23:12 +0800 GMT), Alastair Scott wrote:

AS The frequency analysis is actually very subtle - two other languages which
AS have lots of zs that come to mind are German and Polish. The huge mass of
AS rules needed to differentiate one language from another would probably be
AS just as slow as the dictionary lookup.

I just came across some language guessers on the internet:

http://www.xrce.xerox.com/research/mltt/tools/guesser/
This one identified the text Jan originally posted as Turkish_iso9.

http://odur.let.rug.nl/~vannoord/TextCat/Demo/textcat.html
This one identified the language as unknown, even though Turkish is in their list of supported languages. However, it is open source and - yes, a Perl script! - so you can run it in TB v2.

Oh, and I just saw that he gives a comprehensive list of competitors, i.e. links to other language identifiers.

--
Cheers, Thomas.
Moderator of the German The Bat! Beginner list.
It was so hot during football practice that a lot of kids keeled over from nervous prostitution.
Message reply created with The Bat! 1.54/10 under Chinese Windows 98 4.10 Build A using an AMD Athlon K7 1.2GHz, 128MB RAM
Re: filtering on language
Hello Peter

At 9:25 AM on Tuesday, November 20, 2001 you wrote the following on the posted subject 'filtering on language':

JR %IF %TEXT does not = %language=English (it looked thru the
JR dictionary), then Take an action. ???

Peter I think this ain't as clear as you wanted it. What action (e.g.)
Peter should be taken? If you want the macro in filters to scan incoming
Peter mails? In a template? What else?

Move to Trash, for example.

--
Jan Rifkinson Ridgefield, CT USA
TB! V1.54/10/W2K_SP2/PGP Key ID: 0x3F14A060 ICQ 41116329
Re: filtering on language
Hello Thomas

At 9:26 AM on Tuesday, November 20, 2001 you wrote the following on the posted subject 'filtering on language':

Thomas [...] The way you do it is that the filter looks for the
Thomas Content-encoding header or any header line that
Thomas contains 8859-x (substitute Turkish encoding suffix
Thomas for the x) or whatever you can identify in these
Thomas messages. How you do that with RegEx, now *that* is
Thomas for TBTECH. [...]

I'm not sure I understand where this content-encoding header 8859- resides. I searched the header for this w/o result.

Thomas Be careful though, it will catch all messages with
Thomas that encoding, also legitimate ones.

Well, my assumption is that if anyone from Turkey wants to communicate w me legitimately, they'll write me in English -- not because I'm being arrogant or nationalistic -- because that's my native language.

I'm going to follow up on those translator links you posted as well. Thanks.

--
Jan Rifkinson Ridgefield, CT USA
TB! V1.54/10/W2K_SP2/PGP Key ID: 0x3F14A060 ICQ 41116329
Re[2]: filtering on language
Hello Jan,

20 November 2001, 18:32:50, you wrote:

JR I'm not sure I understand where this content-encoding
JR header 8859- resides. I searched the header for this w/o
JR result.

Check this message. Its encoding is ISO-8859-2, so it has this in the headers:

Content-Type: text/plain; charset=ISO-8859-2

JR Well, my assumption is that if anyone from Turkey wants to
JR communicate w me legitimately, they'll write me in English
JR -- not because I'm being arrogant or nationalistic --
JR because that's my native language.

Many users don't even know how to set the character encoding. Besides, with ISO-8859-x encodings you can still write English - just look at this message. The only thing that differs is the display of special characters, like ...

--
Jernej Simoncic, [EMAIL PROTECTED]
http://www2.arnes.si/~sopjsimo/ ICQ: 26266467
[The Bat! v1.54/10e on Windows 98 4.10.67766222. ]
1. Life can only be understood backwards, but it must be lived forwards. 2. No matter what goes wrong, there is always somebody who knew it would. -- Laws of Understanding
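For anyone who wants to see where that value lives, here is a small sketch using Python's standard email package (nothing to do with The Bat! itself; the helper name is made up). It reads the declared charset from a raw message, and the last two lines show why plain English text looks the same whether it is labelled ISO-8859-2 or ASCII.

```python
from email import message_from_string
from typing import Optional


def declared_charset(raw_message: str) -> Optional[str]:
    """Return the charset named in the Content-Type header, lowercased, or None."""
    return message_from_string(raw_message).get_content_charset()


raw = "Content-Type: text/plain; charset=ISO-8859-2\n\njust look at this message\n"
charset = declared_charset(raw)

# Plain English text has identical bytes in ISO-8859-2 and ASCII, so the
# declared charset alone says nothing about the language of the message.
english = "just look at this message"
same_bytes = english.encode("iso-8859-2") == english.encode("ascii")
```

Only the special characters outside the ASCII range come out as different bytes, which is why a charset-based filter cannot tell English from non-English mail.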
Re: filtering on language
Hello Jernej.

At 1:19 PM on Tuesday, November 20, 2001 you wrote the following on the posted subject 'filtering on language':

Jernej Check this message. Its encoding is ISO-8859-2

Found it.

Jernej Many users don't even know how to set character
Jernej encoding.

So this is not something that is set by the email client automatically? Would it be safe for me to assume that if this guy is using a Turkish email program it would just be set up that way, or that if he was using a non-Turkish email client he would have to set something that would reveal the language setting?

Jernej Besides, with ISO-8859-x encodings you can still
Jernej write English - just look at this message. The only
Jernej thing that differs is the display of special
Jernej characters, like ...

So you are saying that I could also lose a lot of English msgs if I start fooling around with this idea as a filter?

--
Jan Rifkinson Ridgefield, CT USA
TB! V1.54/10/W2K_SP2/PGP Key ID: 0x3F14A060 ICQ 41116329
Re[2]: filtering on language
Hello Jan,

20 November 2001, 23:01:58, you wrote:

JR So this is not something that is set by the email client
JR automatically? Would it be safe for me to assume that if
JR this guy is using a Turkish email program that it would
JR just be set up that way or if he was using a non-Turkish
JR email client that he would have to set something that
JR would reveal the language setting?

You usually set the default encoding yourself, but not necessarily. Some mailers automatically set the encoding to the corresponding Windows codepage.

JR So you are saying that I could also lose a lot of English
JR msgs as well if I start fooling around with this idea as
JR a filter?

You'd probably lose a lot of messages, as many are written in ISO-8859-x, and in different Windows- encodings...

--
Jernej Simoncic, [EMAIL PROTECTED]
http://www2.arnes.si/~sopjsimo/ ICQ: 26266467
[The Bat! v1.54/10e on Windows 98 4.10.67766222. ]
What you don't know will always hurt you. -- Law of Blissful Ignorance
Re: filtering on language
On Tue, 20 Nov 2001 at 23:52 GMT +0100 (11/20/2001 5:52 PM where I live) [EMAIL PROTECTED] [Jernej] wrote to [EMAIL PROTECTED] re: 'filtering on language':

Jernej You'd probably lose a lot of messages, as many are
Jernej written in ISO-8859-x, and in different Windows-
Jernej encodings...

OK, thanks for helping me understand.

--
Jan Rifkinson Ridgefield, CT USA
TB! V1.54/10/W2K_SP2/PGP Key ID: 0x3F14A060 ICQ 41116329
Re: filtering on language
Hi Dierk,

On Tue, 20 Nov 2001 17:01:50 +0100GMT (21/11/2001, 00:01 +0800GMT), Dierk Haasis wrote:

That would work - translations of sex and money would probably catch 95 per cent of spam ;)

DH I bet it would get 100%, but it doesn't matter how good it works
DH positively.

No, I just received a spam message about a diet. No sex was required, so I deleted it. You lost the bet.

DH You have to take into account its negative: How many percent of
DH legitimate - and maybe wanted - messages get caught?

You are right. 50% of my private messages are about either sex or money, or the combination of both.

DH In this special case - presumed the receiver doesn't even speak/read a
DH language - any message in this language can be positively deleted.
DH None of them would be alright. As long as no one tries to send him a
DH message containing some of the trigger words (e.g. money or sex in
DH Turkish)

Just a hint: sex in Turkish is sex, like in most other languages I know.

--
Cheers, Thomas.
Moderator of the German The Bat! Beginner list. Sign up at: [EMAIL PROTECTED]
Message reply created with The Bat! 1.53t under Chinese Windows 98 4.10 Build 1998 on a Pentium II/350 MHz.