----- Original Message -----
From: "Jan Rifkinson" <[EMAIL PROTECTED]>
To: "TB! UDL" <[EMAIL PROTECTED]>
Sent: 20 November 2001 1:02 pm
Subject: filtering on language


> Hello TB! Listers.
>
>   I've been getting a lot of msgs like the one below & it's
>   really annoying.
>
> ----------------> Original Message Starts <--------------
> www.internethaber.com yilin en iyi
> ancorhman_n_ belirliyor.
> Siz de oy kullanarak tercihinizi koyun.
>
> www.internethaber.com ve www.gazeteoku.com
> ile biz haberin tekelini kirdik; gelin siz
> de bu siteleri ziyaret ederek, bu tekeli
> kirin.
> -----------------> Original Message Ends <---------------
>
>   Can anyone imagine a built-in macro that identifies the
>   language used, e.g. "%IF not %English"? Does anyone think
>   this is even possible?

It's certainly possible through some sort of dictionary comparison, but that
would probably be fatally slow on modest PCs (imagine checking a 2,000-word
email :)
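
As a rough illustration of the dictionary-comparison idea, here is a minimal
Python sketch. The word list, the function names and the 0.3 threshold are all
hypothetical choices made for the example; a real filter would need a far
larger word list, and none of this is a feature of The Bat! itself.

```python
import re

# Hypothetical, deliberately tiny stand-in for a real English word list.
COMMON_ENGLISH = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "its",
    "that", "this", "for", "on", "with", "as", "be", "are", "was", "one",
    "like", "i", "you", "we", "not", "been", "lot", "below", "really",
    "getting",
}


def english_ratio(text):
    """Return the fraction of words in `text` found in COMMON_ENGLISH."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in COMMON_ENGLISH)
    return hits / len(words)


def looks_english(text, threshold=0.3):
    """Guess whether `text` is English; the threshold is an assumed cut-off."""
    return english_ratio(text) >= threshold
```

The cost per word depends on how the dictionary is stored; the sketch keeps it
in a hash-based set, so each lookup avoids scanning the whole list.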

There may be cleverer statistical methods. The above is Turkish, and the
relative frequency of various letters (e.g. "z" and "i") is clearly different
from that of English, but more similar languages (e.g. two Indo-European
ones ;) might not be so simple to differentiate. I am not a professional
linguist, so I can't comment definitively.
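
To make the letter-frequency idea concrete, here is a minimal Python sketch
that measures how far a message's letter distribution lies from a reference
English profile. The percentages are approximate, commonly cited figures for
English text, and the function names and the choice of distance (sum of
absolute differences) are assumptions made for the example, not an
established method.

```python
from collections import Counter
import string

# Approximate relative letter frequencies for English text, in percent.
# Rounded reference values used here as an assumption.
ENGLISH_PERCENT = {
    "a": 8.2, "b": 1.5, "c": 2.8, "d": 4.3, "e": 12.7, "f": 2.2, "g": 2.0,
    "h": 6.1, "i": 7.0, "j": 0.15, "k": 0.8, "l": 4.0, "m": 2.4, "n": 6.7,
    "o": 7.5, "p": 1.9, "q": 0.1, "r": 6.0, "s": 6.3, "t": 9.1, "u": 2.8,
    "v": 1.0, "w": 2.4, "x": 0.15, "y": 2.0, "z": 0.07,
}


def letter_frequencies(text):
    """Return the relative frequency of each letter a-z in `text`."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    total = len(letters)
    counts = Counter(letters)
    return {
        ch: (counts[ch] / total if total else 0.0)
        for ch in string.ascii_lowercase
    }


def distance_from_english(text):
    """Sum of absolute differences between the text's profile and English."""
    freqs = letter_frequencies(text)
    return sum(
        abs(freqs[ch] - ENGLISH_PERCENT[ch] / 100)
        for ch in string.ascii_lowercase
    )
```

A filter would flag a message whose distance exceeds some cut-off. That
cut-off would have to be tuned on real mail, and short messages give noisy
letter counts, so closely related languages may well end up too close to
separate this way.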

Alastair





-- 
________________________________________________________
Archives   : http://tbudl.thebat.dutaint.com
Moderators : mailto:[EMAIL PROTECTED]
TBTech List: mailto:[EMAIL PROTECTED]
Unsubscribe: mailto:[EMAIL PROTECTED]
Latest Vers: 1.53d
FAQ        : http://faq.thebat.dutaint.com 
