On Wednesday 10 August 2005 18:12, Thomas Lange wrote:

> As for how to detect the language of a word or sentence, one thing one
> might do is to break the text down into single words, build n-grams
> (for example trigrams) for all of them, count them, and assign each a
> probability of occurrence in the text.

Code for such implementations is already available, e.g. here (although this
one is in Java):
http://issues.apache.org/bugzilla/show_bug.cgi?id=26763
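
For illustration only, the trigram idea quoted above might look roughly like
the Python sketch below. It is not the code from the issue linked above. The
function names, the toy training sentences, and the floor probability for
unseen trigrams are all assumptions made up for this example; a real detector
would build its profiles from much larger corpora.

```python
from collections import Counter
import math
import re


def trigram_counts(text):
    """Count character trigrams word by word (words padded with spaces)."""
    counts = Counter()
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        padded = f" {word} "  # padding makes word starts/ends visible
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    return counts


def trigram_profile(text):
    """Turn raw trigram counts into relative frequencies (probabilities)."""
    counts = trigram_counts(text)
    total = sum(counts.values())
    return {gram: n / total for gram, n in counts.items()} if total else {}


def log_score(text, profile, floor=1e-6):
    """Sum of log probabilities of the text's trigrams under one profile.

    `floor` is an assumed stand-in probability for trigrams that the
    profile has never seen.
    """
    return sum(n * math.log(profile.get(gram, floor))
               for gram, n in trigram_counts(text).items())


def guess_language(text, profiles):
    """Return the key of the profile under which the text scores highest."""
    return max(profiles, key=lambda lang: log_score(text, profiles[lang]))


# Toy training data, far too small for real use (hypothetical samples).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog "
          "and then the dog sleeps in the sun",
    "de": "der schnelle braune fuchs springt über den faulen hund "
          "und dann schläft der hund in der sonne",
}
profiles = {lang: trigram_profile(sample) for lang, sample in SAMPLES.items()}
```

Calling guess_language("der hund", profiles) then picks whichever profile
gives the input the highest score. Scoring with summed logs rather than a
product of probabilities avoids underflow on longer inputs.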

Regards
 Daniel

-- 
http://www.danielnaber.de
