Hi Daniel and Laci!
Thanks for pointing to those implementations.
I will have a look at them this week or next.
Thomas
[EMAIL PROTECTED] wrote:
Quoting Daniel Naber <[EMAIL PROTECTED]>:
On Wednesday 10 August 2005 18:12, Thomas Lange wrote:
As for how to detect the language of a word or sentence, one thing one
might do is break the text down into single words, build n-grams (for
example trigrams) for each of them, count them all, and assign each a
probability of occurrence in the text.
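
A minimal Python sketch of the idea described above, under several assumptions that are not from this thread: words are padded with "_" to mark their boundaries, a text's trigram frequencies are scored against each language profile with a log-likelihood sum, and trigrams missing from a profile get a small floor probability. The function names, the floor value and the toy training sentences are hypothetical; a usable detector would need far larger training texts.

```python
import math
import re
from collections import Counter


def trigram_profile(text, n=3):
    """Map each character n-gram to its relative frequency in `text`."""
    counts = Counter()
    # Break the text into words (runs of letters), lowercased.
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        padded = f"_{word}_"  # assumption: "_" marks word boundaries
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def guess_language(text, profiles, floor=1e-6):
    """Return the name of the profile that best explains `text`.

    `floor` is an assumed probability for n-grams a profile has never seen.
    """
    grams = trigram_profile(text)

    def score(profile):
        # Frequency-weighted log-probability of the text's n-grams.
        return sum(freq * math.log(profile.get(gram, floor))
                   for gram, freq in grams.items())

    return max(profiles, key=lambda name: score(profiles[name]))


# Toy training data, for illustration only.
profiles = {
    "en": trigram_profile("the quick brown fox jumps over the lazy dog and the cat"),
    "de": trigram_profile("der schnelle braune fuchs springt über den faulen hund und die katze"),
}

print(guess_language("the dog and the fox", profiles))
```

The log-likelihood score is only one possible choice; other implementations compare ranked n-gram lists instead.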
Code for such implementations is already available, e.g. here (although this
one is in Java):
http://issues.apache.org/bugzilla/show_bug.cgi?id=26763
Hi,
I think Libtextcat is ready for use: http://software.wise-guys.nl/libtextcat/
Regards
Laci
Regards
Daniel
--
http://www.danielnaber.de
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]