There are ways to identify the language of very short text.  In fact,
one can even detect where the language changes within a document that
mixes several languages.

http://www.stanford.edu/class/ee380/Abstracts/090114.html

That's not to say that Twitter uses such methods, just that it's
possible to identify languages in tweet-size documents.
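
To make the point concrete, here is a minimal sketch in Python of the
marker-word idea suggested in the quoted message below: count words
that are common in one language and rare in the others, and pick the
language with the most hits. The MARKERS lists and the guess_language
name are my own illustrative assumptions; they are far too small for
real use, and this is neither what Twitter does nor the method from
the talk linked above.

```python
import re

# Hypothetical, tiny marker-word lists chosen for illustration only.
# A usable list would be much longer and tuned on real data.
MARKERS = {
    "en": {"the", "and", "is", "with", "this", "that", "for", "you"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "mit", "ich"},
    "fr": {"le", "les", "est", "et", "une", "des", "pas", "avec"},
}


def guess_language(text):
    """Return the language code with the most marker-word hits.

    Returns None when no marker word occurs or when the top score
    is shared by more than one language.
    """
    # Split into lowercase runs of letters (Unicode-aware, so
    # umlauts and accented characters stay inside words).
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(word in vocab for word in words)
        for lang, vocab in MARKERS.items()
    }
    best = max(scores.values())
    winners = [lang for lang, score in scores.items() if score == best]
    if best == 0 or len(winners) > 1:
        return None
    return winners[0]


if __name__ == "__main__":
    for tweet in ("Das ist nicht mit der Post gekommen",
                  "This is the link you asked for",
                  "lol"):
        print(repr(tweet), "->", guess_language(tweet))
```

Returning None for ties and for text with no marker words is a
deliberate choice in this sketch: on tweet-size input it seems safer
to say "unknown" than to guess.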


On Oct 26, 6:19 am, Nicole Simon <nee...@gmail.com> wrote:
> The language selection is useless, even with a limitation to English.
> The problem is probably that normal methods of attributing language
> are more or less based on longer text - and not text stripped down
> to 140 chars or less.
>
> If you want to do detection, e.g. in search, it is better to get
> all results and apply common-sense methods, like grepping for
> special words that are most likely used only in your
> language of choice.
>
> For real 'select your choice here' it is not going to work.
>
> As it stands, this hurts more than it helps.
> In my book I tell users to rely on searches that limit
> themselves, i.e. to use German words in the search where possible.
>
> Nicole
>
> --
>
> My German Twitter site: http://mit140zeichen.de - http://twitter.com/m140z
>
> Contact: http://twitter.com/NicoleSimon https://www.xing.com/profile/Nicole_Simon
>
> skype: nicole.simon / mailto:nicole.si...@mit140zeichen.de
> phone: +49 451 899 75 03 / mobile: +49 179 499 7076
