Nicolas Pontoizeau wrote:
> I am handling a mixed-language text file encoded in UTF-8. There is
> mainly French, English and Asian languages. I need to detect every
> Asian character in order to enclose it in a special tag for LaTeX.
> Does anybody know if there is a Unicode "table of characters"
> implementation in Python? I mean, I give a character and Python replies
> with the language in which the character occurs.

Nicolas, check out the unicodedata module:
http://docs.python.org/lib/module-unicodedata.html

Find "import unicodedata" on this page for how to use it:
http://www.amk.ca/python/howto/unicode

I'm not sure if it has built-in support for finding which language block a
character is in, but a table like this might help you:
http://www.unicode.org/Public/UNIDATA/Blocks.txt

--
Brian Beck
Adventurer of the First Order
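
A rough sketch of the Blocks.txt idea, in case it helps. It assumes a small hand-copied subset of East Asian ranges from Blocks.txt (extend the table to cover whatever your file actually contains), and the names `ASIAN_BLOCKS`, `block_of`, `tag_asian` and the LaTeX macro `\asian` are made up for illustration. Note that a block identifies a range of code points, not a language: a character in "CJK Unified Ideographs" may be Chinese or Japanese.

```python
from itertools import groupby

# A few East Asian ranges copied by hand from Blocks.txt (not exhaustive).
ASIAN_BLOCKS = [
    (0x1100, 0x11FF, "Hangul Jamo"),
    (0x3000, 0x303F, "CJK Symbols and Punctuation"),
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xAC00, 0xD7AF, "Hangul Syllables"),
]


def block_of(ch):
    """Return the block name for ch, or None if it is not in the table."""
    cp = ord(ch)
    for start, end, name in ASIAN_BLOCKS:
        if start <= cp <= end:
            return name
    return None


def tag_asian(text, macro="asian"):
    """Wrap each run of listed characters in a LaTeX macro (name is hypothetical)."""
    parts = []
    # groupby splits the text into alternating runs of listed / unlisted characters.
    for is_asian, run in groupby(text, key=lambda ch: block_of(ch) is not None):
        chunk = "".join(run)
        if is_asian:
            parts.append("\\%s{%s}" % (macro, chunk))
        else:
            parts.append(chunk)
    return "".join(parts)


if __name__ == "__main__":
    print(tag_asian("Hello 日本語 and 한국어!"))
```

Decode the file from UTF-8 first so that the functions see characters rather than bytes. If you only need a quick check on a single character, `unicodedata.name()` also works: for an ideograph it returns a name beginning with "CJK UNIFIED IDEOGRAPH".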