Re: unicode "table of character" implementation in python

Brian Beck Tue, 22 Aug 2006 08:50:44 -0700

Nicolas Pontoizeau wrote:
> I am handling a mixed-language text file encoded in UTF-8. It is
> mainly French, English and Asian languages. I need to detect every
> Asian character in order to enclose it in a special tag for LaTeX.
> Does anybody know if there is a Unicode "table of characters"
> implementation in Python? I mean, I give a character and Python replies
> with the language in which the character occurs.


Nicolas, check out the unicodedata module:
http://docs.python.org/lib/module-unicodedata.html

Find "import unicodedata" on this page for how to use it:
http://www.amk.ca/python/howto/unicode
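
As a rough sketch of what the module gives you (written for Python 3;
the prefix list and the is_asian helper are my own illustration, not
part of unicodedata), you can look at a character's Unicode name and
guess the script from how the name starts:

```python
import unicodedata

# Hypothetical list of name prefixes treated as "Asian"; extend as needed.
ASIAN_NAME_PREFIXES = ("CJK", "HIRAGANA", "KATAKANA", "HANGUL")

def is_asian(ch):
    """Guess whether ch is an East Asian character from its Unicode name."""
    # unicodedata.name() raises ValueError for unnamed characters unless
    # a default is supplied, so fall back to an empty string.
    name = unicodedata.name(ch, "")
    return name.startswith(ASIAN_NAME_PREFIXES)
```

This is only a heuristic: it identifies scripts, not languages, and it
misses things like ideographic punctuation whose names start differently.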

As far as I know it has no built-in way to find which Unicode block a
character is in (blocks follow scripts rather than languages), but a
table like this might help you:
http://www.unicode.org/Public/UNIDATA/Blocks.txt
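
A minimal sketch of how such a table could be used, assuming Python 3
and assuming the "START..END; Block Name" line format of Blocks.txt.
Only a small excerpt of the table is embedded here; the function names,
the choice of "Asian" blocks, and the \asian LaTeX macro are all
hypothetical and would need adapting to your document:

```python
import bisect
from itertools import groupby

# Small excerpt in the Blocks.txt format; load the full file in practice.
SAMPLE_BLOCKS = """\
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
3040..309F; Hiragana
30A0..30FF; Katakana
4E00..9FFF; CJK Unified Ideographs
AC00..D7AF; Hangul Syllables
"""

def parse_blocks(text):
    """Parse 'START..END; Name' lines into sorted (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        span, name = line.split(";", 1)
        start, end = span.split("..")
        blocks.append((int(start, 16), int(end, 16), name.strip()))
    blocks.sort()
    return blocks

BLOCKS = parse_blocks(SAMPLE_BLOCKS)
STARTS = [b[0] for b in BLOCKS]

def block_of(ch):
    """Return the block name for ch, or None if it is not in the table."""
    cp = ord(ch)
    i = bisect.bisect_right(STARTS, cp) - 1
    if i >= 0 and BLOCKS[i][0] <= cp <= BLOCKS[i][1]:
        return BLOCKS[i][2]
    return None

# Hypothetical choice of which blocks count as "Asian".
ASIAN_BLOCKS = {"Hiragana", "Katakana", "CJK Unified Ideographs",
                "Hangul Syllables"}

def tag_asian(text, macro="asian"):
    """Wrap each run of Asian characters in a LaTeX macro (name is made up)."""
    out = []
    for asian, run in groupby(text, key=lambda ch: block_of(ch) in ASIAN_BLOCKS):
        chunk = "".join(run)
        out.append("\\%s{%s}" % (macro, chunk) if asian else chunk)
    return "".join(out)
```

Grouping consecutive characters means a whole run of ideographs gets one
tag instead of one tag per character, which keeps the LaTeX output
readable.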

-- 
Brian Beck
Adventurer of the First Order