Hello All,

I am getting a "not well-formed" error at the beginning of the Korean
text in the second example.  Am I doing something wrong with how I am
encoding my Korean?  Do I need more of a wrapper around it than simple
quotes?  Is there some sort of XML syntax for indicating a Unicode
string, or does the ElementTree library just not support reading
Unicode?

Here is my test snippet:

from elementtree import ElementTree
vocabXML = ElementTree.parse('test2.xml').getroot()

where I have two data files:

This one works:
<?xml version="1.0" encoding="UTF-8"?>
<Vocab>
<Word L1='Hahha'></Word>
</Vocab>

This one fails:
<?xml version="1.0" encoding="UTF-8"?>
<Vocab>
    <Word L1="ìëíìì!"></Word>
</Vocab>
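
One possible cause I can test for, purely as an assumption on my part: the
editor may have saved the failing file in a legacy Korean encoding such as
EUC-KR even though the declaration says UTF-8. The sketch below checks
whether raw bytes are valid UTF-8 and shows that the same document parses
when the bytes match the declaration. It uses the standard library's
xml.etree.ElementTree rather than the standalone elementtree package, and
the sample word and the check_utf8 helper are made up for illustration.

```python
import xml.etree.ElementTree as ET  # stdlib module, used here instead of the standalone package


def check_utf8(raw):
    """Return None if raw is valid UTF-8, else the offset of the first bad byte."""
    try:
        raw.decode("utf-8")
        return None
    except UnicodeDecodeError as exc:
        return exc.start


# Hypothetical sample document; the Korean word is only an illustration.
doc = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<Vocab>\n"
    '    <Word L1="한국어"></Word>\n'
    "</Vocab>\n"
)

good_bytes = doc.encode("utf-8")   # bytes agree with the declaration
bad_bytes = doc.encode("euc-kr")   # declaration says UTF-8, bytes are EUC-KR

print("utf-8 file, first bad byte:", check_utf8(good_bytes))
print("euc-kr file, first bad byte:", check_utf8(bad_bytes))

# The correctly encoded bytes parse and the attribute comes back as text.
root = ET.fromstring(good_bytes)
print(root.find("Word").get("L1"))

# The mismatched bytes are rejected by the parser.
parse_failed = False
try:
    ET.fromstring(bad_bytes)
except ET.ParseError as err:
    parse_failed = True
    print("parse failed:", err)
```

If check_utf8 reports an offset for the real file (read with
open('test2.xml', 'rb').read()), then re-saving the file as UTF-8, or
changing the encoding in the declaration to match what the editor actually
wrote, would be the next thing to try.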

--
http://mail.python.org/mailman/listinfo/python-list
