Erik Bethke wrote:

> I am getting an error of not well-formed at the beginning of the Korean
> text in the second example. Am I doing something wrong with how I am
> encoding my Korean? Do I need more of a wrapper about it than simple
> quotes? Is there some sort of XML syntax for indicating a Unicode
> string, or does the ElementTree library just not support reading of
> Unicode?

XML is Unicode, and ElementTree supports all common encodings just fine
(including UTF-8).

> this one fails:
> <?xml version="1.0" encoding="UTF-8"?>
> <Vocab>
> <Word L1="?????!"></Word>
> </Vocab>

this works just fine on my machine.

what's the exact error message?

what does

    print repr(open("test2.xml").read())

print on your machine?

what happens if you attempt to parse

    <Vocab>
    <Word L1="어녕하세요!" />
    </Vocab>

?

</F>
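
For anyone following along, the checks suggested above can be sketched roughly as follows. This is only an illustration under some assumptions: it uses the standard-library xml.etree.ElementTree module on Python 3, builds the document in memory instead of reading test2.xml, and picks cp949 as a hypothetical example of a file saved in a legacy Korean encoding while still declaring UTF-8. The helper name try_parse is made up for this sketch.

```python
import xml.etree.ElementTree as ET

# The failing document from the question, with the Korean text written as
# escapes so this source file itself stays pure ASCII.
DOC = (
    u'<?xml version="1.0" encoding="UTF-8"?>\n'
    u'<Vocab>\n'
    u'<Word L1="\uc5b4\ub155\ud558\uc138\uc694!"></Word>\n'
    u'</Vocab>\n'
)


def try_parse(raw):
    """Return the L1 attribute of the first Word, or the parser's error text."""
    try:
        root = ET.fromstring(raw)
    except ET.ParseError as exc:
        return "parse error: %s" % exc
    return root.find("Word").get("L1")


# Bytes that really are UTF-8, matching the XML declaration.
good = DOC.encode("utf-8")

# Assumption for illustration: the editor saved the file as cp949, so the
# bytes no longer match the declared encoding.
bad = DOC.encode("cp949")

# Looking at the raw bytes is the same idea as repr(open(...).read()).
print(repr(good))
print(repr(bad))

print(try_parse(good))
print(try_parse(bad))
```

If the bytes on disk match the declared encoding, the attribute comes back as a Unicode string. If they do not, the parser stops at the first byte sequence that is not valid in the declared encoding, which would be the start of the Korean text.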