Diez B. Roggisch wrote:
>> Here's what I get with the prepending hack:
>>
>> >>> et.fromstring('<?xml version="1.0" encoding="gbk"?>\n' + open(filename).read())
>> Traceback (most recent call last):
>>   File "<interactive input>", line 1, in ?
>>   File "C:\Program Files\Python\lib\site-packages\elementtree\ElementTree.py", line 960, in XML
>>     parser.feed(text)
>>   File "C:\Program Files\Python\lib\site-packages\elementtree\ElementTree.py", line 1242, in feed
>>     self._parser.Parse(data, 0)
>> ExpatError: unknown encoding: line 1, column 30
>>
>>
>> Are the XML encoding names different from the Python ones? The "gbk"
>> encoding seems to work okay from Python:
>
> I had similar trouble with cElementTree and cp1252 encodings. But
> upgrading to a more recent version helped. Did you try parsing with e.g.
> sax?

Hmm... The builtin xml.dom.minidom and xml.sax both also fail to find the encoding:

>>> import xml.dom.minidom as dom
>>> dom.parseString('<?xml version="1.0" encoding="gbk"?>' + open(filename).read())
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\minidom.py", line 1925, in parseString
    return expatbuilder.parseString(string)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\expatbuilder.py", line 942, in parseString
    return builder.parseString(string)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
ExpatError: unknown encoding: line 1, column 30

>>> import xml.sax as sax
>>> sax.parseString('<?xml version="1.0" encoding="gbk"?>' + open(filename).read(), sax.handler.ContentHandler())
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\__init__.py", line 47, in parseString
    parser.parse(inpsrc)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\expatreader.py", line 220, in feed
    self._err_handler.fatalError(exc)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\handler.py", line 38, in fatalError
    raise exception
SAXParseException: <unknown>:1:30: unknown encoding
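
All three tracebacks above end up in expat, which as far as I know only has built-in support for a handful of encodings (UTF-8, UTF-16, ISO-8859-1, US-ASCII). One possible workaround, sketched below and not tested against the original file, is to skip the prepended declaration, let Python's own "gbk" codec do the decoding, and hand the parser UTF-8 bytes instead. The helper name parse_legacy_xml and the sample document are made up for illustration, the sketch uses the standard-library xml.etree.ElementTree rather than the standalone elementtree package from the tracebacks, and it assumes the data carries no XML declaration of its own.

```python
import xml.etree.ElementTree as ET


def parse_legacy_xml(raw_bytes, source_encoding="gbk"):
    """Parse XML bytes stored in an encoding the XML parser may not know.

    Hypothetical helper: assumes raw_bytes has no XML declaration, so the
    parser falls back to its default of UTF-8.
    """
    # Decode with Python's codec, which does know the encoding.
    text = raw_bytes.decode(source_encoding)
    # Re-encode as UTF-8, which expat supports natively.
    return ET.fromstring(text.encode("utf-8"))


# Made-up sample standing in for the contents of the real file.
sample = "<root><item>\u4e2d\u6587</item></root>".encode("gbk")
root = parse_legacy_xml(sample)
print(root.find("item").text)
```

For a real file, the bytes would come from reading it in binary mode, e.g. open(filename, "rb").read(). If the file did carry its own encoding="gbk" declaration, that declaration would need to be rewritten or stripped first, since it would no longer match the UTF-8 bytes.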