> Here's what I get with the prepending hack:
>
> >>> et.fromstring('<?xml version="1.0" encoding="gbk"?>\n' + open(filename).read())
> Traceback (most recent call last):
>   File "<interactive input>", line 1, in ?
>   File "C:\Program Files\Python\lib\site-packages\elementtree\ElementTree.py", line 960, in XML
>     parser.feed(text)
>   File "C:\Program Files\Python\lib\site-packages\elementtree\ElementTree.py", line 1242, in feed
>     self._parser.Parse(data, 0)
> ExpatError: unknown encoding: line 1, column 30
>
> Are the XML encoding names different from the Python ones? The "gbk"
> encoding seems to work okay from Python:

I had similar trouble with cElementTree and cp1252 encodings, but
upgrading to a more recent version helped. Did you try parsing with
a different parser, e.g. sax?

Diez
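
One possible way around the "unknown encoding" error quoted above, sketched
here as an assumption and not as something tried in this thread: let Python's
own "gbk" codec do the decoding and hand the parser already-decoded text, so
that expat never has to look the encoding name up. The sketch uses the
xml.etree.ElementTree module from a current Python 3 standard library, not the
older standalone elementtree package shown in the traceback, and the function
name parse_xml_with_codec is made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_xml_with_codec(path, encoding="gbk"):
    """Decode a file with a Python codec, then parse the resulting text.

    Hypothetical helper: assumes the file has no XML declaration of its
    own (which is why the prepending hack was being tried).
    """
    # Read raw bytes so that Python, not expat, performs the decoding.
    with open(path, "rb") as f:
        raw = f.read()

    # Raises UnicodeDecodeError if the bytes are not valid in `encoding`.
    text = raw.decode(encoding)

    # The parser receives a str, so it no longer needs to know about gbk.
    return ET.fromstring(text)
```

Calling parse_xml_with_codec(filename) returns the root element. If the bytes
are not really gbk, the failure shows up as a UnicodeDecodeError from the
decode step, which at least separates codec problems from parser problems.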