> Both my python2.3 and python2.4 interpreters seem to know "Windows-1252":
>
> >>> import codecs
> >>> codecs.open("windows.xml", encoding="windows-1252")
> <open file 'windows.xml', mode 'rb' at 0x403737e0>
>
> Maybe the problem lies in the python installation rather than
> cElementTree? Just guessing, though.

Hm. No idea why I was under the impression they weren't there. But still,
it doesn't work:

inf = file(sys.argv[1])
#inf = codecs.StreamRecoder(inf, encoder, decoder, reader, writer)
for event, elem in cElementTree.iterparse(inf):
    pass

pukes on me with

Traceback (most recent call last):
  File "./splitter.py", line 31, in ?
    for event, elem in cElementTree.iterparse(inf):
  File "<string>", line 61, in __iter__
SyntaxError: not well-formed (invalid token): line 35, column 34

That is the first French character encountered:

"""<title>Introduction aux Probabilités</title>"""

So the problem is not that the codec is being ignored; it simply does not
work.

Regards,

Diez
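
For anyone trying to reproduce this, here is a minimal sketch of one way such an error can arise. It rests on an assumption I cannot confirm from the traceback: that the file has no encoding declaration in its XML prolog, so the parser treats the bytes as UTF-8. It uses the xml.etree.ElementTree module of current Python 3 instead of the standalone cElementTree used above, and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

TITLE = "<title>Introduction aux Probabilités</title>"

# The bytes a Windows-1252 file would contain: "é" is the single byte 0xE9.
body = TITLE.encode("windows-1252")

# Assumption: no encoding declaration, so the parser falls back to UTF-8
# and rejects the lone 0xE9 byte as not well-formed.
try:
    ET.fromstring(body)
    undeclared_ok = True
except ET.ParseError as exc:
    undeclared_ok = False
    print("undeclared:", exc)

# With the encoding declared in the prolog, the same bytes parse.
declared = b'<?xml version="1.0" encoding="windows-1252"?>\n' + body
elem = ET.fromstring(declared)
print("declared:", elem.text)
```

If the real file already carries a correct declaration, this sketch does not explain the failure and the cause lies elsewhere.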