> Then how about the suggested "xml-auto-detect"?

That is better.

>> Then, I'd claim that the problem that the codec solves doesn't really
>> exist. IOW, most XML parsers implement the auto-detection of encodings,
>> anyway, and this is where architecturally this functionality belongs.
>
> But not all XML parsers support all encodings. The XML codec makes it
> trivial to add this support to an existing parser.

I would like to question this claim. Can you give an example of a parser
that doesn't support a specific encoding and where adding such a codec
solves that problem? In particular, why would that parser know how to
process Python Unicode strings?

> Furthermore encoding-detection might be part of the responsibility of
> the XML parser, but this decoding phase is totally distinct from the
> parsing phase, so why not put the decoding into a common library?

I would not object to that - only to exposing it as a codec. Adding it
to the XML library is fine, IMO.

> There's a (currently undocumented) codecs.detect_xml_encoding() in the
> patch. We could document this function and make it public. But if
> there's no codec that uses it, this function IMHO doesn't belong in the
> codecs module. Should this function be available from xml/__init__.py or
> should it be put into something like xml/utils.py?

Either - or.

>> Finally, I think the codec is incorrect. When saving XML to a file
>> (e.g. in a text editor), there should rarely be encoding errors, since
>> one could use character references in many cases.
>
> This requires some intelligent fiddling with the errors attribute of the
> encoder.

Much more than that, I think - you cannot use a character reference in
an XML Name. So the codec would have to parse the output stream to know
whether or not a character reference could be used.

> Correct, but as long as Python doesn't have an EBCDIC codec, that won't
> help much. Adding *detection* of EBCDIC to detect_xml_encoding() is
> rather simple though.

But it does! cp037 is EBCDIC, and supported by Python.

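To make the last two points concrete, here is a minimal sketch. It is not
taken from the patch under discussion: the sample documents are made up for
illustration, and it assumes a Python 3 interpreter, where str is the Unicode
type. It shows that cp037 round-trips an XML declaration, and that a blanket
"xmlcharrefreplace" error handler also rewrites element names, which yields a
document that is no longer well-formed.

```python
import xml.etree.ElementTree as ET

# Illustration only: both sample documents are made up, not from the patch.

# 1. cp037 is an EBCDIC code page shipped with Python.
decl = '<?xml version="1.0" encoding="cp037"?>'
ebcdic = decl.encode("cp037")
# The first four bytes are the EBCDIC form of "<?xm", the signature that
# Appendix F of the XML 1.0 specification lists for EBCDIC detection.
print(ebcdic[:4].hex())
print(ebcdic.decode("cp037") == decl)

# 2. A blanket error handler cannot tell markup from character data.
doc = "<r\u00e9sum\u00e9>caf\u00e9</r\u00e9sum\u00e9>"
encoded = doc.encode("ascii", "xmlcharrefreplace")
print(encoded)

# The references inside the element name make the result ill-formed,
# so a conforming parser rejects it.
try:
    ET.fromstring(encoded)
    name_refs_accepted = True
except ET.ParseError:
    name_refs_accepted = False
print(name_refs_accepted)
```

The character data ("caf&#233;") survives this treatment, but the element
name does not. An encoder that wanted to fall back to character references
would therefore need to know where in the markup each unencodable character
sits, which is the parsing problem described above.
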
Regards,
Martin