>"John Machin" <[EMAIL PROTECTED]> wrote in message
>news:[EMAIL PROTECTED]
>On Jan 27, 9:17 pm, glacier <[EMAIL PROTECTED]> wrote:
>> On Jan 24, 3:29 pm, "Gabriel Genellina" <[EMAIL PROTECTED]>
>> wrote:
>
>*IF* the file is well-formed GBK, then the codec will not mess up when
>decoding it to Unicode. The usual cause of mess is a combination of a
>human and a text editor :-)

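One way to test the "well-formed GBK" condition from the quoted text is a strict decode, which raises on the first bad byte sequence. This is only a minimal sketch, assuming Python 3; the helper name first_gbk_error and the sample bytes are made up for illustration and do not come from the thread.

```python
def first_gbk_error(raw_bytes):
    """Return None if raw_bytes is well-formed GBK, else the decode error."""
    try:
        raw_bytes.decode("gbk")  # strict error handling is the default
    except UnicodeDecodeError as exc:
        return exc
    return None


# Hypothetical sample data: two Chinese characters encoded as GBK.
good = "中文".encode("gbk")
# Drop the last byte, leaving a truncated two-byte character at the end.
bad = good[:-1]

print(first_gbk_error(good))
print(first_gbk_error(bad))
```

The exception's start attribute gives the byte offset of the first problem, which is usually enough to find what the human and the text editor did to the file.
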
SAX uses the expat parser. From the pyexpat module docs:

    Expat doesn't support as many encodings as Python does, and its
    repertoire of encodings can't be extended; it supports UTF-8, UTF-16,
    ISO-8859-1 (Latin1), and ASCII. If encoding is given it will override
    the implicit or explicit encoding of the document.

GBK is not in that list, so expat cannot parse a GBK-encoded document
directly, even when the file is well-formed.

--Mark
-- 
http://mail.python.org/mailman/listinfo/python-list
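
Given the encoding list quoted from the pyexpat docs, one possible workaround is to decode the GBK bytes in Python, re-encode them as UTF-8, and pass an explicit encoding to the parser so that it overrides the declaration inside the document. The following is a rough sketch under those assumptions, written for Python 3; the function name parse_gbk_xml and the sample document are hypothetical.

```python
import xml.parsers.expat


def parse_gbk_xml(raw_bytes):
    """Transcode GBK bytes to UTF-8, parse with expat, return the text content."""
    text = raw_bytes.decode("gbk")      # raises UnicodeDecodeError on bad GBK
    utf8_bytes = text.encode("utf-8")   # UTF-8 is one of expat's supported encodings

    chunks = []
    # The explicit encoding overrides the encoding="gbk" declaration in the document.
    parser = xml.parsers.expat.ParserCreate(encoding="utf-8")
    # Character data may arrive in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(utf8_bytes, True)
    return "".join(chunks)


# Hypothetical sample document, encoded as GBK with a matching declaration.
sample = '<?xml version="1.0" encoding="gbk"?><doc>中文</doc>'.encode("gbk")
print(parse_gbk_xml(sample))
```

The same transcoding step should work in front of xml.sax as well, since the decode happens before any parser sees the bytes.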