> .read() returns the bytes exactly how it downloads them. It doesn't
> interpret them. If those bytes are GB-2312-encoded text, that's what
> they are. There's no need to reencode them. Just .write(page) (of
> course, this way you don't verify that it's correct).
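A minimal sketch of the point quoted above, assuming `page` already holds the bytes returned by .read(). A bytes literal stands in for the download here, and the file name page.html is made up for illustration. Opening the output file in binary mode writes the bytes unchanged:

```python
import os
import tempfile

# Stand-in for the result of .read(): the GB2312-encoded bytes of "中文".
page = b"\xd6\xd0\xce\xc4"

# Hypothetical output path, for illustration only.
out_path = os.path.join(tempfile.mkdtemp(), "page.html")

# Binary mode ("wb") writes the bytes as-is; no encoding step is involved.
with open(out_path, "wb") as f:
    f.write(page)

# Read the file back to confirm nothing was altered on the way to disk.
with open(out_path, "rb") as f:
    written = f.read()
```

Nothing here checks that the bytes really are GB2312; they are simply copied.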
Alternatively, if the page is *not* GB2312, you must first *decode* it
from its original encoding. Suppose the original encoding is
windows-1252; then you do

    page = page.decode("windows-1252")
    page = page.encode("gb2312")

Of course, for HTML, that may be tricky, as the file may include an
encoding declaration (XML declaration or http-equiv meta tag). So if you
recode it, you might have to change such declarations as well.

Regards,
Martin
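P.S. A minimal sketch of the decode-then-encode step, assuming the source encoding is known in advance. The helper name `recode` and the sample bytes are made up for illustration. It does not touch any encoding declaration inside the HTML, and .encode() raises UnicodeEncodeError if the text contains a character that the target encoding cannot represent.

```python
def recode(page, source_encoding, target_encoding="gb2312"):
    """Decode bytes from source_encoding, then encode them as target_encoding.

    Hypothetical helper: any encoding declaration inside the document
    is left untouched and still has to be updated separately.
    """
    text = page.decode(source_encoding)   # bytes -> str
    return text.encode(target_encoding)   # str -> bytes


# b"caf\xe9" is "café" in windows-1252.
latin_page = b"caf\xe9"
gb_page = recode(latin_page, "windows-1252")
```

Decoding `gb_page` as GB2312 gives back the same text, which is a cheap sanity check that the round trip worked.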