I am trying to read a web page and save it to an .html file. The problem is that the page is encoded in GB2312, and I want to save it to the file either in that same encoding or as Unicode. I have some code like this:

    import urllib2

    url = 'http://blah/'
    headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
    req = urllib2.Request(url, None, headers)
    page = urllib2.urlopen(req).read()
    file = open('btchina.html', 'wb')
    file.write(page.encode('gb-2312'))
    file.close()

It is obviously not working, and I am hoping someone can help me.
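
For what it's worth, here is a minimal sketch of one way this might be handled. It rests on a few assumptions: that the server really does send GB2312 bytes, that Python 3 is available (so `urllib.request` stands in for the `urllib2` used above), and that the helper names `fetch` and `save_page` are purely illustrative. The underlying idea is that `read()` returns raw bytes that are already encoded, so they can be written out unchanged to keep the original encoding, or decoded first and then re-encoded (to UTF-8, say). Calling `.encode()` directly on the undecoded data is the step that seems likely to be going wrong in the code above.

```python
import urllib.request


def fetch(url, user_agent="Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)"):
    """Return the raw response body as bytes; nothing is decoded here."""
    req = urllib.request.Request(url, None, {"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def save_page(raw, path, source_encoding="gb2312", target_encoding=None):
    """Write the page bytes to `path` (hypothetical helper).

    With target_encoding=None the bytes are written untouched, so the file
    keeps whatever encoding the server sent. Otherwise the bytes are decoded
    from source_encoding (assumed to be GB2312) and re-encoded.
    """
    if target_encoding is not None:
        # Undecodable bytes become U+FFFD instead of raising an error.
        text = raw.decode(source_encoding, errors="replace")
        raw = text.encode(target_encoding)
    with open(path, "wb") as f:
        f.write(raw)


# Possible usage (not executed here, since it needs network access):
#   page = fetch('http://blah/')
#   save_page(page, 'btchina.html')                           # keep GB2312
#   save_page(page, 'btchina.html', target_encoding='utf-8')  # convert
```

Note that Python's codec name is spelled `gb2312`, and that if the file is converted to UTF-8, any charset declaration inside the HTML itself would still say GB2312 and may need editing by hand.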