Re: UTF8 & HTMLParser

2006-11-30 Thread Klaus Alexander Seistrup
Jan Danielsson wrote:
> However, I would like to convert the "text" (which is utf8)
> to latin-1. How do I do that?

How about:

    latin = unicode(text, 'utf-8').encode('iso-8859-1')

Please see help(u''.encode) for details about error handling. You might also want to trap errors in a
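For readers on Python 3, where the unicode() builtin no longer exists, the same decode-then-encode idea might be sketched roughly as below. The helper name utf8_to_latin1 and the sample strings are made up for illustration and are not from the thread; the error trapping is only one possible way to handle characters that Latin-1 cannot represent.

```python
def utf8_to_latin1(data, errors="strict"):
    """Decode UTF-8 bytes, then re-encode the text as ISO-8859-1 (Latin-1).

    Hypothetical helper for illustration; `errors` is passed to the
    encode step ("strict", "replace", "ignore", ...).
    """
    return data.decode("utf-8").encode("iso-8859-1", errors)


# A string whose characters all exist in Latin-1 converts cleanly.
utf8_bytes = "Malmö".encode("utf-8")
latin = utf8_to_latin1(utf8_bytes)

# The euro sign is not part of Latin-1, so "strict" raises
# UnicodeEncodeError; here we trap it and retry with "replace".
euro = "5 €".encode("utf-8")
try:
    fallback = utf8_to_latin1(euro)
except UnicodeEncodeError:
    fallback = utf8_to_latin1(euro, errors="replace")
```

Whether to replace, ignore, or re-raise on unencodable characters depends on what the converted text is used for; "replace" substitutes a question mark for each such character.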

Re: UTF8 & HTMLParser

2006-11-30 Thread Jan Danielsson
Jan Danielsson wrote:
> Hello all,
>
> I'm writing a python script which fetches a HTML-page (using wget),
> and then parses the retrieved page using a custom htmllib HTMLParser.
>
> The page I fetch is encoded in utf8, and my text-handler currently
> looks like this:
>
> def handle_dat

UTF8 & HTMLParser

2006-11-30 Thread Jan Danielsson
Hello all,

I'm writing a python script which fetches a HTML-page (using wget), and then parses the retrieved page using a custom htmllib HTMLParser.

The page I fetch is encoded in utf8, and my text-handler currently looks like this:

    def handle_data(self, text):
        if self.inOption: