Why is it even trying latin-1 at all? I don't see it anywhere in feedparser.py or my code.

deelan wrote:
> [EMAIL PROTECTED] wrote:
> > I'm using feedparser to parse the following:
> >
> > <div class="indent text">Adv: Termite Inspections! Jenny Moyer welcomes
> > you to her HomeFinderResource.com TM A "MUST See &hellip;</div>
> >
> > I'm receiving the following error when I try to print the feedparser
> > parsing of the above text:
> >
> > UnicodeEncodeError: 'latin-1' codec can't encode character u'\u201c' in
> > position 86: ordinal not in range(256)
> >
> > Why is this happening and where does the problem lie?
>
> it seems that the unicode character 0x201c isn't part
> of the latin-1 charset, see:
>
> "LEFT DOUBLE QUOTATION MARK"
> <http://www.fileformat.info/info/unicode/char/201c/index.htm>
>
> try to encode the feedparser output to UTF-8 instead, or
> use the "replace" option for the encode() method.
>
> >>> c = u'\u201c'
> >>> c
> u'\u201c'
> >>> c.encode('utf-8')
> '\xe2\x80\x9c'
> >>> print c.encode('utf-8')
>
> ok, let's try replace
>
> >>> c.encode('latin-1', 'replace')
> '?'
>
> using "replace" will not throw an error, but it will replace
> the offending character with a question mark.
>
> HTH.
>
> --
> deelan <http://www.deelan.com/>
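
For what it's worth, here is a minimal sketch of the two suggestions quoted above, written so that it also runs on Python 3 (the quoted session is Python 2). safe_encode is a hypothetical helper, not part of feedparser. The fallback to sys.stdout.encoding is only my assumption about where a 'latin-1' might come from: print encodes unicode using the output stream's encoding, so a terminal or locale set to latin-1 could raise that error without the codec being named in feedparser.py or in your code.

```python
import sys


def safe_encode(text, encoding=None, errors="replace"):
    """Encode text to bytes without raising on unencodable characters.

    Hypothetical helper for illustration; not part of feedparser.
    """
    # Assumption: when no encoding is given, use the one the output stream
    # reports, and fall back to UTF-8 if it reports none (e.g. when piped).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, errors)


quote = u"\u201c"  # LEFT DOUBLE QUOTATION MARK, not representable in latin-1

# Suggestion 1: encode to UTF-8, which can represent every Unicode character.
utf8_bytes = safe_encode(quote, "utf-8")

# Suggestion 2: keep latin-1 but substitute a question mark.
replaced = safe_encode(quote, "latin-1", "replace")

# Variant: keep latin-1 but emit a numeric character reference instead.
charref = safe_encode(quote, "latin-1", "xmlcharrefreplace")
```

If the output ends up in HTML, 'xmlcharrefreplace' may be a better fit than 'replace', since it keeps the character as &#8220; rather than turning it into a question mark.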