Re: Handling some isolated iso-8859-1 characters

2008-06-04 Thread Max M
Daniel Mahoney wrote:
> The interesting part is the string that reads "=?iso-8859-1?Q?Ana=EFs?=". An HTML rendering of what this string should look like would be "Anaïs".
There is a mention of email headers and unicode at the end of this article: http://mxm-mad-science.blogspot.com/2008/03/python-

Re: Handling some isolated iso-8859-1 characters

2008-06-04 Thread Daniel Mahoney
> ...     print ord(c), unicodedata.name(c)
> ...
> 65 LATIN CAPITAL LETTER A
> 110 LATIN SMALL LETTER N
> 97 LATIN SMALL LETTER A
> 239 LATIN SMALL LETTER I WITH DIAERESIS
> 115 LATIN SMALL LETTER S
Looks like I need to explore the unicodedata module. Thanks!
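For anyone reading this on Python 3, the character-by-character inspection quoted above might look roughly like the sketch below. The helper name `describe` and the hard-coded sample string are illustrative assumptions, not part of the original post; the quoted code used the Python 2 `print` statement.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text.

    `describe` is a hypothetical helper name used only for this sketch.
    """
    # unicodedata.name() raises ValueError for unnamed characters
    # (e.g. control characters), so a default is supplied.
    return [(ord(c), unicodedata.name(c, "<unnamed>")) for c in text]


# "Ana\u00efs" is the decoded form of the header discussed in the thread.
sample = "Ana\u00efs"
for code_point, name in describe(sample):
    print(code_point, name)
```

This only identifies the characters in an already-decoded string; it does not decode the RFC 2047 header itself.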

Re: Handling some isolated iso-8859-1 characters

2008-06-04 Thread Daniel Mahoney
> No, it's not you, those headers are formatted following RFC 2047
>
> Python already has support for that format, use the email.header module,
> see
Excellent, that's exactly what I was looking for.
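As a rough illustration of the suggestion quoted above, the RFC 2047 string from this thread can be decoded with the standard library's `email.header` module. This is a minimal sketch assuming Python 3 (the thread dates from the Python 2 era, where the returned types differ); the variable names are illustrative only.

```python
from email.header import decode_header, make_header

# The encoded-word from the thread: charset iso-8859-1, "Q" (quoted-printable-like) encoding.
raw = "=?iso-8859-1?Q?Ana=EFs?="

# decode_header() splits the value into (data, charset) pairs;
# encoded parts come back as bytes together with their charset name.
parts = decode_header(raw)

# make_header() rebuilds a Header object from those pairs, and str()
# converts it to ordinary Unicode text.
decoded = str(make_header(parts))

print(parts)
print(decoded)
```

A real Usenet header may mix plain text with several encoded words, possibly in different charsets; `decode_header` returns one pair per segment, and `make_header` joins them.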

Re: Handling some isolated iso-8859-1 characters

2008-06-03 Thread Justin Ezequiel
On Jun 4, 2:38 am, Daniel Mahoney <[EMAIL PROTECTED]> wrote:
> I'm working on an app that's processing Usenet messages. I'm making a
> connection to my NNTP feed and grabbing the headers for the groups I'm
> interested in, saving the info to disk, and doing some post-processing.
> I'm finding a few

Re: Handling some isolated iso-8859-1 characters

2008-06-03 Thread Gabriel Genellina
On Tue, 03 Jun 2008 15:38:09 -0300, Daniel Mahoney <[EMAIL PROTECTED]> wrote:
> I'm working on an app that's processing Usenet messages. I'm making a connection to my NNTP feed and grabbing the headers for the groups I'm interested in, saving the info to disk, and doing some post-processing.

Handling some isolated iso-8859-1 characters

2008-06-03 Thread Daniel Mahoney
I'm working on an app that's processing Usenet messages. I'm making a connection to my NNTP feed and grabbing the headers for the groups I'm interested in, saving the info to disk, and doing some post-processing. I'm finding a few bizarre characters and I'm not sure how to handle them pythonically.