Pramod Vaidyanathan wrote:
The problem comes down to this. I have an email that I have received in Microsoft Outlook that contains characters outside of the ASCII set. I was able to use your library to traverse through my Outlook folders and select the appropriate emails etc. There are characters in the email that are ASCII, extended ASCII, and other. The problem I am having is that when I read "item.Body" into body, the characters "\xe2\x85\x9b", "\xe2\x85\x9c", "\xe2\x85\x9d", "\xe2\x85\x9e" become unknown characters (question marks). These characters have the Unicode code points U+215B, U+215C, U+215D, and U+215E. They are the fractions 1/8, 3/8, 5/8, 7/8 respectively.

I'm not sure whether you're confused or I am.
Just to clarify: the Body attribute of an Outlook
MailItem is returned to Python as a unicode object.

In my case (having sent myself an email containing the
characters you mention) it looks like this:

u'\u215b\u215c\u215d\u215e'

Exactly how these chars are displayed will depend on your
console, locale settings etc. If you want to replace
them with decimals, you can simply do this, e.g.:

body = message.Body.replace (u"\u215b", u"0.125")
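
To handle all four fractions at once, the same idea can be extended with a small lookup table. This is only a sketch: the names FRACTIONS and replace_fractions are made up for illustration, and the sample string stands in for whatever message.Body actually returns.

```python
# Hypothetical lookup table: vulgar-fraction characters -> decimal strings.
FRACTIONS = {
    u"\u215b": u"0.125",  # one eighth
    u"\u215c": u"0.375",  # three eighths
    u"\u215d": u"0.625",  # five eighths
    u"\u215e": u"0.875",  # seven eighths
}


def replace_fractions(text):
    """Return text with each fraction character replaced by its decimal."""
    for char, decimal in FRACTIONS.items():
        text = text.replace(char, decimal)
    return text


# Stand-in for message.Body (assumption: Body is already a unicode string).
sample = u"Bolt size: 3\u215c in"
body = replace_fractions(sample)
```

Because the replacement works on the unicode string directly, no encoding step is involved.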

You don't need to encode it to anything unless
you have some other reason to do that.
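
For what it's worth, the byte sequences quoted in the question are just the UTF-8 encodings of those same characters, and question marks are what you typically get when the text is encoded to something that can't represent them. A rough illustration, assuming (and this is only a guess at what happened) that an ASCII encode with "replace" was applied somewhere along the way:

```python
# The four fraction characters as a unicode string.
text = u"\u215b\u215c\u215d\u215e"

# Encoding to UTF-8 yields the three-byte sequences from the question.
utf8_bytes = text.encode("utf-8")

# Encoding to ASCII with errors="replace" substitutes "?" for each
# character that ASCII cannot represent.
ascii_bytes = text.encode("ascii", "replace")

# Decoding the UTF-8 bytes gives back the original unicode string.
round_trip = utf8_bytes.decode("utf-8")
```

So the data in Body is intact; only a lossy encode (or a console that can't show the glyphs) turns it into question marks.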

TJG
_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32
