On Tue, 29 Dec 2020 05:38:53 -0800 (PST), nikhil k wrote:
...[snip]...
> import win32com.client as win32
>
> ########### Functions
> def getMailBody(msgFile):
>     start_text = "<html>"
>     end_text = "</html>"
>     with open(msgFile) as f:
>         data=f.read()
>     return data[data.find(start_text):data.find(end_text)+len(end_text)]
...[snip]...
>
>
> Below is the error I'm getting.
>==============================================
>   File "C:\Python\Python38-32\lib\encodings\cp1252.py", line 23, in decode
>     return codecs.charmap_decode(input,self.errors,decoding_table)[0]
> UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 924:
> character maps to <undefined>

I'm not completely sure that it's the f.read() call that produces the error, because you cut out the earlier lines of the traceback, but I think the problem is that the bytes in the file you're reading are not a valid encoding of any character string under the encoding Python is using.

The traceback shows which encoding that is: cp1252, evidently the default on your system when open() is called without an encoding argument. Byte 0x81 is one of the few values that cp1252 leaves undefined, hence "character maps to <undefined>".

I don't know exactly how Outlook lays out messages in its .msg files, but they are a binary format, so it's likely that text is packed alongside binary data. Maybe you can find a Python package that reads .msg files. Failing that, you could open the file in binary mode and work with a bytestring instead of a character string.
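
For what it's worth, here is a minimal sketch of the bytestring approach. It is untested against real Outlook files; the name get_mail_body and the marker variables are only my renaming of your function, and it assumes the HTML body is stored in the .msg file as plain single-byte text, which may well not be the case (it could be compressed or stored as UTF-16, in which case the search simply finds nothing).

```python
def get_mail_body(msg_file):
    """Return the bytes from b"<html>" through b"</html>", or b"" if not found.

    Assumption: the markers appear in the file as plain single-byte text.
    """
    start_marker = b"<html>"
    end_marker = b"</html>"

    # Mode "rb" returns raw bytes, so no codec is applied and read()
    # cannot raise UnicodeDecodeError.
    with open(msg_file, "rb") as f:
        data = f.read()

    start = data.find(start_marker)
    if start == -1:
        return b""

    # Look for the closing marker only after the opening one.
    end = data.find(end_marker, start + len(start_marker))
    if end == -1:
        return b""

    return data[start:end + len(end_marker)]
```

If you need a str afterwards, you have to pick an encoding yourself, for example get_mail_body(path).decode("utf-8", errors="replace"). The "utf-8" there is a guess on my part, and errors="replace" just keeps a stray byte from raising the same exception again.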