I want to do some postprocessing on messages from a particular mailbox, so I use getmail, which fetches the messages and feeds them to the stdin of my program.
As I don't know what encoding these messages will be in, I thought it would be prudent to read stdin as binary data. Using Python 3.3 on a Debian box, I have the following code:

    #!/usr/bin/python3

    import sys
    from email import message_from_file

    sys.stdin = sys.stdin.detach()
    msg = message_from_file(sys.stdin)

which gives me the following traceback:

      File "/home/apardon/.getmail/verdeler", line 7, in <module>
        msg = message_from_file(sys.stdin)
      File "/usr/lib/python3.3/email/__init__.py", line 56, in message_from_file
        return Parser(*args, **kws).parse(fp)
      File "/usr/lib/python3.3/email/parser.py", line 58, in parse
        feedparser.feed(data)
      File "/usr/lib/python3.3/email/feedparser.py", line 167, in feed
        self._input.push(data)
      File "/usr/lib/python3.3/email/feedparser.py", line 100, in push
        data, self._partial = self._partial + data, ''
    TypeError: Can't convert 'bytes' object to str implicitly

which seems rather odd. The following headers are in the message:

    Content-Type: text/plain; charset=ISO-8859-1
    Content-Transfer-Encoding: 8bit

So why doesn't the email parser look up the charset and use that for converting to the string type? What is the canonical way to parse an email message from stdin?

-- 
Antoon Pardon
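
For reference, here is a minimal sketch of one way this might be approached. It rests on the observation that message_from_file() expects a text-mode file, while the email package also provides message_from_binary_file() for byte streams (available since Python 3.2). The names body_text and sample are hypothetical, made up for illustration. The sketch assumes a single-part message like the one described above, and uses an in-memory stream as a stand-in for stdin.

```python
import io
from email import message_from_binary_file


def body_text(stream):
    """Parse one message from a binary stream and return its body as str.

    Hypothetical helper; assumes a non-multipart message.
    """
    msg = message_from_binary_file(stream)
    # decode=True undoes the Content-Transfer-Encoding and returns bytes.
    raw = msg.get_payload(decode=True)
    # The charset comes from the Content-Type header; fall back to ASCII.
    charset = msg.get_content_charset() or "ascii"
    return raw.decode(charset, errors="replace")


# Stand-in for sys.stdin.buffer: an 8bit ISO-8859-1 message as raw bytes.
sample = io.BytesIO(
    b"Subject: test\n"
    b"Content-Type: text/plain; charset=ISO-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)
text = body_text(sample)
```

In the getmail script itself the call would presumably be body_text(sys.stdin.buffer), which leaves sys.stdin intact instead of replacing it with the result of detach().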