[issue4661] email.parser: impossible to read messages encoded in a different encoding

R. David Murray Thu, 30 Sep 2010 19:49:55 -0700

R. David Murray <[email protected]> added the comment:

New version of the patch that adds many more tests, and handles non-ASCII bytes 
in header values by changing them to '?'s when the header value is retrieved as 
a string.  I think I'm half done.  Still to do: generate_bytes, and the doc 
updates.
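
To illustrate the '?' substitution described above, here is a minimal sketch. It is not the code from the patch: the helper name `header_as_str` and the pattern name `_ESCAPED_BYTE` are made up for this example, and it assumes header bytes were decoded as ASCII with the surrogateescape error handler, so each non-ASCII byte is stored as a lone surrogate in the range U+DC80 to U+DCFF.

```python
import re

# Hypothetical pattern: matches the lone surrogates that surrogateescape
# produces for the non-ASCII bytes 0x80-0xFF.
_ESCAPED_BYTE = re.compile("[\udc80-\udcff]")


def header_as_str(value):
    """Return the header value with every escaped byte shown as '?'."""
    return _ESCAPED_BYTE.sub("?", value)


# A header value containing one raw non-ASCII byte (0xF6).
raw = b"J\xf6rg <jorg@example.com>"
stored = raw.decode("ascii", errors="surrogateescape")

print(header_as_str(stored))
```

The stored value still carries the original byte, so only the string view of the header is lossy.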


By the way, another important reason to use surrogateescape rather than latin1 
is that if I miss something and strings containing escaped bytes leak out, it 
will be obvious that that is what happened.  Otherwise we're back in Python 2 
bytes/string conflation land.
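
A small sketch of the difference, using only the standard codecs (the sample bytes are invented for illustration): latin1 turns every byte into an ordinary-looking character, while surrogateescape leaves a lone surrogate that fails as soon as the string is encoded normally.

```python
raw = b"Subject: caf\xe9"

# latin1 maps every byte to a code point, so the result looks like
# perfectly ordinary text and nothing marks it as undecoded bytes.
as_latin1 = raw.decode("latin-1")

# surrogateescape maps the non-ASCII byte 0xE9 to the lone surrogate
# U+DCE9, which cannot occur in well-formed text.
as_escaped = raw.decode("ascii", errors="surrogateescape")

# An escaped string that leaks out fails when it is encoded normally.
try:
    as_escaped.encode("utf-8")
    leak_detected = False
except UnicodeEncodeError:
    leak_detected = True

# The original bytes are still recoverable on purpose.
round_trip = as_escaped.encode("ascii", errors="surrogateescape")
```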

I of course make no promises about performance.  And there is an issue there in 
that every header value access is now wrapped in an additional function call 
and a regex test, at a minimum, whether there are bytes present in the input or 
not :(
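
The overhead can be pictured with the following sketch. The `Headers` class and its methods are hypothetical and only illustrate the shape of the cost: every lookup pays for an extra function call and a regex search, even when the value is pure ASCII and is returned unchanged.

```python
import re

# Hypothetical pattern for bytes escaped by surrogateescape.
_ESCAPED_BYTE = re.compile("[\udc80-\udcff]")


class Headers:
    """Toy header store; not the email package's real data structure."""

    def __init__(self):
        self._values = {}

    def set_raw(self, name, raw):
        # Keep non-ASCII bytes as lone surrogates.
        self._values[name.lower()] = raw.decode(
            "ascii", errors="surrogateescape")

    def get(self, name):
        value = self._values[name.lower()]
        # This search runs on every access, clean input or not.
        if _ESCAPED_BYTE.search(value) is None:
            return value
        return _ESCAPED_BYTE.sub("?", value)


headers = Headers()
headers.set_raw("To", b"someone@example.com")
headers.set_raw("Subject", b"caf\xe9")
```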

----------
Added file: http://bugs.python.org/file19078/email_parse_bytes2.diff

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue4661>
_______________________________________