[issue4661] email.parser: impossible to read messages encoded in a different encoding

2014-10-02 Thread Yu Zhao
Yu Zhao added the comment: BytesParser.parse uses TextIOWrapper, which by default translates universal newlines to '\n'. This breaks binary payloads. Fix the problem by disabling the translation. -- components: +email -Library (Lib) nosy: +yu.z...@getcwd.com Added file:
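A minimal sketch of the newline translation described in this comment, using only io.TextIOWrapper from the standard library. The sample bytes and variable names are made up for illustration, and this is not the attached patch (its contents are not shown here); it only shows what passing newline="" changes.

```python
import io

# Hypothetical sample data with three different line endings.
raw = b"line one\r\nline two\rline three\n"

# Default (newline=None): universal newlines, every ending is read as '\n'.
translated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="ascii", errors="surrogateescape"
).read()

# newline="": line endings are handed back exactly as they appear in the input.
untranslated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="ascii", errors="surrogateescape", newline=""
).read()

print(repr(translated))
print(repr(untranslated))
```

With the default setting, a bare '\r' inside a binary payload is silently rewritten, so the bytes that come out no longer match the bytes that went in.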

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-12-12 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: I've opened issue 10686 to address improving the RFC conformance by using unknown-8bit encoded words for 8bit bytes in headers. -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-08 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: There are a couple of things I don't understand: +* :class:`~email.generator.Generator` will convert message bodies that + have a :mailheader:`ContentTransferEncoding` of 8bit and a known charset to + instead have a

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-08 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Generator converts 8bit bodies into 7bit bodies by applying an appropriate 7bit CTE. The reason it does this is that the output of Generator will often be passed to some other Python library function (most often smtplib) that can only
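A rough sketch of the conversion described in this comment, assuming the email package as it ships in current Python 3 with the default policy. The sample message is invented for illustration. It flattens the same 8bit message with Generator (text output) and with BytesGenerator (the bytes counterpart added later in this thread).

```python
import email
import io
from email.generator import BytesGenerator, Generator

# Hypothetical message: UTF-8 body declared as 8bit.
raw = (
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xc3\xa9\n"
)
msg = email.message_from_bytes(raw)

# Generator writes str, so the body is re-encoded with a 7bit-safe CTE.
text_out = io.StringIO()
Generator(text_out).flatten(msg)

# BytesGenerator writes bytes and can keep the original 8bit body.
bytes_out = io.BytesIO()
BytesGenerator(bytes_out).flatten(msg)

print(text_out.getvalue())
print(bytes_out.getvalue())
```

The text output can be encoded as ASCII without error, which is what a consumer that only handles 7bit data needs; the bytes output still contains the original UTF-8 sequence.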

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-08 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: > Generator converts 8bit bodies into 7bit bodies by applying an appropriate 7bit CTE. The reason it does this is that the output of Generator will often be passed to some other Python library function (most often smtplib) that can only handle

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-08 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Even if smtplib accepted bytes (it currently does not), *Generator* is still producing unicode, and should produce valid unicode and still insofar as possible preserve the meaning of the original message. This means unicode acts as if

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-08 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment:
> Even if smtplib accepted bytes (it currently does not),
That sounds like a critical failure.

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-08 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: I can only fix one package at a time :) And in case it isn't clear, the fact that the Generator produces ASCII-only unicode, which is in many ways a rather strange API, is one of the chief motivations for email6.

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-08 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Here is the final pre-alpha patch. This one includes the BytesFeedParser class and a test. Unless there are objections I'd like to commit this. Believing the code needs a more thorough review would be a valid objection :)

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-08 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: After RM approval on irc, committed in r85322, with some additional doc fixes but no code changes relative to the last patch posted here. I'm leaving this open because I still want to try to improve the handling of non-ascii bytes in

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-07 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Here is an updated patch incorporating the Rietveld feedback and feedback from python-dev about the API. Now we have BytesParser instead of Parser with a parsebytes method, and a message_from_binary_file helper. Generator also now
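A short usage sketch of the two entry points named in this comment, assuming they behave as in current Python 3. The sample message and variable names are made up for illustration.

```python
import email
import io
from email.parser import BytesParser

# Hypothetical message: Latin-1 body declared as 8bit.
raw = (
    b"From: a@example.com\n"
    b"Content-Type: text/plain; charset=latin-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)

# Parse directly from a bytes object.
msg_from_bytes = BytesParser().parsebytes(raw)

# Parse from a file-like object opened in binary mode.
msg_from_file = email.message_from_binary_file(io.BytesIO(raw))

# decode=True returns the body as bytes; the caller picks the charset.
body = msg_from_bytes.get_payload(decode=True)
charset = msg_from_bytes.get_content_charset()
print(body.decode(charset))
```

Neither call requires the caller to guess a text encoding up front, which was the problem the original report describes for input opened in text mode.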

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-04 Thread Tony Meyer
Changes by Tony Meyer anadelonb...@users.sourceforge.net: -- nosy: +anadelonbrin

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-02 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Version 4 of patch, now including doc updates. The patch set is now complete. -- Added file: http://bugs.python.org/file19110/email_parse_bytes4.diff

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-02 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Rietveld issue, with a small doc addition compared to patch 4: http://codereview.appspot.com/2362041

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-02 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Upload svn patch, so that Martin's new rietveld support will (hopefully) create an automatic review link. -- Added file: http://bugs.python.org/file19113/email_parse_bytes5.diff

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-01 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: New version of patch including a BytesGenerator. -- Added file: http://bugs.python.org/file19102/email_parse_bytes3.diff

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-01 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: In case it isn't clear, the code patch is now complete, so anyone who wants to give it a review, please do. I'll add the docs soon, but the basic idea is you can put bytes in by either using message_from_bytes or by using the 'ascii'

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-10-01 Thread Alex Quinn
Changes by Alex Quinn aq2...@alexquinn.org: -- nosy: -Alex Quinn

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-09-30 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: New version of the patch that adds many more tests, and handles non-ASCII bytes in header values by changing them to '?'s when the header value is retrieved as a string. I think I'm half done. Still to do: generate_bytes, and the doc

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-09-23 Thread Dan Buch
Changes by Dan Buch daniel.b...@gmail.com: -- nosy: +meatballhat

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-09-21 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: OK, I'm not entirely sure I want to post this, but Antoine and I were having a conversation about nntplib and email and I noted that unicode as an email transmission channel acts as if it required 7bit clean data. That is, that

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-09-21 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: A couple of comments:
- What is `str(self.get_param('charset', 'ascii'))` supposed to achieve? Does get_param() return a bytes object?
- Instead of ascii+surrogateescape, you could simply use latin1.
-- nosy: +pitrou

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-09-21 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: The 'str' around get_param shouldn't be there, that was left over from an earlier version of the patch. I use surrogateescape rather than latin1 because using surrogateescape with ascii encoding gives me a reliable way to know whether
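A small sketch of the difference between the two decodings discussed here. The sample bytes and the helper name has_escaped_bytes are hypothetical. Since the comment above is cut off, the sketch assumes the point is that surrogateescape makes it possible to tell afterwards whether the input contained non-ASCII bytes.

```python
# Hypothetical header line containing one non-ASCII byte.
raw = b"Subject: caf\xe9\n"

as_latin1 = raw.decode("latin-1")
as_escaped = raw.decode("ascii", "surrogateescape")


def has_escaped_bytes(text):
    """Return True if text holds surrogates produced by surrogateescape."""
    # surrogateescape maps bytes 0x80..0xFF to U+DC80..U+DCFF.
    return any("\udc80" <= ch <= "\udcff" for ch in text)


print(has_escaped_bytes(as_latin1))
print(has_escaped_bytes(as_escaped))

# Both decodings round-trip back to the original bytes.
assert as_latin1.encode("latin-1") == raw
assert as_escaped.encode("ascii", "surrogateescape") == raw
```

With latin1, byte 0xE9 becomes an ordinary 'é' that is indistinguishable from real text, so the information that the source was undecoded 8bit data is lost. With ascii plus surrogateescape, the escaped code points can never appear in legitimately decoded text, so they act as a marker.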

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-05-05 Thread Barry A. Warsaw
Changes by Barry A. Warsaw ba...@python.org: -- assignee: barry -> r.david.murray

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2009-11-15 Thread Timothy Farrell
Timothy Farrell tfarr...@swgen.com added the comment: Just an update for people interested: The email team has a goal of fixing the email module in time for the 3.2 release. There is the possibility of having to change some interfaces. See this document:

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2009-08-19 Thread Alex Quinn
Alex Quinn aq2...@alexquinn.org added the comment: This bug also prevents the cgi module from handling POST data with multipart/form-data. Consequently, 3.x cannot be readily used to write web apps for uploading files. See #4953: http://bugs.python.org/issue4953 -- nosy: +Alex

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2009-08-18 Thread Timothy Farrell
Changes by Timothy Farrell tfarr...@swgen.com: -- nosy: +tercero12

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2009-06-18 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: Is there any use case for feedparser accepting strings as input that isn't a design error waiting to bite the programmer? -- nosy: +r.david.murray priority: -> high stage: -> test needed type: -> behavior versions: +Python 3.1,

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2009-06-18 Thread Barry A. Warsaw
Barry A. Warsaw ba...@python.org added the comment: dato: We've started some branches that try to address this, by exposing both a read-a-buncha-bytes interface and a read-a-string interface. rdm: As it turns out, yes. There are use cases for reading a string containing only ascii bytes. In

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2008-12-14 Thread Adeodato Simó
New submission from Adeodato Simó d...@net.com.org.es: Currently, email.parser/feedparser can only parse messages that come as a string, or from a file opened in text mode. Email messages, however, can contain 8bit characters in any encoding other than the local one (yet still be valid

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2008-12-14 Thread Benjamin Peterson
Changes by Benjamin Peterson musiccomposit...@gmail.com: -- assignee: -> barry nosy: +barry