Yu Zhao added the comment:
BytesParser.parse uses TextIOWrapper, which by default translates universal
newlines to '\n'. This breaks binary payloads.
Fix the problem by disabling the translation.
--
components: +email -Library (Lib)
nosy: +yu.z...@getcwd.com
Added file:
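A minimal sketch of the newline translation described in this comment. It
exercises only io.TextIOWrapper, not BytesParser itself, and the sample bytes
are made up for illustration.

```python
import io

# Hypothetical payload mixing CRLF, bare CR and LF line endings.
raw = b"a\r\nb\rc\n"

# Default newline=None: universal newlines mode, '\r\n' and '\r' become '\n'.
translated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="ascii", errors="surrogateescape"
).read()

# newline="": line endings are passed through to the caller untranslated.
untranslated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="ascii", errors="surrogateescape", newline=""
).read()

print(repr(translated))    # line endings collapsed to '\n'
print(repr(untranslated))  # original line endings preserved
```

With translation disabled, encoding the text back with the same codec and
error handler reproduces the original bytes.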
R. David Murray rdmur...@bitdance.com added the comment:
I've opened issue 10686 to address improving the RFC conformance by using
unknown-8bit encoded words for 8bit bytes in headers.
--
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed
Antoine Pitrou pit...@free.fr added the comment:
There are a couple of things I don't understand:
+* :class:`~email.generator.Generator` will convert message bodies that
+ have a :mailheader:`ContentTransferEncoding` of 8bit and a known charset to
+ instead have a
R. David Murray rdmur...@bitdance.com added the comment:
Generator converts 8bit bodies into 7bit bodies by applying an appropriate 7bit
CTE. The reason it does this is that the output of Generator will often be
passed to some other Python library function (most often smtplib) that can only
Antoine Pitrou pit...@free.fr added the comment:
Generator converts 8bit bodies into 7bit bodies by applying an
appropriate 7bit CTE. The reason it does this is that the output of
Generator will often be passed to some other Python library function
(most often smtplib) that can only handle
R. David Murray rdmur...@bitdance.com added the comment:
Even if smtplib accepted bytes (it currently does not), *Generator* is still
producing unicode, and should produce valid unicode and still insofar as
possible preserve the meaning of the original message. This means unicode acts
as if
Antoine Pitrou pit...@free.fr added the comment:
Even if smtplib accepted bytes (it currently does not),
That sounds like a critical failure.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4661
R. David Murray rdmur...@bitdance.com added the comment:
I can only fix one package at a time :)
And in case it isn't clear, the fact that the Generator produces ASCII-only
unicode, which is in many ways a rather strange API, is one of the chief
motivations for email6.
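A small sketch of the ASCII-only output mentioned here, as observed with the
email package in current Python 3 releases. The sample message and addresses
are invented for illustration.

```python
import email
from email.generator import Generator
from io import StringIO

# Hypothetical message with an 8bit body in a declared charset.
raw = (
    b"From: a@example.com\n"
    b"To: b@example.com\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="latin-1"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)

msg = email.message_from_bytes(raw)

# Generator writes str, so it has to re-encode the 8bit body
# with a 7bit-safe transfer encoding before emitting it.
out = StringIO()
Generator(out).flatten(msg)
text = out.getvalue()

print(text)
print(text.isascii())
```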
--
R. David Murray rdmur...@bitdance.com added the comment:
Here is the final pre-alpha patch. This one includes the BytesFeedParser class
and a test.
Unless there are objections I'd like to commit this. Believing the code needs
a more thorough review would be a valid objection :)
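For readers unfamiliar with the class, a minimal sketch of how BytesFeedParser
can be used in current Python 3 releases. The chunk boundaries and message
content are assumptions chosen for illustration.

```python
from email.feedparser import BytesFeedParser

# Hypothetical chunks, e.g. as they might arrive from a socket.
chunks = [b"Subject: chu", b"nked\n", b"\n", b"body caf\xe9\n"]

parser = BytesFeedParser()
for chunk in chunks:
    parser.feed(chunk)      # chunks need not end on line boundaries
msg = parser.close()        # returns the parsed message object

print(msg["Subject"])
print(msg.get_payload(decode=True))  # body as bytes, non-ASCII byte intact
```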
--
R. David Murray rdmur...@bitdance.com added the comment:
After RM approval on irc, committed in r85322, with some additional doc fixes
but no code changes relative to the last patch posted here.
I'm leaving this open because I still want to try to improve the handling of
non-ascii bytes in
R. David Murray rdmur...@bitdance.com added the comment:
Here is an updated patch incorporating the Rietveld feedback and feedback from
python-dev about the API. Now we have BytesParser instead of Parser with a
parsebytes method, and a message_from_binary_file helper. Generator also now
Changes by Tony Meyer anadelonb...@users.sourceforge.net:
--
nosy: +anadelonbrin
R. David Murray rdmur...@bitdance.com added the comment:
Version 4 of patch, now including doc updates.
The patch set is now complete.
--
Added file: http://bugs.python.org/file19110/email_parse_bytes4.diff
R. David Murray rdmur...@bitdance.com added the comment:
Rietveld issue, with a small doc addition compared to patch4:
http://codereview.appspot.com/2362041
--
R. David Murray rdmur...@bitdance.com added the comment:
Upload svn patch, so that Martin's new rietveld support will (hopefully) create
an automatic review link.
--
Added file: http://bugs.python.org/file19113/email_parse_bytes5.diff
R. David Murray rdmur...@bitdance.com added the comment:
New version of patch including a BytesGenerator.
--
Added file: http://bugs.python.org/file19102/email_parse_bytes3.diff
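A minimal sketch of the bytes-in, bytes-out path using message_from_bytes and
BytesGenerator as they exist in current Python 3 releases. The message content
and addresses are made up for illustration.

```python
import email
from email.generator import BytesGenerator
from io import BytesIO

# Hypothetical message with a non-ASCII byte in an 8bit body.
raw = (
    b"From: a@example.com\n"
    b"To: b@example.com\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="latin-1"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)

msg = email.message_from_bytes(raw)

# BytesGenerator writes to a binary file object, so the 8bit body
# can be emitted without first being converted to a 7bit encoding.
out = BytesIO()
BytesGenerator(out).flatten(msg)

print(out.getvalue())
print(msg.get_payload(decode=True))
```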
R. David Murray rdmur...@bitdance.com added the comment:
In case it isn't clear, the code patch is now complete, so anyone who wants to
give it a review, please do. I'll add the docs soon, but the basic idea is you
can put bytes in by either using message_from_bytes or by using the 'ascii'
Changes by Alex Quinn aq2...@alexquinn.org:
--
nosy: -Alex Quinn
R. David Murray rdmur...@bitdance.com added the comment:
New version of the patch that adds many more tests, and handles non-ASCII bytes
in header values by changing them to '?'s when the header value is retrieved as
a string. I think I'm half done. Still to do: generate_bytes, and the doc
Changes by Dan Buch daniel.b...@gmail.com:
--
nosy: +meatballhat
R. David Murray rdmur...@bitdance.com added the comment:
OK, I'm not entirely sure I want to post this, but
Antoine and I were having a conversation about nntplib and email and I noted
that unicode as an email transmission channel acts as if it required 7bit clean
data. That is, that
Antoine Pitrou pit...@free.fr added the comment:
A couple of comments:
- what is `str(self.get_param('charset', 'ascii'))` supposed to achieve? does
get_param() return a bytes object?
- instead of ascii+surrogateescape, you could simply use latin1
--
nosy: +pitrou
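The two decodings raised in this comment can be compared with a short sketch.
Both round-trip arbitrary bytes; one observable difference is that
ascii+surrogateescape leaves a detectable marker for every non-ASCII byte,
while latin1 yields ordinary characters. Whether that is the property the
patch relies on is an assumption here, and has_escaped_bytes is a hypothetical
helper, not part of the email package.

```python
raw = b"caf\xe9"

via_surrogate = raw.decode("ascii", "surrogateescape")
via_latin1 = raw.decode("latin-1")

def has_escaped_bytes(text):
    """Return True if text contains lone surrogates produced by surrogateescape."""
    return any("\udc80" <= ch <= "\udcff" for ch in text)

# Both approaches recover the original bytes.
print(via_surrogate.encode("ascii", "surrogateescape") == raw)
print(via_latin1.encode("latin-1") == raw)

# Only the surrogateescape result reveals that undecodable bytes were present.
print(has_escaped_bytes(via_surrogate))
print(has_escaped_bytes(via_latin1))
```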
R. David Murray rdmur...@bitdance.com added the comment:
The 'str' around get_param shouldn't be there, that was left over from an
earlier version of the patch.
I use surrogateescape rather than latin1 because using surrogateescape with
ascii encoding gives me a reliable way to know whether
Changes by Barry A. Warsaw ba...@python.org:
--
assignee: barry -> r.david.murray
Timothy Farrell tfarr...@swgen.com added the comment:
Just an update for people interested:
The email team has a goal of fixing the email module in time for the 3.2
release. There is the possibility of having to change some interfaces.
See this document:
Alex Quinn aq2...@alexquinn.org added the comment:
This bug also prevents the cgi module from handling POST data with
multipart/form-data. Consequently, 3.x cannot be readily used to write
web apps for uploading files. See #4953:
http://bugs.python.org/issue4953
--
nosy: +Alex
Changes by Timothy Farrell tfarr...@swgen.com:
--
nosy: +tercero12
R. David Murray rdmur...@bitdance.com added the comment:
Is there any use case for feedparser accepting strings as input that
isn't a design error waiting to bite the programmer?
--
nosy: +r.david.murray
priority: -> high
stage: -> test needed
type: -> behavior
versions: +Python 3.1,
Barry A. Warsaw ba...@python.org added the comment:
dato: We've started some branches that try to address this, by exposing
both a read-a-buncha-bytes interface and a read-a-string interface.
rdm: As it turns out, yes. There are use cases for reading a string
containing only ascii bytes.
In
New submission from Adeodato Simó d...@net.com.org.es:
Currently, email.parser/feedparser can only parse messages that come
as a string, or from a file opened in text mode.
Email messages, however, can contain 8bit characters in any encoding
other than the local one (yet still be valid
Changes by Benjamin Peterson musiccomposit...@gmail.com:
--
assignee: -> barry
nosy: +barry