[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-09 Thread Vajrasky Kok

Vajrasky Kok added the comment:

 So why check specifically for 8bit and base64?

I was in hurry when creating this patch. I think that was part of the debug 
code. This patch was created with the purpose to illuminate the culprit of this 
bug. And it served its purpose.

After uploading this patch, I was about to refine the patch. Then I started to 
think about shouldn't there be a more elegant way to solve this problem? 
Cloning the msg after entering the flatten message? It is supposed to be a 
clean and simple solution. Then I hit the part where the boundary computation 
intentionally modifies the message object. Then I remembered the email API had 
too much magics on it. .

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-08 Thread Vajrasky Kok

Vajrasky Kok added the comment:

Actually, I am thinking of this approach (cloning the message just after 
entering the flatten method):

diff -r b541ecd32115 Lib/email/generator.py
--- a/Lib/email/generator.pyFri Feb 07 16:11:17 2014 -0800
+++ b/Lib/email/generator.pySat Feb 08 15:55:01 2014 +0800
@@ -67,7 +67,7 @@
 # Just delegate to the file object
 self._fp.write(s)
 
-def flatten(self, msg, unixfrom=False, linesep=None):
+def flatten(self, msg_obj, unixfrom=False, linesep=None):
 rPrint the message object tree rooted at msg to the output file
 specified when the Generator instance was created.
 
@@ -86,6 +86,8 @@
 # from the msg, and _encoded_XXX constants for operating on data that
 # has already been converted (to bytes in the BytesGenerator) and
 # inserted into a temporary buffer.
+import copy
+msg = copy.copy(msg_obj)
 policy = msg.policy if self.policy is None else self.policy
 if linesep is not None:
 policy = policy.clone(linesep=linesep)

It works for this bug but apparently it fails one test.

==
FAIL: test_make_boundary (__main__.TestMessageAPI)
--
Traceback (most recent call last):
  File Lib/test/test_email/test_email.py, line 199, in test_make_boundary
'multipart/form-data; boundary===')
AssertionError: 'multipart/form-data' != 'multipart/form-data; boundary==='
- multipart/form-data
+ multipart/form-data; boundary===

So somehow, according to the ultimate design of email library, msg can not 
leave flatten method unscratched. :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-08 Thread R. David Murray

R. David Murray added the comment:

I'm guessing the problem is that copy is a shallow copy, and message objects 
can be deeply nested.  Doing a deepcopy might work, but as things are 
currently, that could cost a lot of memory, so I don't want to do that.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-08 Thread R. David Murray

R. David Murray added the comment:

Ah, it is because the boundary computation intentionally modifies the message 
object.  Which is a questionable design decision overall, but is now set in 
stone :(.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-08 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 34fb36972f8d by R David Murray in branch '3.3':
#19772: Do not mutate message when downcoding to 7bit.
http://hg.python.org/cpython/rev/34fb36972f8d

New changeset 2e97d3500970 by R David Murray in branch 'default':
Merge #19772: Do not mutate message when downcoding to 7bit.
http://hg.python.org/cpython/rev/2e97d3500970

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-08 Thread R. David Murray

Changes by R. David Murray rdmur...@bitdance.com:


--
resolution:  - fixed
stage: needs patch - committed/rejected
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-08 Thread R. David Murray

R. David Murray added the comment:

Vajrasky: thanks for your work on this, it helped me find a reasonable (if 
ugly) solution.  However, the 'if' statement in your patch that checks 
specifically for the combination of old and new cte of 8bit and base64 puzzles 
me.  The same problem occurs here if the CTE we end up with is 
quoted-printable, which will be the case, for example, for latin-1.  The 
logical way to think abut the problem here (we don't want to modify the 
message) would be if the CTE changed, we need to restore it.  So why check 
specifically for 8bit and base64?  The straightforward translation of if the 
CTE changed, we need to restore it would have been: 'if self.orig_cte != cte'.

In my solution I preferred to leave the original object unchanged, and re-copy 
it to re-change the headers when needed.  This is ugly, but it is more future 
proof against the down-coding code someday changing other headers or other 
characteristics of the message.  And I needed to use deepcopy, because copy 
would just copy a pointer to the _headers list, and therefore the original 
headers list would still get modified.  The deepcopy isn't that costly here, 
though, because we know that it is only called on a message that is not a 
multipart, since the copies are triggered only from _handle_text.  It's still 
not pretty, but I think it is the best we can do without completely 
restructuring the way the generator does its work.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-07 Thread R. David Murray

R. David Murray added the comment:

While the copy solution seems reasonable, unfortunately it looks like the 
solution isn't that simple.

Your test method didn't start with 'test_', so it didn't get run.  Your fix 
doesn't fix the problem, since other parts of the code are holding on to a 
pointer to the original message object.  But even if I change your fix to only 
make a local copy and use it to get the transformed payload, that fix breaks 
other tests, since the *headers* aren't changed in the output, even if the 
payload is.  This is because of the odd way the generator accumulates its 
output, and I unfortunately don't have time to remember how it all works in 
order to craft a fix before the 3.4 alpha :(.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2014-02-07 Thread Vajrasky Kok

Vajrasky Kok added the comment:

 Your test method didn't start with 'test_', so it didn't get run.

Ah sorry about that. This is the updated patch. It fixed the problem and passed 
all test.

--
Added file: 
http://bugs.python.org/file33987/fix_serialization_message_object_mutation_v4.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2013-11-27 Thread Vajrasky Kok

Vajrasky Kok added the comment:

I come out with two patches.

The first patch using bytes generator to copy the message before mutating it in 
string generator, then restore it after printing the mutated message.

--
keywords: +patch
Added file: 
http://bugs.python.org/file32865/fix_serialization_message_object_mutation_v1.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2013-11-27 Thread Vajrasky Kok

Vajrasky Kok added the comment:

The second patch copies the private variables of the message before mutating it 
in string generator, then restore the private variables after printing the 
mutated message.

--
Added file: 
http://bugs.python.org/file32866/fix_serialization_message_object_mutation_v2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2013-11-27 Thread Vajrasky Kok

Vajrasky Kok added the comment:

I got another inspiration.

The third patch uses copy.copy to copy the Message before mutating it in string 
generator, then restore it after printing the mutated message.

--
Added file: 
http://bugs.python.org/file32867/fix_serialization_message_object_mutation_v3.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19772] str serialization of Message object may mutate the payload and CTE.

2013-11-25 Thread R. David Murray

New submission from R. David Murray:

Currently the string generator will downcast 8bit body parts to 7bit by 
encoding them using a 7bit CTE.  This is good.  However, the way it does it is 
to modify the Message object accordingly.  This is not good.  Except for 
filling in required missing bits of data (such as MIME borders), serializing a 
Message should not mutate it.  (And, for the curious, I am the one who made 
this mistake when I wrote the code :)

--
components: Library (Lib), email
keywords: easy
messages: 204356
nosy: barry, r.david.murray, vajrasky
priority: normal
severity: normal
stage: needs patch
status: open
title: str serialization of Message object may mutate the payload and CTE.
type: behavior
versions: Python 3.3, Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19772
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com