[issue31831] EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output

2017-12-03 Thread R. David Murray

R. David Murray  added the comment:

The PR has been committed.

--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed
type: crash -> behavior
versions:  -Python 3.4, Python 3.5

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue31831] EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output

2017-10-22 Thread R. David Murray

R. David Murray  added the comment:

Great, thank you for that research.  And yes, that's exactly why __str__ uses
utf8=True: the "picture" of the message is much more readable.  I will commit
that PR soon.

--




[issue31831] EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output

2017-10-21 Thread calimeroteknik

calimeroteknik  added the comment:

In the end there is no bug; I was just confused by the output of print() on the
EmailMessage.

I noticed that in email/_header_value_parser.py policy.utf8 was True.
The reason is found in email/message.py line 970 (class MIMEPart):

    def __str__(self):
        return self.as_string(policy=self.policy.clone(utf8=True))

print() will use __str__() and this is why it happens.

I didn't dig out the exact reason since there are so many delegated calls.
In any case, the flattened message in smtplib.SMTP does contain what 
as_string() returns, which means that the policy.utf8 is only forced when using 
print().

Sorry for the false alert.
I can guess that the intention in forcing policy.utf8=True in __str__() was 
that SMTPUTF8 output is visually prettier than any ASCII-armored text.

After additional fuzzing, checking the output with EmailMessage.as_string(), 
everything seems OK.

That's a +1 for gh-3488, which fixes this bug.
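
The difference between print() and as_string() described in this comment can be reproduced with a short sketch. It assumes a Python version that already contains the fix from gh-3488; the variable names `flattened` and `pretty` are illustrative only.

```python
import email.message

mail = email.message.EmailMessage()
mail.add_attachment(b"test", maintype="text", subtype="plain",
                    filename="é.txt")

# as_string() uses the message's own policy (utf8=False by default),
# so the non-ASCII filename is RFC 2231 encoded and the text stays ASCII.
flattened = mail.as_string()

# str() and print() go through __str__, which clones the policy with
# utf8=True, so the raw character is shown instead.
pretty = str(mail)

print("é" in flattened)
print("é" in pretty)
```

Only the second check should find the raw character, which matches the observation that the policy is forced to utf8=True only when print() is used.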

--




[issue31831] EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output

2017-10-21 Thread R. David Murray

R. David Murray  added the comment:

You are correct, that is a bug.  Presumably I forgot to check for non-ascii 
when the parameter value doesn't need to be folded.  I'm not sure when I'll 
have time to look at this, unfortunately :(.  If you can see how to fix it, you 
could submit a PR against my PR branch, I think.

--




[issue31831] EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output

2017-10-21 Thread calimeroteknik

calimeroteknik  added the comment:

I confirm that as for the crash, the patch in gh-3488 fixes it.
The first code excerpt in my initial report now outputs the following, valid 
headers:

Content-Type: text/plain
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*0*=utf-8''I%20thought%20I%20could%20put%20a%20few%20words%20in%20th;
 filename*1*=e%20filename%20but%20apparently%20it%20does%20not%20go%20so%20we;
 filename*2*=ll.txt
MIME-Version: 1.0


However, when the filename is short and contains non-ASCII characters, things
don't look right. This code:

import email.message
mail = email.message.EmailMessage()
mail.add_attachment(b"test", maintype="text", subtype="plain", filename="é.txt")
print(mail)

Results in these headers:

Content-Type: text/plain
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="é.txt"
MIME-Version: 1.0

To begin with, nothing in this header tells the reader that the 'é' character
is encoded as UTF-8.
Moreover, it is emitted as two 8-bit values, at least one of which is
detectably outside of 7-bit US-ASCII.


Quoting https://tools.ietf.org/html/rfc2231#page-4:
>a lightweight encoding mechanism is needed to accommodate 8-bit information in 
>parameter values.

The 8-bit data passes straight through instead of undergoing that encoding
process, which my reading of RFC 2231 seems to require.
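
For illustration, the lightweight encoding that RFC 2231 describes can be produced with the standard library's `email.utils.encode_rfc2231`. This is a minimal sketch of the encoding itself, not of what the email package does internally when it folds headers; the variable names are illustrative.

```python
from email.utils import encode_rfc2231
from urllib.parse import unquote

# Percent-encode the UTF-8 bytes and prefix charset and (empty) language.
encoded = encode_rfc2231("é.txt", charset="utf-8")
print(encoded)  # utf-8''%C3%A9.txt

# Decoding reverses the steps: split off charset and language, then
# percent-decode the remainder using that charset.
charset, language, data = encoded.split("'", 2)
decoded = unquote(data, encoding=charset)
print(decoded)  # é.txt
```

A parameter carrying such a value is written with a trailing asterisk on its name, for example filename*=utf-8''%C3%A9.txt, which keeps the header 7-bit clean.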

--




[issue31831] EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output

2017-10-20 Thread R. David Murray

R. David Murray  added the comment:

Does the patch in gh-3488 fix this?  I think it should, or if it doesn't that's 
a bug in the PR patch.

--




[issue31831] EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output

2017-10-20 Thread calimeroteknik

calimeroteknik  added the comment:

Erratum: the output generated by python 3.5 and 3.4 causes line wraps in the
SMTP delivery chain, which cause exactly the same breakage as in later
versions: the crucially needed indentation of one space ends up being absent.
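
Why the single space matters can be shown by parsing two hand-written header blocks with the standard library parser. This is a sketch under the assumption that a receiving mail client treats unindented lines the way Python's parser does; the sample headers are made up for the illustration.

```python
from email import message_from_string
from email.policy import default

# Continuation line indented by one space: a valid folded header.
folded = (
    "Content-Disposition: attachment;\n"
    " filename*=utf-8''touch%C3%A9.txt\n"
    "\n"
)
# The same text with the leading space lost somewhere along the way.
unindented = (
    "Content-Disposition: attachment;\n"
    "filename*=utf-8''touch%C3%A9.txt\n"
    "\n"
)

name_folded = message_from_string(folded, policy=default).get_filename()
name_unindented = message_from_string(unindented, policy=default).get_filename()
print(repr(name_folded))
print(repr(name_unindented))
```

Without the indentation the second line is no longer part of the Content-Disposition header, so the parser finds no filename at all.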

--
versions: +Python 3.4, Python 3.5




[issue31831] EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output

2017-10-20 Thread calimeroteknik

New submission from calimeroteknik :

The following code excerpt demonstrates a crash:

import email.message
mail = email.message.EmailMessage()
mail.add_attachment(
   b"test",
   maintype = "text",
   subtype  = "plain",
   filename = "I thought I could put a few words in the filename but apparently it does not go so well.txt"
)
print(mail)

Output on python 3.7.0a1: 
https://gist.github.com/altendky/33c235e8a693235acd0551affee0a4f6
Output on python 3.6.2: https://oremilac.tk/paste/python-rfc2231-oops.log


Additionally, a behavioral issue is demonstrated by replacing in the above:
filename = "What happens if we try French in here? touché!.txt"


Which results in the following output (headers):

Content-Type: text/plain
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename*=utf-8''What%20happens%20if%20we%20try%20French%20in%20here%3F%20touch%C3%A9%21.txt
MIME-Version: 1.0


Instead of, for example, this correct output (by Mozilla Thunderbird here):

Content-Type: text/plain; charset=UTF-8;
 name="=?UTF-8?Q?What_happens_if_we_try_French_in_here=3f_touch=c3=a9!.txt?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*0*=utf-8''%57%68%61%74%20%68%61%70%70%65%6E%73%20%69%66%20%77%65;
 filename*1*=%20%74%72%79%20%46%72%65%6E%63%68%20%69%6E%20%68%65%72%65%3F;
 filename*2*=%20%74%6F%75%63%68%C3%A9%21%2E%74%78%74


Issues to note here:
- the "filename" parameter is not indented, so mail clients ignore it
- the "filename" parameter is not split according to RFC 2231

The relevant standard is exemplified in section 4.1 of 
https://tools.ietf.org/html/rfc2231#page-5
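
As a rough sketch of the splitting that RFC 2231 describes, the helper below folds one parameter into numbered, indented sections. `split_rfc2231_param` is a hypothetical function written for this illustration, not part of the email package and not the code from gh-3488. It also always emits numbered sections, even when a single `filename*=` would do.

```python
import re
from urllib.parse import quote, unquote


def split_rfc2231_param(name, value, charset="utf-8", maxlen=78):
    """Hypothetical helper: fold one parameter into RFC 2231 sections."""
    encoded = quote(value, safe="", encoding=charset)
    # Tokenize so that a %XX triplet is never split across two sections.
    tokens = re.findall(r"%[0-9A-F]{2}|[^%]", encoded)
    sections = []
    current = f"{charset}''"  # only the first section carries the charset
    for token in tokens:
        # Budget: " name*N*=" + chunk + trailing ";" must fit in maxlen.
        prefix_len = len(f" {name}*{len(sections)}*=")
        if current and prefix_len + len(current) + len(token) + 1 > maxlen:
            sections.append(current)
            current = ""
        current += token
    sections.append(current)
    # Every line starts with a space, as header folding requires.
    return ";\n".join(
        f" {name}*{i}*={chunk}" for i, chunk in enumerate(sections)
    )


value = "What happens if we try French in here? touché!.txt"
folded = split_rfc2231_param("filename", value)
print("Content-Disposition: attachment;\n" + folded)

# Round trip: strip the section labels, join the chunks, percent-decode.
chunks = [line.split("=", 1)[1].rstrip(";") for line in folded.split("\n")]
restored = unquote("".join(chunks).split("''", 1)[1])
```

Unlike the Thunderbird output quoted above, this sketch percent-encodes only the characters that need it, so the section boundaries fall in different places.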


Python 3.4.6 and 3.5.4 simply do not wrap anything, which works with  but is 
not conformant to standards.


Solving all of the above would imply correctly splitting any header.
Function "set_param" in /usr/lib/python*/email/message.py looked like a place 
to look.

Unfortunately I do not understand what's going on there very well.


One more misbehaviour to note: run the above print statement twice.
The two results are not identical; the second time you get the following output:

Content-Type: text/plain
Content-Transfer-Encoding: base64
Content-Disposition: 
attachment;*=utf-8''What%20happens%20if%20we%20try%20French%20in%20here%3F%20touch%C3%A9%21.txt
MIME-Version: 1.0

It would appear that "filename" has disappeared.
The issue does not reveal itself with simple values for the 'filename' 
argument, e.g. "test.txt".
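
A quick way to check for this kind of non-idempotent output is to serialize the same message twice and compare. This is a minimal sketch; on an affected version the two strings would be expected to differ, while on a version that includes the fix they should match.

```python
import email.message

mail = email.message.EmailMessage()
mail.add_attachment(
    b"test",
    maintype="text",
    subtype="plain",
    filename="What happens if we try French in here? touché!.txt",
)

# Serializing must not alter the message, so both results should match.
first = str(mail)
second = str(mail)
print(first == second)
print("filename" in second)
```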


PS: The above output also illustrates this (way more minor) issue: 
https://bugs.python.org/issue25235

--
components: email
messages: 304684
nosy: barry, calimeroteknik, r.david.murray
priority: normal
severity: normal
status: open
title: EmailMessage.add_attachment(filename="long or spécial") crashes or 
produces invalid output
type: crash
versions: Python 3.6, Python 3.7
