New submission from Rik:

If you look at the `header_encode` method in the `Charset` class in `email.charset`, you'll see that depending on the `header_encoding` that is set on the `Charset` instance, it will either encode it using base64 or quoted-printable (QP):

http://hg.python.org/cpython/file/3a1db0d2747e/Lib/email/charset.py#l351

However, QP always uses `maxlinelen=None` and base64 doesn't. This results in the following behaviour:

- If you use base64 encoding and your header is longer than the default `maxlinelen`, it will be split over multiple lines.
- If you use QP encoding with the same header, it doesn't get split over multiple lines.

You can easily test it with this snippet:

    from email.charset import Charset, BASE64, QP

    header = (
        'tejkstj tlkjes takldjf aseio neaoiflk asnfoieas nflkdan foeias '
        'naskln ioeasn kldan flkansoie naslk dnaslk fndaslk fneoisaf '
        'neklasn dfklasnf oiasenf lkadsn lkfanldk fas dfknaioe nas'
    )

    charset = Charset('utf-8')

    # base64: the result is split once it exceeds the default maxlinelen
    charset.header_encoding = BASE64
    print 'BASE64:'
    print charset.header_encode(header)

    # QP: the result stays on a single line, however long it is
    charset.header_encoding = QP
    print 'QP:'
    print charset.header_encode(header)

Which will output:

    BASE64:
    =?utf-8?b?dGVqa3N0aiB0bGtqZXMgdGFrbGRqZiBhc2VpbyBuZWFvaWZsayBhc25mb2llYXMg?=
     =?utf-8?b?bmZsa2RhbiBmb2VpYXMgbmFza2xuIGlvZWFzbiBrbGRhbiBmbGthbnNvaWUgbmFz?=
     =?utf-8?b?bGsgZG5hc2xrIGZuZGFzbGsgZm5lb2lzYWYgbmVrbGFzbiBkZmtsYXNuZiBvaWFz?=
     =?utf-8?b?ZW5mIGxrYWRzbiBsa2ZhbmxkayBmYXMgZGZrbmFpb2UgbmFz?=
    QP:
    =?utf-8?q?tejkstj_tlkjes_takldjf_aseio_neaoiflk_asnfoieas_nflkdan_foeias_naskln_ioeasn_kldan_flkansoie_naslk_dnaslk_fndaslk_fneoisaf_neklasn_dfklasnf_oiasenf_lkadsn_lkfanldk_fas_dfknaioe_nas?=

This is inconsistent behavior. Aside from that, I think the `header_encode` method should accept a `maxlinelen` argument that defaults to an appropriate value (probably 76), but which you can override at will.

This is (I think) also necessary because the `Header` class in `email.header` has a `maxlinelen` attribute that is used for the same purpose. Normally this works fine, but when you specify a charset for your header, it uses the `Charset` class and the `maxlinelen` is lost.
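To make the proposal concrete, here is a minimal sketch of what a consistent encoder could look like. It is not the standard library implementation: `header_encode_sketch`, `_encoded_word` and `_q_encode` are hypothetical names, the set of characters left unencoded in QP is a conservative assumption, and `maxlinelen` is assumed to bound the length of each encoded word. The point is only that base64 and QP go through the same splitting logic, and that `maxlinelen=None` disables splitting for both.

```python
import base64

# Assumption: a conservative set of bytes that stay literal in a Q-encoded word.
_Q_SAFE = frozenset(bytearray(
    b'abcdefghijklmnopqrstuvwxyz'
    b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    b'0123456789!*+-/'
))


def _q_encode(data):
    """Q-encode raw bytes: safe bytes stay, space becomes '_', the rest '=XX'."""
    out = []
    for byte in bytearray(data):
        if byte in _Q_SAFE:
            out.append(chr(byte))
        elif byte == 0x20:
            out.append('_')
        else:
            out.append('=%02X' % byte)
    return ''.join(out)


def _encoded_word(text, charset, encoding):
    """Wrap text as a single =?charset?encoding?payload?= encoded word."""
    data = text.encode(charset)
    if encoding == 'b':
        payload = base64.b64encode(data).decode('ascii')
    else:
        payload = _q_encode(data)
    return '=?%s?%s?%s?=' % (charset, encoding, payload)


def header_encode_sketch(text, charset='utf-8', encoding='b', maxlinelen=76):
    """Encode text as one or more encoded words, one per line.

    Hypothetical helper: the same splitting rule applies to 'b' and 'q'.
    With maxlinelen=None the result is never split.
    """
    if encoding not in ('b', 'q'):
        raise ValueError("encoding must be 'b' or 'q'")
    words = []
    current = ''
    # Split on whole characters so a multi-byte character is never cut in two.
    for char in text:
        candidate = current + char
        too_long = (
            maxlinelen is not None
            and len(_encoded_word(candidate, charset, encoding)) > maxlinelen
        )
        if current and too_long:
            words.append(_encoded_word(current, charset, encoding))
            current = char
        else:
            current = candidate
    if current:
        words.append(_encoded_word(current, charset, encoding))
    # Continuation lines start with a space, as in the base64 output above.
    return '\n '.join(words)
```

With this sketch, `header_encode_sketch(header, encoding='q')` and `header_encode_sketch(header, encoding='b')` are both folded at the same limit, and passing `maxlinelen=None` to either returns a single line. The greedy loop re-encodes the candidate on every character, which is fine for an illustration but not something I'd suggest for the real implementation.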
This is happening here:

http://hg.python.org/cpython/file/3a1db0d2747e/Lib/email/header.py#l368

You see, `_encode_chunks` takes the `maxlinelen` argument but doesn't pass it on to the `header_encode` method of `charset` (which is a `Charset` instance). As such, you can see this issue in action with the following snippet:

    from email.header import Header

    maxlinelen = 9999999

    print 'No charset:'
    print Header(
        u'asdfjk lasjdf sajdfl ajsdfaj sdlkfjas kfladjs flkajsdflk jsadklf jadslkfj adslkfj asdlkjf lksadjfkldas jfkldasj fkadsj fladsjf kladsjfk asdjfkldasasd kfaj kfladsj fkadsjf asdf ',
        maxlinelen=maxlinelen
    ).encode()

    print 'Charset with special characters:'
    print Header(
        u'attachment; filename="ajdsklfj klasdjfkl asdjfkl jadsfja sdflkads fad fads adsf dasjfkl jadslkfj dlasf asd \u6211\u6211\u6211 jo \u6211\u6211 jo \u6211\u6211"',
        charset='utf-8',
        maxlinelen=maxlinelen
    ).encode()

Which will output:

    No charset:
    asdfjk lasjdf sajdfl ajsdfaj sdlkfjas kfladjs flkajsdflk jsadklf jadslkfj adslkfj asdlkjf lksadjfkldas jfkldasj fkadsj fladsjf kladsjfk asdjfkldasasd kfaj kfladsj fkadsjf asdf
    Charset with special characters:
    =?utf-8?b?YXR0YWNobWVudDsgZmlsZW5hbWU9ImFqZHNrbGZqIGtsYXNkamZrbCBhc2RqZmts?=
     =?utf-8?b?IGphZHNmamEgc2RmbGthZHMgZmFkIGZhZHMgYWRzZiBkYXNqZmtsIGphZHNsa2Zq?=
     =?utf-8?b?IGRsYXNmIGFzZCDmiJHmiJHmiJEgam8g5oiR5oiRIGpvIOaIkeaIkSI=?=

This is currently an issue we're experiencing in Django, see our issue in the issue tracker: https://code.djangoproject.com/ticket/20889#comment:4

----------
components: Library (Lib), email
messages: 212011
nosy: barry, r.david.murray, rednaw
priority: normal
severity: normal
status: open
title: Charset.header_encode in email.charset doesn't take a maxlinelen argument and has inconsistent behavior with different encodings
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20747>
_______________________________________
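For anyone who wants to check the `Header` case from the report on their own interpreter, a small sketch along these lines may help. `describe_folding` and `main` are hypothetical helpers, not part of the `email` package, and the shortened header value is just an example; the result depends on the Python version you run it with (the report above is against 2.7).

```python
from __future__ import print_function

from email.header import Header


def describe_folding(encoded):
    """Return (number of lines, length of the longest line) for an encoded header."""
    lines = encoded.split('\n')
    return len(lines), max(len(line) for line in lines)


def main():
    # Example value with non-ASCII characters, so a charset is required.
    value = (
        u'attachment; filename="ajdsklfj klasdjfkl asdjfkl jadsfja sdflkads '
        u'fad fads adsf dasjfkl jadslkfj dlasf asd \u6211\u6211\u6211 jo"'
    )
    # None means "use the default maxlinelen"; the second value should
    # effectively disable folding if maxlinelen is honoured.
    for limit in (None, 9999999):
        encoded = Header(value, charset='utf-8', maxlinelen=limit).encode()
        line_count, longest = describe_folding(encoded)
        print('maxlinelen=%r: %d line(s), longest is %d characters'
              % (limit, line_count, longest))


if __name__ == '__main__':
    main()
```

If `maxlinelen` is honoured for charset-encoded headers, the second run should report a single line; more than one line there corresponds to the behaviour described in this report.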