New submission from Rik:

If you look at the `header_encode` method of the `Charset` class in 
`email.charset`, you'll see that, depending on the `header_encoding` set on the 
`Charset` instance, it encodes the header using either base64 or 
quoted-printable (QP).

However, the QP branch always uses `maxlinelen=None` and the base64 branch 
doesn't. This results in the following behavior:

- If you use base64 encoding and your header is longer than the default 
`maxlinelen`, it is split over multiple lines.
- If you use QP encoding with the same header, it is not split over 
multiple lines.

You can easily test it with this snippet:

    from email.charset import Charset, BASE64, QP

    header = (
        'tejkstj tlkjes takldjf aseio neaoiflk asnfoieas nflkdan foeias '
        'naskln ioeasn kldan flkansoie naslk dnaslk fndaslk fneoisaf '
        'neklasn dfklasnf oiasenf lkadsn lkfanldk fas dfknaioe nas'
    )

    charset = Charset('utf-8')

    charset.header_encoding = BASE64
    print 'BASE64:'
    print charset.header_encode(header)

    charset.header_encoding = QP
    print 'QP:'
    print charset.header_encode(header)

Which will output:


This is inconsistent behavior.

Aside from that, I think the `header_encode` method should accept a 
`maxlinelen` argument that defaults to an appropriate value (probably 76), but 
which you can override at will.
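
To make the proposal concrete, here is a minimal standalone sketch of what a `maxlinelen`-aware encoder could look like. It is not the `email.charset` implementation: `header_encode_sketch`, `q_encode` and `b_encode` are hypothetical names, the RFC 2047 encoded-word splitting is hand-rolled, and the sketch is written for Python 3. The only point is that base64 and QP obey the same `maxlinelen`, and that `maxlinelen=None` disables splitting for both.

```python
import base64

# Bytes that may appear literally in an RFC 2047 "Q" encoded-word.
SAFE = frozenset(bytearray(
    b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!*+-/"
))


def q_encode(raw):
    """Q-encode bytes: safe ASCII stays, space becomes '_', the rest '=XX'."""
    out = []
    for byte in bytearray(raw):
        if byte == 0x20:
            out.append("_")
        elif byte in SAFE:
            out.append(chr(byte))
        else:
            out.append("=%02X" % byte)
    return "".join(out)


def b_encode(raw):
    """Base64-encode bytes and return the result as text."""
    return base64.b64encode(raw).decode("ascii")


def header_encode_sketch(text, encoding="b", charset="utf-8", maxlinelen=76):
    """Hypothetical helper: return a list of encoded-words for `text`.

    Each word is at most `maxlinelen` characters long, whichever encoding
    is used ("b" for base64, "q" for QP). maxlinelen=None means no splitting.
    A single character that cannot fit on its own is emitted oversize.
    """
    encode = b_encode if encoding == "b" else q_encode

    def word(chunk):
        payload = encode(chunk.encode(charset))
        return "=?%s?%s?%s?=" % (charset, encoding, payload)

    if maxlinelen is None:
        return [word(text)]

    words, current = [], ""
    # Grow the chunk one character at a time so multi-byte characters
    # are never split across two encoded-words.
    for char in text:
        if current and len(word(current + char)) > maxlinelen:
            words.append(word(current))
            current = char
        else:
            current += char
    words.append(word(current))
    return words
```

Joining the result with `'\n '` gives a folded header value, for example `'\n '.join(header_encode_sketch(header, encoding='q'))`.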

This is (I think) also necessary because the `Header` class in `email.header` 
has a `maxlinelen` attribute that is used for the same purpose. Normally this 
works fine, but when you specify a charset for your header, it uses the 
`Charset` class and the `maxlinelen` is lost. This is happening here:

As you can see, `_encode_chunks` takes the `maxlinelen` argument but doesn't 
pass it on to the `header_encode` method of `charset` (which is a `Charset` 
instance).

As such, you can see this issue in action with the following snippet:

    from email.header import Header

    maxlinelen = 9999999

    print 'No charset:'
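    print Header(
        u'asdfjk lasjdf sajdfl ajsdfaj sdlkfjas kfladjs flkajsdflk jsadklf '
        u'jadslkfj adslkfj asdlkjf lksadjfkldas jfkldasj fkadsj fladsjf kladsjfk '
        u'asdjfkldasasd kfaj  kfladsj fkadsjf asdf ',
        # maxlinelen is far larger than the header, so no folding is expected
        maxlinelen=maxlinelen
    )
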
    print 'Charset with special characters:'
    print Header(
        u'attachment; filename="ajdsklfj klasdjfkl asdjfkl jadsfja sdflkads fad fads adsf dasjfkl jadslkfj dlasf asd \u6211\u6211\u6211 jo \u6211\u6211 jo 

Which will output:

    No charset:
    asdfjk lasjdf sajdfl ajsdfaj sdlkfjas kfladjs flkajsdflk jsadklf jadslkfj adslkfj asdlkjf lksadjfkldas jfkldasj fkadsj fladsjf kladsjfk asdjfkldasasd kfaj  kfladsj fkadsjf asdf
    Charset with special characters:

This is currently an issue we're experiencing in Django, see our issue in the 
issue tracker:

components: Library (Lib), email
messages: 212011
nosy: barry, r.david.murray, rednaw
priority: normal
severity: normal
status: open
title: Charset.header_encode in email.charset doesn't take a maxlinelen 
argument and has inconsistent behavior with different encodings
type: behavior
versions: Python 2.7
