[jira] Commented: (MIME4J-62) Unnecessary qp encoding of SPACE and TAB characters in CodecUtil

Stefano Bagnara (JIRA) Mon, 21 Jul 2008 07:19:54 -0700

    [ 
https://issues.apache.org/jira/browse/MIME4J-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615253#action_12615253
 ]


Stefano Bagnara commented on MIME4J-62:
---------------------------------------

Also, the special rule for SPACE and TAB is not about "space optimization" but 
rather about keeping almost all of the text easy to read.
e.g.:
This is a message having also 8bit euro € char
is currently converted to
This=20is=20a=20message=20having=20also=208bit=20euro=20=A4=20char
while the "optimized" version would be:
This is a message having also 8bit euro =A4 char
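
The two forms above can be reproduced with a small sketch. This is not the 
CodecUtil code: the class name, the method and the encodeAllWhitespace flag are 
hypothetical, and the use of ISO-8859-15 is only inferred from the =A4 in the 
example. It handles a single line and ignores soft line breaks and the 76 
character limit.

```java
import java.nio.charset.Charset;

// Hypothetical sketch, not part of Mime4j.
class QpWhitespaceSketch {

    /**
     * Quoted-printable encodes the octets of ONE line (no CRLF inside, no soft breaks).
     * With encodeAllWhitespace == false, SPACE and TAB stay literal unless they are
     * the last octet, where they must be written as =20 / =09.
     */
    static String encode(byte[] octets, boolean encodeAllWhitespace) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < octets.length; i++) {
            int b = octets[i] & 0xFF;
            boolean whitespace = (b == 9 || b == 32);
            boolean lastOctet = (i == octets.length - 1);
            // Printable US-ASCII except "=" may always be written literally.
            boolean printable = b >= 33 && b <= 126 && b != '=';
            if (printable || (whitespace && !encodeAllWhitespace && !lastOctet)) {
                out.append((char) b);
            } else {
                out.append('=').append(String.format("%02X", b));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: the euro sign is byte 0xA4, i.e. the text is ISO-8859-15.
        byte[] octets = "This is a message having also 8bit euro \u20AC char"
                .getBytes(Charset.forName("ISO-8859-15"));
        System.out.println(encode(octets, true));   // every SPACE becomes =20
        System.out.println(encode(octets, false));  // only the euro sign is encoded
    }
}
```

With the flag set to false the second line printed is 
"This is a message having also 8bit euro =A4 char", which stays readable even 
in an agent that does not decode quoted-printable.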

The MIME specification is careful to degrade gracefully when the content is 
read by non-MIME readers or by agents that have issues with charsets, decoding 
and similar things.

So, I agree with Niklas and I think this issue is a good wish, but there is no 
need to work on this improvement if other committers think such behaviour 
would be bad.

> Unnecessary qp encoding of SPACE and TAB characters in CodecUtil
> ----------------------------------------------------------------
>
>                 Key: MIME4J-62
>                 URL: https://issues.apache.org/jira/browse/MIME4J-62
>             Project: Mime4j
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Niklas Therning
>            Priority: Minor
>             Fix For: 0.4
>
>
> ATM we always encode SPACE and TAB. The result is that the output of the 
> encoding is longer than necessary. According to the MIME RFC:
> (3)   (White Space) Octets with values of 9 and 32 MAY be
>           represented as US-ASCII TAB (HT) and SPACE characters,
>           respectively, but MUST NOT be so represented at the end
>           of an encoded line.  Any TAB (HT) or SPACE characters
>           on an encoded line MUST thus be followed on that line
>           by a printable character.  In particular, an "=" at the
>           end of an encoded line, indicating a soft line break
>           (see rule #5) may follow one or more TAB (HT) or SPACE
>           characters.  It follows that an octet with decimal
>           value 9 or 32 appearing at the end of an encoded line
>           must be represented according to Rule #1.  This rule is
>           necessary because some MTAs (Message Transport Agents,
>           programs which transport messages from one user to
>           another, or perform a portion of such transfers) are
>           known to pad lines of text with SPACEs, and others are
>           known to remove "white space" characters from the end
>           of a line.  Therefore, when decoding a Quoted-Printable
>           body, any trailing white space on a line must be
>           deleted, as it will necessarily have been added by
>           intermediate transport agents.
> To make the encoded output as short as possible we should try to not encode 
> SPACE and TAB unless they are the last character in a line.
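
The decoding half of the quoted rule can be sketched as follows. This is only 
an illustration of the RFC text, not Mime4j code: the class and method names 
are hypothetical, the input is one encoded line without its CRLF, and 
well-formed =XX sequences are assumed.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch, not part of Mime4j.
class QpTrailingWhitespaceSketch {

    /** Decodes one quoted-printable line (without CRLF) into its octets. */
    static byte[] decodeLine(String line) {
        int end = line.length();
        // Literal SPACE/TAB at the end of a line cannot come from a conforming
        // encoder, so it is treated as padding added in transport and dropped.
        while (end > 0 && (line.charAt(end - 1) == ' ' || line.charAt(end - 1) == '\t')) {
            end--;
        }
        // A trailing "=" is a soft line break and contributes no octet;
        // whitespace in front of it is real content and is kept.
        if (end > 0 && line.charAt(end - 1) == '=') {
            end--;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < end) {
            char c = line.charAt(i);
            if (c == '=' && i + 2 < end) {
                // "=XX": two hex digits give the octet value.
                out.write(Integer.parseInt(line.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                out.write(c);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Trailing SPACE written as =20, followed by transport padding.
        System.out.println(Arrays.toString(decodeLine("a=20  ")));
        // Literal SPACE protected by a soft line break, followed by a padded TAB.
        System.out.println(Arrays.toString(decodeLine("a =\t")));
    }
}
```

Both calls print [97, 32]: the meaningful trailing SPACE survives in either 
representation, while the padding after it is discarded.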

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


