[jira] Commented: (MIME4J-62) Unnecessary qp encoding of SPACE and TAB characters in CodecUtil

Niklas Therning (JIRA) Mon, 21 Jul 2008 08:44:23 -0700

    [ 
https://issues.apache.org/jira/browse/MIME4J-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615278#action_12615278
 ]


Niklas Therning commented on MIME4J-62:
---------------------------------------

It's not trivial to implement rule #3. Seems like commons-codec doesn't either: 
http://commons.apache.org/codec/apidocs/org/apache/commons/codec/net/QuotedPrintableCodec.html

I suggest that we change the expected output of the 3 tests to make the tests 
pass. Then this issue won't be as important. As Robert pointed out, the 
encodeQuotedPrintableBinary() method is meant for encoding of binary data. It 
also encodes line endings which isn't desirable for textual data. I think we 
should add a new method, encodeQuotedPrintable() or maybe 
encodeQuotedPrintableText(), which returns more readable output and is intended 
for textual data. WDYT?

Stefano, how about creating a new issue for the CRLF problem?

> Unnecessary qp encoding of SPACE and TAB characters in CodecUtil
> ----------------------------------------------------------------
>
>                 Key: MIME4J-62
>                 URL: https://issues.apache.org/jira/browse/MIME4J-62
>             Project: Mime4j
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Niklas Therning
>            Priority: Minor
>             Fix For: 0.4
>
>
> ATM we always encode SPACE and TAB. The result is that the output of the 
> encoding is longer than necessary. According to the MIME RFC:
> (3)   (White Space) Octets with values of 9 and 32 MAY be
>           represented as US-ASCII TAB (HT) and SPACE characters,
>           respectively, but MUST NOT be so represented at the end
>           of an encoded line.  Any TAB (HT) or SPACE characters
>           on an encoded line MUST thus be followed on that line
>           by a printable character.  In particular, an "=" at the
>           end of an encoded line, indicating a soft line break
>           (see rule #5) may follow one or more TAB (HT) or SPACE
>           characters.  It follows that an octet with decimal
>           value 9 or 32 appearing at the end of an encoded line
>           must be represented according to Rule #1.  This rule is
>           necessary because some MTAs (Message Transport Agents,
>           programs which transport messages from one user to
>           another, or perform a portion of such transfers) are
>           known to pad lines of text with SPACEs, and others are
>           known to remove "white space" characters from the end
>           of a line.  Therefore, when decoding a Quoted-Printable
>           body, any trailing white space on a line must be
>           deleted, as it will necessarily have been added by
>           intermediate transport agents.
> To make the encoded output as short as possible we should try to not encode 
> SPACE and TAB unless they are the last character in a line.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[jira] Commented: (MIME4J-62) Unnecessary qp encoding of SPACE and TAB characters in CodecUtil

Reply via email to