[ https://issues.apache.org/jira/browse/MIME4J-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615274#action_12615274 ]
Stefano Bagnara commented on MIME4J-62:
---------------------------------------
In my opinion, rules #3, #4 and #5 are not there for space optimization but to
keep the content readable when it cannot be decoded. My opinion is not
important for resolving this issue, though.
I see we have 3 tests in MessageWriteToTest:
> - testBinaryAttachmentLenient
> - testBinaryAttachmentStrictError
> - testBinaryAttachmentStrictIgnore
The expected results written in these tests assume a quoted-printable encoder
that supports at least rules #3 and #5 from RFC 1521.
Either we add these features or we change the expected results
(of course it is simpler to change the expected results).
I tried this locally, and there seems to be another bug: an extra CRLF sequence
is added during round-tripping. It may be a problem in QuotedPrintableInputStream
or in MimeBoundaryInputStream; no clue yet.
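To make the discussion concrete, here is a rough sketch of what supporting
rules #3 and #5 could look like. It is not the CodecUtil code; the class and
method names (QpSketch, encodeLine) are made up for illustration. It assumes
the input is a single run of octets with no line-break handling (rule #4 is
ignored, so CR and LF are simply hex-encoded), and it breaks lines a little
earlier than strictly required in order to stay simple.

```java
class QpSketch {
    // Rule #5: an encoded line may be at most 76 characters long.
    private static final int MAX_LINE = 76;
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Hypothetical helper: encodes octets, treating CR/LF as plain binary data. */
    static String encodeLine(byte[] data) {
        StringBuilder out = new StringBuilder();
        int col = 0; // characters already written on the current encoded line
        for (int i = 0; i < data.length; i++) {
            int b = data[i] & 0xFF;
            boolean last = (i == data.length - 1);
            String token;
            if (b >= 33 && b <= 126 && b != '=') {
                // Rule #2: printable US-ASCII other than "=" is written literally.
                token = String.valueOf((char) b);
            } else if ((b == 9 || b == 32) && !last) {
                // Rule #3: TAB and SPACE stay literal unless they would end the output.
                // If a soft break follows, the "=" is the printable character after them.
                token = String.valueOf((char) b);
            } else {
                // Rule #1: everything else becomes "=" plus two uppercase hex digits.
                token = "=" + HEX[b >> 4] + HEX[b & 0x0F];
            }
            // Rule #5: keep one column free for the "=" of a soft line break.
            if (col + token.length() > MAX_LINE - 1) {
                out.append("=\r\n");
                col = 0;
            }
            out.append(token);
            col += token.length();
        }
        return out.toString();
    }
}
```

With this sketch, "a b " would be encoded as "a b=20": the inner SPACE stays
literal and only the trailing one is escaped.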
> Unnecessary qp encoding of SPACE and TAB characters in CodecUtil
> ----------------------------------------------------------------
>
> Key: MIME4J-62
> URL: https://issues.apache.org/jira/browse/MIME4J-62
> Project: Mime4j
> Issue Type: Bug
> Affects Versions: 0.4
> Reporter: Niklas Therning
> Priority: Minor
> Fix For: 0.4
>
>
> ATM we always encode SPACE and TAB. The result is that the output of the
> encoding is longer than necessary. According to the MIME RFC:
> (3) (White Space) Octets with values of 9 and 32 MAY be
> represented as US-ASCII TAB (HT) and SPACE characters,
> respectively, but MUST NOT be so represented at the end
> of an encoded line. Any TAB (HT) or SPACE characters
> on an encoded line MUST thus be followed on that line
> by a printable character. In particular, an "=" at the
> end of an encoded line, indicating a soft line break
> (see rule #5) may follow one or more TAB (HT) or SPACE
> characters. It follows that an octet with decimal
> value 9 or 32 appearing at the end of an encoded line
> must be represented according to Rule #1. This rule is
> necessary because some MTAs (Message Transport Agents,
> programs which transport messages from one user to
> another, or perform a portion of such transfers) are
> known to pad lines of text with SPACEs, and others are
> known to remove "white space" characters from the end
> of a line. Therefore, when decoding a Quoted-Printable
> body, any trailing white space on a line must be
> deleted, as it will necessarily have been added by
> intermediate transport agents.
> To make the encoded output as short as possible we should try to not encode
> SPACE and TAB unless they are the last character in a line.