[ https://issues.apache.org/jira/browse/MIME4J-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615274#action_12615274 ]
Stefano Bagnara commented on MIME4J-62:
---------------------------------------

MY OPINION is that rules #3, #4 and #5 are not for space optimization but for better representation of the content when decoding is not possible. But my opinion is not important in resolving this issue.

We have 3 tests, which I see are in MessageWriteToTest:
> - testBinaryAttachmentLenient
> - testBinaryAttachmentStrictError
> - testBinaryAttachmentStrictIgnore

The expected result written in these tests expects a quoted-printable encoder supporting at least rules #3 and #5 from RFC 1521. Either we add these features or we change the expected result (of course it is simpler to change the expected result).

I tried this locally and it seems there is another bug: a CRLF sequence is added in the roundtripping. Maybe a problem in the QuotedPrintableInputStream or in the MimeBoundaryInputStream, no clue yet.

> Unnecessary qp encoding of SPACE and TAB characters in CodecUtil
> ----------------------------------------------------------------
>
>                 Key: MIME4J-62
>                 URL: https://issues.apache.org/jira/browse/MIME4J-62
>             Project: Mime4j
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Niklas Therning
>            Priority: Minor
>             Fix For: 0.4
>
>
> ATM we always encode SPACE and TAB. The result is that the output of the
> encoding is longer than necessary. According to the MIME RFC:
>           (3)   (White Space) Octets with values of 9 and 32 MAY be
>                 represented as US-ASCII TAB (HT) and SPACE characters,
>                 respectively, but MUST NOT be so represented at the end
>                 of an encoded line. Any TAB (HT) or SPACE characters
>                 on an encoded line MUST thus be followed on that line
>                 by a printable character. In particular, an "=" at the
>                 end of an encoded line, indicating a soft line break
>                 (see rule #5) may follow one or more TAB (HT) or SPACE
>                 characters. It follows that an octet with decimal
>                 value 9 or 32 appearing at the end of an encoded line
>                 must be represented according to Rule #1.
>                 This rule is necessary because some MTAs (Message
>                 Transport Agents, programs which transport messages
>                 from one user to another, or perform a portion of such
>                 transfers) are known to pad lines of text with SPACEs,
>                 and others are known to remove "white space" characters
>                 from the end of a line. Therefore, when decoding a
>                 Quoted-Printable body, any trailing white space on a
>                 line must be deleted, as it will necessarily have been
>                 added by intermediate transport agents.
> To make the encoded output as short as possible we should try to not encode
> SPACE and TAB unless they are the last character in a line.
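
For illustration only, here is a minimal sketch of how rule #3 (literal SPACE/TAB except at the end of a line) and rule #5 (soft line breaks, encoded lines of at most 76 characters) could be combined when encoding a single line. This is not the CodecUtil code from Mime4j: the class name QpWhitespaceSketch and the method encodeLine are hypothetical, and the sketch assumes the input is one text line with its CRLF already removed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Mime4j code: encodes ONE line (no CRLF inside)
// as quoted-printable, leaving SPACE and TAB literal unless they are the
// last octet of the line.
final class QpWhitespaceSketch {
    // Encoded lines may be at most 76 characters long, including the
    // trailing "=" of a soft line break, so at most 75 before the "=".
    private static final int MAX_BEFORE_SOFT_BREAK = 75;
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String encodeLine(byte[] line) {
        StringBuilder out = new StringBuilder();
        int col = 0; // characters already written on the current encoded line
        for (int i = 0; i < line.length; i++) {
            int b = line[i] & 0xFF;
            boolean lastOctet = (i == line.length - 1);
            String token;
            if (b == ' ' || b == '\t') {
                // Rule #3: literal whitespace is fine unless it ends the line.
                token = lastOctet ? hex(b) : String.valueOf((char) b);
            } else if (b >= 33 && b <= 126 && b != '=') {
                // Printable US-ASCII other than "=" may be written literally.
                token = String.valueOf((char) b);
            } else {
                // Rule #1: everything else becomes "=" plus two hex digits.
                token = hex(b);
            }
            if (col + token.length() > MAX_BEFORE_SOFT_BREAK) {
                // Rule #5: soft line break. The "=" may follow literal
                // whitespace, so no extra handling is needed here.
                out.append("=\r\n");
                col = 0;
            }
            out.append(token);
            col += token.length();
        }
        return out.toString();
    }

    private static String hex(int b) {
        return "=" + HEX[b >> 4] + HEX[b & 0x0F];
    }

    public static void main(String[] args) {
        byte[] sample = "a b\tc ".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encodeLine(sample));
    }
}
```

With this approach only the trailing SPACE in the sample is encoded (as =20); the inner SPACE and TAB stay literal, which is the shorter output the issue asks for.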