[ https://issues.apache.org/jira/browse/MIME4J-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615289#action_12615289 ]
Niklas Therning commented on MIME4J-62:
---------------------------------------

Stefano, the extra CRLF you see comes from TempFileTextBody.writeTo(). For the QP case I think it's incorrect for TempFileTextBody to do that (TempFileBinaryBody does it too), since the added CRLF actually alters the body. If e.g. a PNG is encoded like this it may not render properly when decoded.

If it's ok with you guys I'll change all SPACEs to _ for now and remove the CRLF at the end of the QP encoded part of the MULTIPART_WITH_BINARY_ATTACHMENTS test message. I'll also change TempFileTextBody and TempFileBinaryBody to not add the extra CRLF after outputting a QP encoded body. This will fix the failing tests. Then we can come back to this issue later and write an alternative encoder for textual data.

How about that?

> Unnecessary qp encoding of SPACE and TAB characters in CodecUtil
> ----------------------------------------------------------------
>
>                 Key: MIME4J-62
>                 URL: https://issues.apache.org/jira/browse/MIME4J-62
>             Project: Mime4j
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Niklas Therning
>            Priority: Minor
>             Fix For: 0.4
>
>         Attachments: TextAttachmentEncodingTest.java
>
>
> ATM we always encode SPACE and TAB. The result is that the output of the
> encoding is longer than necessary. According to the MIME RFC:
>     (3)   (White Space) Octets with values of 9 and 32 MAY be
>           represented as US-ASCII TAB (HT) and SPACE characters,
>           respectively, but MUST NOT be so represented at the end
>           of an encoded line.  Any TAB (HT) or SPACE characters
>           on an encoded line MUST thus be followed on that line
>           by a printable character.  In particular, an "=" at the
>           end of an encoded line, indicating a soft line break
>           (see rule #5) may follow one or more TAB (HT) or SPACE
>           characters.  It follows that an octet with decimal
>           value 9 or 32 appearing at the end of an encoded line
>           must be represented according to Rule #1.  This rule is
>           necessary because some MTAs (Message Transport Agents,
>           programs which transport messages from one user to
>           another, or perform a portion of such transfers) are
>           known to pad lines of text with SPACEs, and others are
>           known to remove "white space" characters from the end
>           of a line.  Therefore, when decoding a Quoted-Printable
>           body, any trailing white space on a line must be
>           deleted, as it will necessarily have been added by
>           intermediate transport agents.
> To make the encoded output as short as possible we should try to not encode
> SPACE and TAB unless they are the last character in a line.
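
To make the proposal in the issue description concrete, here is a rough sketch of the whitespace rule quoted above (rule #3 of the quoted-printable section of RFC 2045). It is not Mime4j's CodecUtil code; the class name QpWhitespaceSketch and its encode method are hypothetical and exist only for this illustration. The sketch assumes textual input in which CRLF marks a hard line break, and it deliberately leaves out the soft line breaks needed to keep encoded lines within 76 characters.

```java
// Hypothetical illustration only; not part of Mime4j.
final class QpWhitespaceSketch {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /**
     * Quoted-printable encodes textual input, leaving SPACE and TAB
     * literal unless they are the last octet before a CRLF or the end
     * of the input. Soft line breaks are NOT inserted (assumption made
     * to keep the sketch short).
     */
    static String encode(byte[] in) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length; i++) {
            int b = in[i] & 0xFF;
            if (b == '\r' && i + 1 < in.length && in[i + 1] == '\n') {
                // Hard line break in textual data: pass CRLF through unchanged.
                out.append("\r\n");
                i++;
            } else if (b == ' ' || b == '\t') {
                // Whitespace is at the end of a line if nothing follows it
                // or if a CRLF follows immediately.
                boolean endOfLine = i + 1 == in.length
                        || (in[i + 1] == '\r' && i + 2 < in.length && in[i + 2] == '\n');
                if (endOfLine) {
                    appendEscaped(out, b);
                } else {
                    out.append((char) b);
                }
            } else if (b >= 33 && b <= 126 && b != '=') {
                // Printable US-ASCII other than "=" may stand for itself.
                out.append((char) b);
            } else {
                // Everything else, including "=", becomes "=XX".
                appendEscaped(out, b);
            }
        }
        return out.toString();
    }

    private static void appendEscaped(StringBuilder out, int b) {
        out.append('=').append(HEX[b >> 4]).append(HEX[b & 0x0F]);
    }
}
```

With this sketch, the input "a b" followed by TAB, SPACE and CRLF encodes to "a b", a literal TAB, "=20" and CRLF: only the SPACE directly before the line break is escaped. A real encoder for textual data would also have to insert soft line breaks and decide how to treat bare CR or LF octets, which this sketch simply escapes.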