Stefano Bagnara wrote:
I noticed that at some point in the past the EOLConvertingInputStream was
removed from the chain.
I think this creates issues when we parse an input file that has only \n
line endings and write it back out.
- It seems that most of the parsing code only checks for \n (what
happens when there are only \r instead? What should we do?)
- If the message has only LF newlines, it seems mime4j ends up outputting
headers with CRLF and the body with LF.
- If the input message has CR-terminated lines, mime4j does not treat
them as line endings.
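To check which of these cases a given message falls into, a small diagnostic can count each kind of line terminator. This is only a sketch: EolCounter and its fields are hypothetical names, not part of mime4j.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper (not part of mime4j): counts each kind of line terminator. */
final class EolCounter {
    int crlf;    // CR immediately followed by LF
    int bareCr;  // CR not followed by LF
    int bareLf;  // LF not preceded by CR

    static EolCounter count(InputStream in) throws IOException {
        EolCounter c = new EolCounter();
        boolean prevCr = false; // true if the previous byte was a CR not yet classified
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                if (prevCr) {
                    c.crlf++;
                } else {
                    c.bareLf++;
                }
                prevCr = false;
            } else {
                if (prevCr) {
                    c.bareCr++; // the earlier CR was not followed by LF
                }
                prevCr = (b == '\r');
            }
        }
        if (prevCr) {
            c.bareCr++; // CR as the very last byte of the stream
        }
        return c;
    }
}
```

A message for which bareCr or bareLf is non-zero is one of the malformed inputs discussed here.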
IMHO either we accept LF, CR, and CRLF and treat them all as CRLF, or we
accept only CRLF. If we accept all three, we have to take care of encoded
nested messages: they could again contain LF, CR, and CRLF, just like the
top-level stream.
What is the right approach? Should we add an EOLConvertingInputStream
(CONVERT_BOTH) to every level of parsing, or should we fail to parse
messages with bad newlines?
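For reference, the "accept LF, CR, and CRLF as CRLF" option amounts to a filter along the following lines. This is a minimal sketch, not the actual mime4j EOLConvertingInputStream: the class name CrlfNormalizingInputStream is hypothetical, and it reads byte by byte, so it makes no attempt to address performance.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

/** Hypothetical sketch: rewrites a bare CR or bare LF as CRLF, leaves CRLF unchanged. */
final class CrlfNormalizingInputStream extends FilterInputStream {
    private boolean pendingLf; // a CR was just returned and its LF is still owed

    CrlfNormalizingInputStream(InputStream in) {
        super(new PushbackInputStream(in, 1)); // one byte of lookahead after a CR
    }

    @Override
    public int read() throws IOException {
        if (pendingLf) {
            pendingLf = false;
            return '\n';
        }
        PushbackInputStream pin = (PushbackInputStream) in;
        int b = pin.read();
        if (b == '\r') {
            int next = pin.read();
            if (next != '\n' && next != -1) {
                pin.unread(next); // bare CR: keep the following byte for later
            }
            pendingLf = true;
            return '\r';
        }
        if (b == '\n') { // bare LF: emit CR now, LF on the next call
            pendingLf = true;
            return '\r';
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int b = read();
            if (b == -1) {
                break;
            }
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would bypass the pendingLf state
    }
}
```

Applying this at every level of parsing would mean wrapping each decoded nested stream as well as the root stream.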
I don't like the current behaviour, where we accept some malformed data
(a lone LF is treated as CRLF by our parser), change some of it (the
newlines between headers are converted to CRLF), and still output
malformed data.
Opinions?
I tried this patch and it seems to work fine (although it breaks one of
our core tests, which does not expect a CR in a header to be treated as a
newline):
Index: src/main/java/org/apache/james/mime4j/MimeEntity.java
===================================================================
--- src/main/java/org/apache/james/mime4j/MimeEntity.java (revision 677582)
+++ src/main/java/org/apache/james/mime4j/MimeEntity.java (working copy)
@@ -197,7 +197,7 @@
InputStream instream;
if (MimeUtil.isBase64Encoding(transferEncoding)) {
log.debug("base64 encoded message/rfc822 detected");
- instream = new Base64InputStream(dataStream);
+ instream = new EOLConvertingInputStream(new Base64InputStream(dataStream));
} else if (MimeUtil.isQuotedPrintableEncoded(transferEncoding)) {
log.debug("quoted-printable encoded message/rfc822 detected");
instream = new QuotedPrintableInputStream(dataStream);
Index: src/main/java/org/apache/james/mime4j/MimeTokenStream.java
===================================================================
--- src/main/java/org/apache/james/mime4j/MimeTokenStream.java (revision 676846)
+++ src/main/java/org/apache/james/mime4j/MimeTokenStream.java (working copy)
@@ -143,7 +143,7 @@
private void doParse(InputStream stream, String contentType) {
entities.clear();
- rootInputStream = new RootInputStream(stream);
+ rootInputStream = new RootInputStream(new EOLConvertingInputStream(stream));
inbuffer = new BufferedLineReaderInputStream(rootInputStream,
4 * 1024);
switch (recursionMode) {
case M_RAW:
IIRC the EOLConvertingInputStream was removed because of performance issues.
Stefano