2011/4/19 Oleg Kalnichevski <[email protected]>:
> On Tue, 2011-04-19 at 16:32 +0200, Oleg Kalnichevski wrote:
>> On Tue, 2011-04-19 at 09:57 +0000, [email protected] wrote:
>> > Author: bago
>> > Date: Tue Apr 19 09:57:36 2011
>> > New Revision: 1094982
>> >
>> > URL: http://svn.apache.org/viewvc?rev=1094982&view=rev
>> > Log:
>> > Added some more testmsgs generated by me
>> >
>>
>> Stefano
>>
>> I am seeing a number of failures in the test cases. Are these test cases
>> expected to fail or is this an oversight?
>>
>> As far as I can tell presently the mime stream parser chokes on
>> boundaries with white space characters in them. What should be the
>> expected behavior?
>>
>> ---
>> Tests in error:
>>
>> badbound.msg(org.apache.james.mime4j.parser.MimeStreamParserExampleMessagesTest):
>> Boundary may not contain CR or LF
>>
>> multi-clen.msg(org.apache.james.mime4j.parser.MimeStreamParserExampleMessagesTest):
>> Boundary may not contain CR or LF
>>
>> multi-badnames.msg(org.apache.james.mime4j.parser.MimeStreamParserExampleMessagesTest):
>> Boundary may not contain CR or LF
>>
>> multi-simple.msg(org.apache.james.mime4j.parser.MimeStreamParserExampleMessagesTest):
>> Boundary may not contain CR or LF
>>
>> multi-digest.msg(org.apache.james.mime4j.parser.MimeStreamParserExampleMessagesTest):
>> Boundary may not contain CR or LF
>> ---
>>
>> Cheers
>>
>> Oleg
>>
>
> Stefano,
>
> Actually looks like mime4j is perfectly capable of parsing these
> messages despite the boundary containing illegal characters.
>
> Could you please review those messages and adjust expectations in the
> test cases if that makes sense?

From my reading of the RFC, a boundary can contain a space. When the
boundary contains a space, the header folding algorithm is allowed to
fold the header right where the space is present.

So you should make sure RawFieldParser correctly unfolds the header
before or while parsing it. This way the messages will get parsed
correctly, right?

Here is a MIME message from http://www.faqs.org/rfcs/rfc1341.html
-----
From: Nathaniel Borenstein <[email protected]>
To: Ned Freed <[email protected]>
Subject: Sample message
MIME-Version: 1.0
Content-type: multipart/mixed; boundary="simple
 boundary"

This is the preamble. It is to be ignored, though it
is a handy place for mail composers to include an
explanatory note to non-MIME compliant readers.
--simple boundary

This is implicitly typed plain ASCII text.
It does NOT end with a linebreak.
--simple boundary
Content-type: text/plain; charset=us-ascii

This is explicitly typed plain ASCII text.
It DOES end with a linebreak.

--simple boundary--
This is the epilogue. It is also to be ignored.
------

As you can see, the boundary contains a space and is folded.
In my test, mime4j failed to parse this correctly. But this is
what I remember from a test done a few days ago. If you think this is
not the case, I will check more carefully and open a more detailed JIRA
issue.

Stefano
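
To make the unfolding step concrete, here is a minimal sketch of what I mean. It is not the RawFieldParser code: the class HeaderUnfoldSketch and its methods unfold and boundaryOf are hypothetical names made up for this mail, and the regular expressions are a simplification (no quoted-pair or comment handling). The idea is only that a line break followed by a space or tab is removed before the boundary parameter is read, so no CR or LF can end up inside the boundary.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; this is NOT mime4j's RawFieldParser.
final class HeaderUnfoldSketch {

    // A line break immediately followed by a space or tab marks a folded header line.
    private static final Pattern FOLD = Pattern.compile("\\r?\\n(?=[ \\t])");

    // Matches boundary="..." (quoted) or boundary=token (unquoted), ignoring case.
    // Simplification: escaped quotes inside the quoted string are not handled.
    private static final Pattern BOUNDARY = Pattern.compile(
            "boundary\\s*=\\s*(?:\"([^\"]*)\"|([^\\s;]+))",
            Pattern.CASE_INSENSITIVE);

    private HeaderUnfoldSketch() {
    }

    /** Unfolds a raw header field: drops each line break that precedes whitespace. */
    static String unfold(String rawField) {
        return FOLD.matcher(rawField).replaceAll("");
    }

    /** Returns the boundary parameter of a Content-Type field, or null if absent. */
    static String boundaryOf(String rawField) {
        // Unfold first, so a fold inside the quoted boundary cannot leak CR or LF.
        Matcher m = BOUNDARY.matcher(unfold(rawField));
        if (!m.find()) {
            return null;
        }
        return m.group(1) != null ? m.group(1) : m.group(2);
    }
}
```

With the Content-type field of the message above (folded after "simple", continuation line starting with one space), boundaryOf returns "simple boundary", which is the string the body delimiters use.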
