On Fri, 2008-07-18 at 10:58 +0200, Stefano Bagnara wrote:
> Oleg Kalnichevski wrote:
> > On Thu, 2008-07-17 at 20:21 +0200, Stefano Bagnara wrote:
> >> Oleg Kalnichevski wrote:
> >>> Stefano Bagnara wrote:
> >

...

> E.g: I'm slowly coming to a possible proposal about parsing.
> - strict mode: no conversion is done, a CR or LF in headers (or other
>   non 7bit content) makes mime4j fail parsing.
> - permissive modes:
>   - default binary: no conversion happens, isolated CR and LF are
>     accepted everywhere but not considered newlines (just like other
>     8bit bytes), the default content-transfer-encoding is "binary"
>     when not specified (7bit, 8bit and binary are read as binary).
>   - default text: we convert isolated CR and LF to CRLF almost
>     everywhere except in "binary" content-transfer-encoding parts.
> I'm not proposing this yet (not sure this is enough and that we don't
> need more granular tweaking), but this is something I'm evaluating
> right now... The strict mode is desirable to have, but less important
> than the permissive parsing (we want to be strict in output, not in
> input). OTOH someone may want to use mime4j for validating whether a
> content is well-formed or not (wrt the RFC), and in this case a strict
> mode would be necessary.
>
> Stefano
>

Stefano,

With all due respect, I see strict handling of line delimiters as
_pointless_ orthodoxy that really does not help anyone. Would you really
ship an application to a client of yours that rejects a message as
invalid because it contains a lone LF in it? So what is the _point_ of
being strict about line delimiters?

Anyways, let's talk code now. How about this?

(1)

    interface LineDelimiterStrategy {

        // either argument can be -1, hence int rather than char
        boolean isNewLine(int ch1, int ch2)
            throws MimeException;

    }

One can provide MimeTokenStream with an implementation of this interface
at construction time. MimeTokenStream in its turn passes a reference to
that implementation to all parser components that need to deal with line
delimiters.

(2) The issue of CR / LF handling in content bodies should be taken care
of when formatting output, _not_ when parsing input.

Would that work for you?

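To make (1) a bit more concrete, here is a rough sketch of what a permissive and a strict strategy might look like. None of this is existing mime4j code. It assumes that ch1 is the previous byte and ch2 the current one, with -1 meaning "no byte" (start or end of stream), and that the method answers "does ch2 terminate a line?". The class names, the countLines helper and the local MimeException are hypothetical stand-ins so the example compiles on its own.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for mime4j's MimeException, so the sketch is self-contained.
class MimeException extends Exception {
    MimeException(String message) { super(message); }
}

// The proposed interface; int parameters so that -1 can signal "no byte".
interface LineDelimiterStrategy {
    // Assumption: ch1 is the previous byte, ch2 the current byte.
    boolean isNewLine(int ch1, int ch2) throws MimeException;
}

// Permissive: any LF ends a line, whether or not a CR precedes it.
// A lone CR is treated as ordinary content.
class LenientLineDelimiterStrategy implements LineDelimiterStrategy {
    public boolean isNewLine(int ch1, int ch2) {
        return ch2 == '\n';
    }
}

// Strict: only CRLF ends a line; a lone CR or a lone LF is rejected.
class StrictLineDelimiterStrategy implements LineDelimiterStrategy {
    public boolean isNewLine(int ch1, int ch2) throws MimeException {
        if (ch1 == '\r' && ch2 != '\n') {
            throw new MimeException("Lone CR");
        }
        if (ch2 == '\n' && ch1 != '\r') {
            throw new MimeException("Lone LF");
        }
        return ch2 == '\n';
    }
}

class LineDelimiterDemo {
    // Counts line terminators in data as seen by the given strategy.
    // The loop runs one step past the end so the strategy also sees
    // the final byte paired with -1 (end of stream).
    static int countLines(byte[] data, LineDelimiterStrategy strategy)
            throws MimeException {
        int count = 0;
        int prev = -1;
        for (int i = 0; i <= data.length; i++) {
            int cur = (i < data.length) ? (data[i] & 0xff) : -1;
            if (strategy.isNewLine(prev, cur)) {
                count++;
            }
            prev = cur;
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] mixed = "a\r\nb\nc".getBytes(StandardCharsets.US_ASCII);
        try {
            // One CRLF and one lone LF: the lenient strategy accepts both.
            System.out.println(countLines(mixed, new LenientLineDelimiterStrategy()));
            // The strict strategy throws at the lone LF.
            System.out.println(countLines(mixed, new StrictLineDelimiterStrategy()));
        } catch (MimeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The parser components would only ever call isNewLine, so whether a lone LF is tolerated becomes a decision of whoever constructs the stream rather than something hard-wired into the parser.
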
Oleg