[ 
https://issues.apache.org/jira/browse/MIME4J-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022770#comment-13022770
 ] 

Oleg Kalnichevski commented on MIME4J-116:
------------------------------------------

It appears that this issue can only be resolved by moving the Field and 
FieldParser interfaces from DOM to Core. If we want to keep a very strict 
separation of responsibilities between Core and DOM (Core deals with RawFields 
only, whereas DOM is responsible for parsing raw fields into complex structured 
fields), _some_ duplication of field parsing seems unavoidable: the Core parser 
needs the content-type, content-transfer-encoding, charset and boundary bits in 
order to be able to decode MIME entities. This duplication can also lead to 
inconsistencies in the handling of malformed fields (such as the one recently 
reported by Stefano): the default message builder may succeed in building an 
object model for a particular message, but the default message formatter may 
fail when serialising the very same model, because some fields get re-parsed 
using a stricter routine.

If we did move the Field and FieldParser interfaces to Core, however, not only 
could we avoid duplicate parsing of some headers, but we could also potentially 
simplify the API by getting rid of the RawField class and possibly the 
Maximal/DefaultBodyDescriptors. All fields would get a parser assigned to them 
as soon as they are read from the MIME stream and would only need to be parsed 
once, when accessed (if at all). Body descriptors could also be built lazily 
from properties of individual fields. There would no longer be a reason for 
having both reduced (default) body descriptors and maximal ones.
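
To make the idea concrete, here is a rough sketch of what "parse once, on 
first access" could look like. None of these names exist in mime4j: LazyField, 
ParsedContentType, ContentTypeParser and LazyBodyDescriptor are hypothetical, 
and the toy content-type parser simply splits on ';', so it ignores comments 
and quoted semicolons. It is only meant to illustrate the laziness, not to 
propose an actual API.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical parsed form of a Content-Type field body. */
final class ParsedContentType {
    final String mimeType;
    final Map<String, String> parameters;

    ParsedContentType(String mimeType, Map<String, String> parameters) {
        this.mimeType = mimeType;
        this.parameters = parameters;
    }
}

/** Hypothetical field: keeps the raw body and parses it at most once. */
final class LazyField<T> {
    private final String name;
    private final String rawBody;
    private final Function<String, T> parser; // assigned when the field is read
    private T parsed;
    private boolean parsedAlready;
    private int parseCount; // only here to demonstrate single parsing

    LazyField(String name, String rawBody, Function<String, T> parser) {
        this.name = name;
        this.rawBody = rawBody;
        this.parser = parser;
    }

    String getName() { return name; }
    String getRawBody() { return rawBody; }
    int getParseCount() { return parseCount; }

    /** Runs the parser on first access and caches the result afterwards. */
    T getParsed() {
        if (!parsedAlready) {
            parsed = parser.apply(rawBody);
            parsedAlready = true;
            parseCount++;
        }
        return parsed;
    }
}

/** Toy parser (assumption): no support for comments or quoted ';'. */
final class ContentTypeParser {
    static ParsedContentType parse(String body) {
        String[] parts = body.split(";");
        String mimeType = parts[0].trim().toLowerCase(Locale.ROOT);
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                continue; // skip a malformed parameter instead of failing
            }
            String key = parts[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = parts[i].substring(eq + 1).trim();
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1); // strip quotes
            }
            params.put(key, value);
        }
        return new ParsedContentType(mimeType, params);
    }
}

/** Hypothetical body descriptor derived lazily from the field itself. */
final class LazyBodyDescriptor {
    private final LazyField<ParsedContentType> contentType;

    LazyBodyDescriptor(LazyField<ParsedContentType> contentType) {
        this.contentType = contentType;
    }

    String getMimeType() { return contentType.getParsed().mimeType; }
    String getBoundary() { return contentType.getParsed().parameters.get("boundary"); }
    String getCharset() { return contentType.getParsed().parameters.get("charset"); }
}
```

In this sketch both the Core parser (which needs the boundary and charset) and 
the DOM would read the same cached result, so a field is never re-parsed by a 
second, possibly stricter, routine.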

If I hear no objections, I'll go ahead and experiment with the idea of moving 
field parsing interfaces to Core.

Oleg

> Avoid duplicate parsing of header fields
> ----------------------------------------
>
>                 Key: MIME4J-116
>                 URL: https://issues.apache.org/jira/browse/MIME4J-116
>             Project: JAMES Mime4j
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: Markus Wiederkehr
>             Fix For: 0.7
>
>
> Currently some header fields are parsed twice when building a DOM: once by 
> DefaultBodyDescriptor or MaximalBodyDescriptor and a second time by 
> MessageBuilder using Field.parse().
> Also, different parsers are used in the two stages. The body descriptors use 
> handcrafted parsers whereas Field.parse() uses JavaCC-generated parsers. The 
> handcrafted version does not seem to handle comments in a header correctly.
> The situation should be improved by parsing a header field only once and 
> passing that already parsed field to a content handler. Also, only one sort of 
> field parser should be used: either handcrafted or generated. My personal 
> opinion is that it might be easier for a handcrafted parser to be more 
> tolerant of malformed header fields.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
