Re: [jira] Updated: (MIME4J-118) MIME stream parser handles non-ASCII fields incorrectly

Oleg Kalnichevski Tue, 17 Feb 2009 11:37:31 -0800

Markus Wiederkehr wrote:

Let me reply to the mailing list to keep the discussion on the detail
level away from the JIRA..


On Tue, Feb 17, 2009 at 12:29 PM, Oleg Kalnichevski (JIRA)
<[email protected]> wrote:

    [ 
https://issues.apache.org/jira/browse/MIME4J-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Kalnichevski updated MIME4J-118:
-------------------------------------

   Attachment: mime4j-118.patch

Here is the first cut at fixing the problem with charset coding in MIME fields. 
The significant changes are

(1) EntityStateMachine#getField() returns ByteArrayBuffer instead of a String
(2) ContentHandler#field takes ByteArrayBuffer instead of a String


Your patch does not look bad, but:

I have a feeling that it will not help with MIME4J-116, maybe on the
contrary. You say you do not want AbstractEntity to create Field

instances yet you give no rationalization for your opinion..


I believe I did, but I can certainly re-iterate.

Tight coupling is believed to be bad, I _personally_ tend to agree. Bycoupling EntityStateMachine with the Field class we will end up couplingit, albeit indirectly, with pretty much the entireorg.apache.james.mime4j.field package

EntityStateMachine -> Field -> DefaultFieldParser -> several concreteimplementations of FieldParser


This is a very _bad_ idea, in my opinion.

I could live with it, if Field were an interface.

I don't

see how the ByteArrayBuffer can help with the unnecessary duplicate
field parsing.


It was not meant to. To me, these two issues are related, but not the same.

I have no problem with small changes one at a time. But it does not
hurt to keep an eye on related issues in the process.

Also I don't like that ByteArrayBuffer is a mutable class (and yes, I
am aware that byte[] is mutable too). Please consider that Field is
designed to be immutable. This is an important aspect because a Field
may be shared between multiple messages. So if you refactor Field to
store a ByteArrayBuffer please make sure that the ByteArrayBuffer does
not get exposed publicly.

Maybe we should introduce an immutable object that holds a byte array;
similar to what String is for a character array. That would still not
help with #116 though..

Yes, this is a good idea, but in this case the class would have to makea copy of the byte array _every_ time someone wanted to get access tothe underlying raw content, for instance, in order to parse it using adifferent parser or write it out to an OutputStream. This is the onlyway I can see to ensure immutability of such a class. Interestinglyenough, this can get quite expansive if occurs frequently.


We solved a similar issue in HttpCore in the following way.

We have Header interface that represents an immutable HTTP header and wehave FormattedHeader interface extending Header that provides access tothe underlying CharArrayBuffer. One would have to cast Header toFormattedHeader in order to be able to mutate its content.


Oleg

Markus

If these changes are okay with everyone I'll proceed with refactoring and 
change Field and friends to utilize ByteArrayBuffer for storing raw field 
content.

Oleg

MIME stream parser handles non-ASCII fields incorrectly
-------------------------------------------------------

                Key: MIME4J-118
                URL: https://issues.apache.org/jira/browse/MIME4J-118
            Project: JAMES Mime4j
         Issue Type: Bug
           Reporter: Oleg Kalnichevski
           Assignee: Oleg Kalnichevski
            Fix For: 0.6

        Attachments: mime4j-118.patch


Presently MIME stream parser handles non-ASCII fields incorrectly. Binary field 
content gets converted to its textual representation too early in the parsing 
process using simple byte to char cast. The decision about appropriate char 
encoding should be left up to individual ContentHandler implementations.
Oleg

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: [jira] Updated: (MIME4J-118) MIME stream parser handles non-ASCII fields incorrectly

Reply via email to