On Mon, 2014-09-29 at 15:59 +0200, Wolfgang Fahl wrote:
> Hi Eric, Ioan, Oleg and others,
> 

...

> Now I was hoping to be able to test this fix. I assume I have to add
> some test message to:
> core:
>    src/test/resources/testmsgs
> 
> But to really check the new behaviour they'd have to be three different
> tests:
> 1. check invalid mimeCharset in lenient mode - will work with default
> Charset
> 2. check invalid mimeCharset in non-lenient mode - will throw exception
> 3. check invalid mimeCharset in non-lenient mode with overridden
> resolveCharset - will work with chosen mapped Charset.
> 

A plain vanilla JUnit test will do.
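
For illustration only, the three cases listed above could be exercised roughly as follows. This is a self-contained sketch in plain Java, not mime4j code: the Resolver and MappedResolver classes, the charsetFor method, and the US-ASCII fallback are assumptions invented for the example. Only the hook name resolveCharset is taken from the message above. Each of the three checks in main would map onto one JUnit test method.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Map;

class CharsetResolutionSketch {

    /** Hypothetical stand-in for the component under test. */
    static class Resolver {
        private final boolean lenient;

        Resolver(boolean lenient) {
            this.lenient = lenient;
        }

        /** Hook that subclasses may override to map unknown charset names. */
        protected Charset resolveCharset(String mimeCharset) {
            return Charset.forName(mimeCharset);
        }

        Charset charsetFor(String mimeCharset) {
            try {
                return resolveCharset(mimeCharset);
            } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
                if (lenient) {
                    return StandardCharsets.US_ASCII; // assumed default charset
                }
                throw e; // non-lenient mode: surface the problem to the caller
            }
        }
    }

    /** Hypothetical subclass that maps selected names to a chosen charset. */
    static class MappedResolver extends Resolver {
        private final Map<String, Charset> mapping;

        MappedResolver(boolean lenient, Map<String, Charset> mapping) {
            super(lenient);
            this.mapping = mapping;
        }

        @Override
        protected Charset resolveCharset(String mimeCharset) {
            Charset mapped = mapping.get(mimeCharset);
            return mapped != null ? mapped : super.resolveCharset(mimeCharset);
        }
    }

    public static void main(String[] args) {
        String bogus = "x-bogus-charset"; // legal name, but not a supported charset

        // 1. lenient mode falls back to the default charset
        System.out.println(new Resolver(true).charsetFor(bogus));

        // 2. non-lenient mode throws
        try {
            new Resolver(false).charsetFor(bogus);
        } catch (UnsupportedCharsetException e) {
            System.out.println("rejected: " + e.getCharsetName());
        }

        // 3. non-lenient mode with an overridden resolveCharset uses the mapping
        Resolver mapped = new MappedResolver(false, Map.of(bogus, StandardCharsets.ISO_8859_1));
        System.out.println(mapped.charsetFor(bogus));
    }
}
```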

> Please let me know how I can add these tests and how to get a proper
> patchset going. I don't work much with Subversion these days;
> I prefer to use git.
> 

You are welcome to open a PR on GitHub and reference it from JIRA:

https://github.com/apache/james-mime4j

Oleg

> Cheers
> 
> Wolfgang
> 
> On 10.08.14 at 10:33, Stan Ioan Eugen wrote:
> > Hello Wolfgang,
> >
> > Sorry for my late reply. I've created a Jira ticket to track this
> > issue. As Eric suggested, it's the right way to get code into the
> > project.
> > I've looked over the code and it looks good in general. I would keep
> > both variants of the regular expression to match From lines, with
> > good Javadoc, so users can use either of them in their code. I would
> > also move the 'mbox != null' check inside the constructor - this way
> > we make sure we don't create an object in an inconsistent state.
> >
> > I will be more than happy to push the patch upstream once we have some
> > tests for the new behavior. Are you interested in providing the tests?
> >
> > Please use the issue for patch submission and relevant comments.
> > https://issues.apache.org/jira/browse/MIME4J-242
> >
> > Thanks,
> >
> >
> > 2014-08-03 10:52 GMT+03:00 Eric Charles <[email protected]>:
> >> Could you open an issue in JIRA at https://issues.apache.org/jira/browse/MIME4J
> >> and upload your patch there? Thx.
> >>
> >> On 07/23/2014 09:57 AM, Wolfgang Fahl wrote:
> >>> Hi Ioan Eugen,
> >>>
> >>> please find attached a patch.
> >>>
> >>> it uses the following fromline pattern:
> >>> static final String DEFAULT = "^From \\S+.*\\d{4}$";
> >>> so that it matches more lines.
> >>> 1. From [email protected] Fri Sep 09 14:04:52 2011
> >>> 2. From MAILER-DAEMON Wed Oct 05 21:54:09 2011
> >>> 3. From - Wed Apr 02 06:51:08 2014
> >>>
> >>> so looking for an "@" sign is not enforced any more.
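
As a quick check of the claim above, the quoted pattern can be tried against the three sample lines with nothing more than java.util.regex. This is a minimal sketch: the class name FromLinePatternDemo and the helper isFromLine are made up for the example, and user@example.org is a placeholder address, not the one from the original message.

```java
import java.util.List;
import java.util.regex.Pattern;

class FromLinePatternDemo {
    // Pattern as quoted in the message above.
    static final String DEFAULT = "^From \\S+.*\\d{4}$";
    private static final Pattern FROM_LINE = Pattern.compile(DEFAULT);

    /** Returns true if the whole line matches the From line pattern. */
    static boolean isFromLine(String line) {
        return FROM_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        List<String> samples = List.of(
            "From user@example.org Fri Sep 09 14:04:52 2011", // placeholder address
            "From MAILER-DAEMON Wed Oct 05 21:54:09 2011",
            "From - Wed Apr 02 06:51:08 2014",
            "From: Alice <alice@example.org>"); // a header line, not a From line
        for (String line : samples) {
            System.out.println(isFromLine(line) + "  " + line);
        }
    }
}
```

The last sample is an ordinary From: header. It is rejected because the pattern requires a space directly after the word From.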
> >>>
> >>> The patch fixes a typo:
> >>> -    private Matcher fromLineMathcer;
> >>> +    private Matcher fromLineMatcher;
> >>>
> >>> in many places of the source code.
> >>>
> >>> It adds a reference to the original mbox File so that the error message:
> >>> +                 if (mbox!=null)
> >>> +                       path=mbox.getPath();
> >>> +            throw new IllegalArgumentException("File "+path+" does not
> >>> contain From_ lines that match the pattern
> >>> '"+MESSAGE_START.pattern()+"'! Maybe not be a valid Mbox.");
> >>>
> >>> can be improved.
> >>>
> >>> Who is going to check this patch and what needs to be done to get it
> >>> into the official repo?
> >>> I would also like to add more test cases and especially include some
> >>> dummy mboxes. And as mentioned I'd like to check the iterator against
> >>> all my Thunderbird mboxes to check
> >>> whether it will successfully parse them all. Also I am offering to write
> >>> a few "tutorial lines". Where would I have to put these?
> >>>
> >>> Cheers
> >>>   Wolfgang
> >>>
> >>> On 22.07.14 at 22:23, Ioan Eugen Stan wrote:
> >>>> Hello Wolfgang,
> >>>>
> >>>> I developed MboxIterator. It's nice to see that it's helpful :)
> >>>>
> >>>> You get that error because MboxIterator does not know how to split the
> >>>> messages. Messages in an mbox file are separated by lines that start
> >>>> with 'From '. They are called (by me at least) 'From lines' :) .
> >>>> One problem with the mbox format is that it's a bit 'free-form' in the
> >>>> sense that developers abused it and we have some variants [1].
> >>>>
> >>>> One thing that you could try is to supply a different From line
> >>>> regular expression to MboxIterator via regexpPattern argument. It will
> >>>> split messages based on this new value.
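
To make the effect of the From line expression concrete, here is a small stand-alone sketch of splitting mbox text on a configurable pattern. It is not MboxIterator's implementation and does not use the mime4j API: the class MboxSplitSketch and its split method are hypothetical, and the stricter pattern in main (which insists on an "@") is only an assumed example of an expression that rejects Thunderbird-style From lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MboxSplitSketch {

    /** Splits mbox text into messages, each starting at a matching From line. */
    static List<String> split(String mbox, String fromLineRegex) {
        // MULTILINE lets ^ and $ match at every line boundary, not just at the ends.
        Matcher m = Pattern.compile(fromLineRegex, Pattern.MULTILINE).matcher(mbox);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            starts.add(m.start());
        }
        if (starts.isEmpty()) {
            throw new IllegalArgumentException(
                "No From_ lines match the pattern '" + fromLineRegex + "'");
        }
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : mbox.length();
            messages.add(mbox.substring(starts.get(i), end));
        }
        return messages;
    }

    public static void main(String[] args) {
        String mbox =
            "From - Wed Apr 02 06:51:08 2014\nSubject: first\n\nbody one\n"
          + "From MAILER-DAEMON Wed Oct 05 21:54:09 2011\nSubject: second\n\nbody two\n";

        // Relaxed pattern: both From lines are recognised.
        System.out.println(split(mbox, "^From \\S+.*\\d{4}$").size() + " messages");

        // Assumed stricter pattern requiring an '@': nothing matches, so it fails.
        try {
            split(mbox, "^From \\S+@\\S.*\\d{4}$");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```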
> >>>>
> >>>> [1] http://wiki2.dovecot.org/MailboxFormat/mbox
> >>>>
> >>>> Good luck and please post your results.
> >>>>
> >>>> Regards,
> >>>>
> >>>> On Fri, Jul 18, 2014 at 12:53 PM, Wolfgang Fahl <[email protected]> wrote:
> >>>>> Dear mime4j developers,
> >>>>>
> >>>>> for one of my projects I have been using mime4j successfully to import
> >>>>> e-mail into our CRM database for some two years now.
> >>>>> Currently I am trying to add a feature which would allow reading Mozilla
> >>>>> Thunderbird Mailbox content.
> >>>>> As of mime4j 0.8 there seems to be an MboxIterator which could do that.
> >>>>> Since I didn't find any publicly available source repository which I
> >>>>> could use to access the 0.8-SNAPSHOT I have copied
> >>>>> the three source files:
> >>>>> * CharBufferWrapper.java
> >>>>> * FromLinePatterns.java
> >>>>> * MboxIterator.java
> >>>>>
> >>>>> into my source tree and I am using these together with the following
> >>>>> maven dependency:
> >>>>>
> >>>>> <!-- EMail handling -->
> >>>>>         <dependency>
> >>>>>             <groupId>org.apache.james</groupId>
> >>>>>             <artifactId>apache-mime4j-core</artifactId>
> >>>>>             <version>0.7.2</version>
> >>>>>         </dependency>
> >>>>>         <dependency>
> >>>>>             <groupId>org.apache.james</groupId>
> >>>>>             <artifactId>apache-mime4j-dom</artifactId>
> >>>>>             <version>0.7.2</version>
> >>>>>         </dependency>
> >>>>>
> >>>>> The iterator works reasonably well on some of the Thunderbird mailbox
> >>>>> files and loops through the mails in them correctly.
> >>>>> The mails can then not be parsed directly with mime4j - there is one
> >>>>> newline at the beginning which spoils the show. After
> >>>>> working around this it's working as expected in some cases. In other
> >>>>> cases there is an error:
> >>>>>
> >>>>> java.lang.IllegalArgumentException: File does not contain From_ lines!
> >>>>> Maybe not be a vaild Mbox.
> >>>>>     at
> >>>>> org.apache.james.mime4j.mboxiterator.MboxIterator.initMboxIterator(MboxIterator.java:85)
> >>>>>     at
> >>>>> org.apache.james.mime4j.mboxiterator.MboxIterator.<init>(MboxIterator.java:75)
> >>>>>     at
> >>>>> org.apache.james.mime4j.mboxiterator.MboxIterator.<init>(MboxIterator.java:62)
> >>>>>     at
> >>>>> org.apache.james.mime4j.mboxiterator.MboxIterator$Builder.build(MboxIterator.java:241)
> >>>>>     at
> >>>>> com.bitplan.clientutils.ThunderbirdMailArchiveImpl.getMailById(ThunderbirdMailArchiveImpl.java:386)
> >>>>>     at
> >>>>> com.bitplan.clientutils.ThunderbirdMailArchiveImpl.getMailById(ThunderbirdMailArchiveImpl.java:261)
> >>>>>     at
> >>>>> com.bitplan.clientutils.rest.TestMailAccess.testMailById(TestMailAccess.java:77)
> >>>>>
> >>>>> By the way - there is a typo in the above error message "vaild" should
> >>>>> be "valid".
> >>>>>
> >>>>> The error is something I'd like to fix or work-around.
> >>>>>
> >>>>> I have two big user accounts with several hundred mailbox files and some
> >>>>> 300,000 mails from the last 15 years which I'd like
> >>>>> to use as a testcase against which to run the mime4j implementation.
> >>>>>
> >>>>> Would you please supply me with some pointers where I get the necessary
> >>>>> source code and how I could supply patches and
> >>>>> testcases for the project?
> >>>>>
> >>>>> Also it would be good to know whether others would be interested in the
> >>>>> Thunderbird Mailbox reading capability.
> >>>>>
> >>>>>
> >>>>> Cheers
> >>>>>   Wolfgang
> >>>>>
> >>>>> --
> >>>>>
> >>>>> BITPlan - smart solutions
> >>>>> Wolfgang Fahl
> >>>>> Pater-Delp-Str. 1, D-47877 Willich Schiefbahn
> >>>>> Tel. +49 2154 811-480, Fax +49 2154 811-481
> >>>>> Web: http://www.bitplan.de
> >>>>> BITPlan GmbH, Willich - HRB 6820 Krefeld, Steuer-Nr.: 10258040548, 
> >>>>> Geschäftsführer: Wolfgang Fahl
> >>>>>
> >>>>
> >
> >
> 

