On Mon, 2014-09-29 at 15:59 +0200, Wolfgang Fahl wrote:
> Hi Eric, Ioan, Oleg and others,
>
> ...
> Now I was hoping to be able to test this fix. I assume I have to add
> some test message to:
> core:
> src/test/resources/testmsgs
>
> But to really check the new behaviour they'd have to be three different
> tests:
> 1. check invalid mimeCharset in lenient mode - will work with default
>    Charset
> 2. check invalid mimeCharset in non-lenient mode - will throw exception
> 3. check invalid mimeCharset in non-lenient mode with overridden
>    resolveCharset - will work with chosen mapped Charset.
> A plain vanilla JUnit will do.
> Please let me know how I can add these tests and how to get a proper
> patchset going. I don't work much with subversion these days -
> I prefer to use git.
>

You are welcome to open a PR at GitHub and reference it from JIRA

https://github.com/apache/james-mime4j

Oleg

> Cheers
>
> Wolfgang
>
> On 10.08.14 at 10:33, Stan Ioan Eugen wrote:
> > Hello Wolfgang,
> >
> > Sorry for my late reply. I've created a Jira ticket to track this
> > issue. As Eric suggested, it's the right way to get code into the
> > project.
> > I've looked over the code and it looks good in general. I would keep
> > both variants of the regular expression to match FROM lines, with a
> > good javadoc, so users can use any of them in their code. I would
> > also move the 'mbox != null' check inside the constructor - this way
> > we make sure we don't create an object in an inconsistent state.
> >
> > I will be more than happy to push the patch upstream once we have some
> > tests for the new behavior. Are you interested in providing the tests?
> >
> > Please use the issue for patch submission and relevant comments.
> > https://issues.apache.org/jira/browse/MIME4J-242
> >
> > Thanks,
> >
> >
> > 2014-08-03 10:52 GMT+03:00 Eric Charles <[email protected]>:
> >> Could you open a JIRA on https://issues.apache.org/jira/browse/MIME4J
> >> and upload your patch there? Thx.
> >>
> >> On 07/23/2014 09:57 AM, Wolfgang Fahl wrote:
> >>> Hi Ioan Eugen,
> >>>
> >>> please find attached a patch.
> >>>
> >>> It uses the following From line pattern:
> >>> static final String DEFAULT = "^From \\S+.*\\d{4}$";
> >>> so that it matches more lines.
> >>> 1. From [email protected] Fri Sep 09 14:04:52 2011
> >>> 2. From MAILER-DAEMON Wed Oct 05 21:54:09 2011
> >>> 3. From - Wed Apr 02 06:51:08 2014
> >>>
> >>> so looking for an "@" sign is not enforced any more.
> >>>
> >>> The patch fixes a typo:
> >>> - private Matcher fromLineMathcer;
> >>> + private Matcher fromLineMatcher;
> >>>
> >>> in many places of the source code.
> >>>
> >>> It adds a reference to the original mbox File so that the error message:
> >>> + if (mbox!=null)
> >>> +   path=mbox.getPath();
> >>> + throw new IllegalArgumentException("File "+path+" does not contain From_ lines that match the pattern '"+MESSAGE_START.pattern()+"'! Maybe not be a valid Mbox.");
> >>>
> >>> can be improved.
> >>>
> >>> Who is going to check this patch and what needs to be done to get it
> >>> into the official repo?
> >>> I would also like to add more test cases and especially include some
> >>> dummy mboxes. And as mentioned I'd like to check the iterator against
> >>> all my Thunderbird mboxes to check
> >>> whether it will successfully parse them all. Also I am offering to write
> >>> a few "tutorial lines". Where would I have to put these?
> >>>
> >>> Cheers
> >>> Wolfgang
> >>>
> >>> On 22.07.14 22:23, Ioan Eugen Stan wrote:
> >>>> Hello Wolfgang,
> >>>>
> >>>> I developed MboxIterator. It's nice to see that it's helpful :)
> >>>>
> >>>> You get that error because MboxIterator does not know how to split the
> >>>> messages. Messages in an mbox file are separated via lines that start
> >>>> with 'From '. They are called (by me at least) 'From lines' :) .
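
To illustrate how From line matching behaves, here is a minimal sketch that applies the pattern string quoted in the patch mail above to the three sample separator lines, using only java.util.regex. The class name FromLineDemo, the helper isFromLine and the example.com addresses are made up for this illustration. They are not part of mime4j, and this is not meant to show how MboxIterator is implemented.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for this thread; not part of mime4j.
class FromLineDemo {
    // Pattern string as quoted in the patch mail.
    static final Pattern FROM_LINE = Pattern.compile("^From \\S+.*\\d{4}$");

    // True if the complete line matches the From line pattern.
    static boolean isFromLine(String line) {
        return FROM_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] lines = {
            "From someone@example.com Fri Sep 09 14:04:52 2011", // placeholder address
            "From MAILER-DAEMON Wed Oct 05 21:54:09 2011",
            "From - Wed Apr 02 06:51:08 2014",
            "From: Alice <alice@example.com>",                   // header, not a separator
            "From here on we target a release in 2014"           // ordinary body text
        };
        for (String line : lines) {
            System.out.println(isFromLine(line) + "  " + line);
        }
    }
}
```

The "From:" header is rejected because the pattern requires a space directly after "From". The last line is accepted even though it is ordinary body text, which is the price of a relaxed pattern: a body line that starts with "From " and ends in four digits is treated as a separator unless the writer of the mbox escaped it.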
> >>>> One problem with the mbox format is that it's a bit 'free-form' in the
> >>>> sense that developers abused it and we have some variants [1].
> >>>>
> >>>> One thing that you could try is to supply a different From line
> >>>> regular expression to MboxIterator via the regexpPattern argument. It will
> >>>> split messages based on this new value.
> >>>>
> >>>> [1] http://wiki2.dovecot.org/MailboxFormat/mbox
> >>>>
> >>>> Good luck and please post your results.
> >>>>
> >>>> Regards,
> >>>>
> >>>> On Fri, Jul 18, 2014 at 12:53 PM, Wolfgang Fahl <[email protected]> wrote:
> >>>>> Dear mime4j developers,
> >>>>>
> >>>>> for one of my projects I have been using mime4j successfully to import
> >>>>> e-mail into our CRM database for some two years now.
> >>>>> Currently I am trying to add a feature which would allow reading Mozilla
> >>>>> Thunderbird Mailbox content.
> >>>>> As of mime4j 0.8 there seems to be a MboxIterator which could do that.
> >>>>> Since I didn't find any publicly available source repository which I
> >>>>> could use to access the 0.8-SNAPSHOT I have copied
> >>>>> the three source files:
> >>>>> * CharBufferWrapper.java
> >>>>> * FromLinePatterns.java
> >>>>> * MboxIterator.java
> >>>>>
> >>>>> into my source tree and I am using these together with the following
> >>>>> maven dependency:
> >>>>>
> >>>>> <!-- EMail handling -->
> >>>>> <dependency>
> >>>>>   <groupId>org.apache.james</groupId>
> >>>>>   <artifactId>apache-mime4j-core</artifactId>
> >>>>>   <version>0.7.2</version>
> >>>>> </dependency>
> >>>>> <dependency>
> >>>>>   <groupId>org.apache.james</groupId>
> >>>>>   <artifactId>apache-mime4j-dom</artifactId>
> >>>>>   <version>0.7.2</version>
> >>>>> </dependency>
> >>>>>
> >>>>> The iterator works somewhat o.k. on some of the Thunderbird mailbox
> >>>>> files and loops through the mails in it correctly.
> >>>>> The mails can then not be directly parsed with mime4j - there is one
> >>>>> newline at the beginning which spoils the show.
> >>>>> After working around this it's working as expected in some cases. In other
> >>>>> cases there is an error:
> >>>>>
> >>>>> java.lang.IllegalArgumentException: File does not contain From_ lines!
> >>>>> Maybe not be a vaild Mbox.
> >>>>>     at org.apache.james.mime4j.mboxiterator.MboxIterator.initMboxIterator(MboxIterator.java:85)
> >>>>>     at org.apache.james.mime4j.mboxiterator.MboxIterator.<init>(MboxIterator.java:75)
> >>>>>     at org.apache.james.mime4j.mboxiterator.MboxIterator.<init>(MboxIterator.java:62)
> >>>>>     at org.apache.james.mime4j.mboxiterator.MboxIterator$Builder.build(MboxIterator.java:241)
> >>>>>     at com.bitplan.clientutils.ThunderbirdMailArchiveImpl.getMailById(ThunderbirdMailArchiveImpl.java:386)
> >>>>>     at com.bitplan.clientutils.ThunderbirdMailArchiveImpl.getMailById(ThunderbirdMailArchiveImpl.java:261)
> >>>>>     at com.bitplan.clientutils.rest.TestMailAccess.testMailById(TestMailAccess.java:77)
> >>>>>
> >>>>> By the way - there is a typo in the above error message: "vaild" should
> >>>>> be "valid".
> >>>>>
> >>>>> The error is something I'd like to fix or work around.
> >>>>>
> >>>>> I have two big user accounts with several hundred mailbox files and some
> >>>>> 300,000 mails from the last 15 years which I'd like
> >>>>> to use as a testcase against which to run the mime4j implementation.
> >>>>>
> >>>>> Would you please supply me with some pointers where I get the necessary
> >>>>> source code and how I could supply patches and
> >>>>> testcases for the project?
> >>>>>
> >>>>> Also it would be good to know whether others would be interested in the
> >>>>> Thunderbird Mailbox reading capability.
> >>>>>
> >>>>>
> >>>>> Cheers
> >>>>> Wolfgang
> >>>>>
> >>>>> --
> >>>>>
> >>>>> BITPlan - smart solutions
> >>>>> Wolfgang Fahl
> >>>>> Pater-Delp-Str. 1, D-47877 Willich Schiefbahn
> >>>>> Tel.
+49 2154 811-480, Fax +49 2154 811-481
> >>>>> Web: http://www.bitplan.de
> >>>>> BITPlan GmbH, Willich - HRB 6820 Krefeld, Tax No.: 10258040548,
> >>>>> Managing Director: Wolfgang Fahl
> >>>>>
> >>>>
> >
> >
>
