Hello Wolfgang, I developed MailboxIterator. It's nice to see that it's helpful :)
You get that error because MboxIterator does not know how to split the
messages. Messages in an mbox file are separated by lines that start with
'From ' (the word From followed by a space). They are called (by me at
least) 'From lines' :)

One problem with the mbox format is that it's a bit 'free-form', in the
sense that developers abused it and we now have several variants [1].

One thing you could try is to supply a different From line regular
expression to MboxIterator via the regexpPattern argument. It will split
messages based on this new value.

[1] http://wiki2.dovecot.org/MailboxFormat/mbox

Good luck and please post your results.

Regards,

On Fri, Jul 18, 2014 at 12:53 PM, Wolfgang Fahl <w...@bitplan.com> wrote:

> Dear mime4j developers,
>
> for one of my projects I have been using mime4j successfully to import
> e-mail into our CRM database for some two years now.
> Currently I am trying to add a feature which would allow reading Mozilla
> Thunderbird Mailbox content.
> As of mime4j 0.8 there seems to be a MboxIterator which could do that.
> Since I didn't find any publicly available source repository which I
> could use to access the 0.8-Snapshot I have copied
> the three source files:
> * CharBufferWrapper.java
> * FromLinePatterns.java
> * MboxIterator.java
>
> into my source tree and I am using these together with the following
> maven dependency:
>
> <!-- EMail handling -->
> <dependency>
>   <groupId>org.apache.james</groupId>
>   <artifactId>apache-mime4j-core</artifactId>
>   <version>0.7.2</version>
> </dependency>
> <dependency>
>   <groupId>org.apache.james</groupId>
>   <artifactId>apache-mime4j-dom</artifactId>
>   <version>0.7.2</version>
> </dependency>
>
> The iterator works somewhat o.k. on some of the Thunderbird mailbox
> files and loops thru the mails in it correctly.
> The mails can then not be directly parsed with mime4j - there is one
> newline at the beginning which spoils the show. After
> working around this it's working as expected in some cases.
> In other cases there is an error:
>
> java.lang.IllegalArgumentException: File does not contain From_ lines! Maybe not be a vaild Mbox.
>     at org.apache.james.mime4j.mboxiterator.MboxIterator.initMboxIterator(MboxIterator.java:85)
>     at org.apache.james.mime4j.mboxiterator.MboxIterator.<init>(MboxIterator.java:75)
>     at org.apache.james.mime4j.mboxiterator.MboxIterator.<init>(MboxIterator.java:62)
>     at org.apache.james.mime4j.mboxiterator.MboxIterator$Builder.build(MboxIterator.java:241)
>     at com.bitplan.clientutils.ThunderbirdMailArchiveImpl.getMailById(ThunderbirdMailArchiveImpl.java:386)
>     at com.bitplan.clientutils.ThunderbirdMailArchiveImpl.getMailById(ThunderbirdMailArchiveImpl.java:261)
>     at com.bitplan.clientutils.rest.TestMailAccess.testMailById(TestMailAccess.java:77)
>
> By the way - there is a typo in the above error message "vaild" should
> be "valid".
>
> The error is something I'd like to fix or work-around.
>
> I have two big user accounts with several hundred mailbox files and some
> 300,000 mails from the last 15 years which I'd like
> to use as a testcase against which to run the mime4j implementation.
>
> Would you please supply me with some pointers where I get the necessary
> source code and how I could supply patches and
> testcases for the project?
>
> Also it would be good to know whether others would be interested in the
> Thunderbird Mailbox reading capability.
>
>
> Cheers
>   Wolfgang
>
> --
>
> BITPlan - smart solutions
> Wolfgang Fahl
> Pater-Delp-Str. 1, D-47877 Willich Schiefbahn
> Tel. +49 2154 811-480, Fax +49 2154 811-481
> Web: http://www.bitplan.de
> BITPlan GmbH, Willich - HRB 6820 Krefeld, Steuer-Nr.: 10258040548,
> Geschäftsführer: Wolfgang Fahl
>

--
Ioan Eugen Stan
0720 898 747
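
P.S. To illustrate the point about From line regular expressions, here is a small standalone sketch. It is not the MboxIterator code and does not use mime4j at all; it only shows, with plain java.util.regex, how the choice of From line pattern decides whether a file can be split. The class name, both patterns and the sample mailbox text (including the "From - <date>" separator style) are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of mime4j.
class FromLineSplitDemo {

    // Assumed "strict" pattern: expects an e-mail address after "From ".
    static final String STRICT = "^From \\S+@\\S+ .*$";

    // Assumed "lenient" pattern: any line that starts with "From ".
    static final String LENIENT = "^From .*$";

    // Splits the mbox text at every line matching the given From line regex.
    static List<String> split(String mbox, String fromLineRegex) {
        Pattern pattern = Pattern.compile(fromLineRegex, Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(mbox);

        // Collect the start offset of every From line.
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        if (starts.isEmpty()) {
            // Same kind of failure as in the report: nothing to split on.
            throw new IllegalArgumentException(
                    "No From line matched pattern: " + fromLineRegex);
        }

        // Each message runs from its From line to the next one (or to EOF).
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : mbox.length();
            messages.add(mbox.substring(starts.get(i), end));
        }
        return messages;
    }

    public static void main(String[] args) {
        // Made-up sample using a "From - <date>" separator (an assumption).
        String mbox =
                "From - Fri Jul 18 12:53:00 2014\n"
              + "Subject: first\n\nbody one\n\n"
              + "From - Fri Jul 18 12:54:00 2014\n"
              + "Subject: second\n\nbody two\n";

        System.out.println(split(mbox, LENIENT).size() + " messages found");
        try {
            split(mbox, STRICT);
        } catch (IllegalArgumentException e) {
            System.out.println("Strict pattern failed: " + e.getMessage());
        }
    }
}
```

With the sample above, the strict pattern finds no separator at all, while the lenient one splits the text into two messages. A pattern that is too lenient will also match body lines that happen to start with "From ", so it is worth testing any candidate pattern against a few of the failing mailbox files first.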