On Sun, Mar 18, 2012 at 10:55:19AM +0530, Aamir Khan wrote:
> On Sun, Mar 18, 2012 at 4:24 AM, Barry Warsaw <ba...@list.org> wrote:
>
> > On Mar 18, 2012, at 12:23 AM, Aamir Khan wrote:
> >
> > >On Fri, Feb 17, 2012 at 12:55 AM, Barry Warsaw <ba...@list.org> wrote:
> > >> On IRC, we talked about a storm + Python mailbox library based backend,
> > >> with a REST+JSON wsgi based application vending the data.  This would
> > >> allow us to integrate fairly easily with MM3 I think, and would possibly
> > >> better enable some of the archiver work being done by Terri and others.
> > >>
> > >
> > >I understand that we will store the messages in .mbox format. But I don't
> > >understand why we need to use storm for the archiving purpose.
> >
> > I meant to say "maildir".  Please let's not use mbox format!  It's way too
> > easy to corrupt the file, as we did with a bug once in MM2.1, and we've
> > paid the price ever since.
> >
>
> I read up on the difference between the maildir and mbox formats, and it
> clearly states that mbox is prone to corruption while maildir is not.
> Maildir has the further advantage that there is no file locking problem.
> But since we will be storing each mail in a separate file, searching
> through them will not be fast enough. Using a database alone also has
> problems: it will use more disk space and consume more CPU cycles.
>
> So, if we store the messages in maildir format with a copy of them in a
> database, we can serve search requests using database queries, which
> will be powered by a full-text search engine. But then there will be the
> problem of synchronization between the maildir messages and the messages
> stored in the database. What are your thoughts about it?
>
> As for searching the archive, there are solutions like Elastic Search,
> Solr, and Lucene. Can we use one of them to search directly through the
> maildir?

Note that a few of us have been playing with a searching-archiver.
An initial prototype used notmuch.  We looked into using raw Xapian at PyCon.
And now, one of our developers (pingou on IRC) has pushed out a prototype
that uses MongoDB for the backend.

You can take a look at our development copy here:

    http://mm3test.fedoraproject.org/2/list/devel@fp.o

I'll be working on splitting out a tested copy from an in-development copy
later today.  That way we won't be creating web pages with tracebacks all the
time :-)

Code for this is available in the hyperkitty mongodb branch:

    bzr branch bzr://bzr.fedorahosted.org/bzr/hyperkitty/mongodb

-Toshio
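
On the maildir-plus-database question quoted above, here is a rough sketch
of one way the two could be kept in sync.  It is not what any of the
prototypes do.  It assumes the maildir is the source of truth and the
database is a disposable index that can be rebuilt from it.  The table
layout and the function names (open_index, archive_message, resync, search)
are made up for illustration, SQLite stands in for whatever database is
chosen, and a plain LIKE query stands in for a real full-text engine.

```python
import mailbox
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the hypothetical search index."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS idx ("
        "maildir_key TEXT PRIMARY KEY, message_id TEXT, subject TEXT, body TEXT)"
    )
    return db


def body_text(msg):
    """Join the text/plain parts of a message (no charset decoding here)."""
    return "\n".join(
        part.get_payload()
        for part in msg.walk()
        if part.get_content_type() == "text/plain"
    )


def index_key(maildir, db, key):
    """Copy one maildir entry into the index, keyed by its maildir key."""
    msg = maildir[key]
    db.execute(
        "INSERT OR REPLACE INTO idx VALUES (?, ?, ?, ?)",
        (key, str(msg["Message-ID"] or ""), str(msg["Subject"] or ""),
         body_text(msg)),
    )


def archive_message(maildir, db, msg):
    """Write to the maildir first, then index what was written."""
    key = maildir.add(msg)
    index_key(maildir, db, key)
    db.commit()
    return key


def resync(maildir, db):
    """Index any maildir entries the database missed; return the count."""
    known = {row[0] for row in db.execute("SELECT maildir_key FROM idx")}
    missing = [key for key in maildir.keys() if key not in known]
    for key in missing:
        index_key(maildir, db, key)
    db.commit()
    return len(missing)


def search(db, term):
    """Substring search; a real archiver would use a full-text engine."""
    rows = db.execute(
        "SELECT maildir_key FROM idx WHERE body LIKE ? ORDER BY maildir_key",
        ("%" + term + "%",),
    )
    return [row[0] for row in rows]
```

Because every message lands in the maildir before it is indexed, a crash
between the two steps only leaves the index behind, and running resync()
catches it up.  Entries deleted from the maildir are not handled in this
sketch.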
_______________________________________________
Mailman-Developers mailing list
Mailman-Developers@python.org
http://mail.python.org/mailman/listinfo/mailman-developers
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-developers%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/archive%40jab.org
Security Policy: http://wiki.list.org/x/QIA9