On 7/12/06, Jukka Zitting <[EMAIL PROTECTED]> wrote:
> Hi,

hi

> On 7/12/06, robert burrell donkin <[EMAIL PROTECTED]> wrote:
> > it strikes me that it's more natural to store emails plus meta-data in a content
> > repository such as jackrabbit rather than in a classic RDBMS.
> > [...]
> > 1 is jackrabbit really a good match?
>
> I think so. Although email storage is not inherently hierarchical
> (unless you opt for a folder structure), using a content repository
> does give you a set of nice features like built-in full text indexing
> and a flexible data model that would be hard to achieve with a
> relational database.

the folder metaphor is widely used today. most clients use it extensively. many mailing list archives are also organised in hierarchical fashion. so, integrating with existing email clients means supporting a hierarchy.

one POV is that everything is meta-data. presenting meta-data query results as folders would be powerful. but many people like to be able to do ad hoc organization and so i suspect that it may turn out to be cleaner to support primary folders outside the meta-data. so, a store which naturally supports hierarchical organisation is a plus.

(until i looked into it) i didn't realise that jackrabbit supported webDAV. this is interesting since i think that using a webDAV email vocabulary (analogous to calDAV) as an alternative to IMAP has some real advantages for the kinds of application i'm interested in. there's some interest over at the IETF on this subject.

would jackrabbit be able to provide a reasonable prototyping environment (in the sense of being able to try concepts without a lot of ceremony for email over webDAV)...? (i know that i'm going to need to do a lot more reading around webDAV support in jackrabbit)

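to make the "primary folders plus meta-data query results presented as folders" idea above a bit more concrete, here is a minimal sketch in plain java. it does not use jackrabbit or the JCR API at all; the `Mail` class, the `primary` and `tagged` helpers and the sample paths and tags are all hypothetical names invented for illustration. the only point is that both kinds of folder can be treated as a query over the same store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class VirtualFolders {
    /** Hypothetical message: an id, one primary folder path, and free-form tags. */
    public static final class Mail {
        final String id;
        final String primaryFolder;
        final Set<String> tags;

        public Mail(String id, String primaryFolder, Set<String> tags) {
            this.id = id;
            this.primaryFolder = primaryFolder;
            this.tags = tags;
        }
    }

    /** Folder backed by the primary hierarchy: matches the path itself and anything below it. */
    public static Predicate<Mail> primary(String path) {
        return m -> m.primaryFolder.equals(path) || m.primaryFolder.startsWith(path + "/");
    }

    /** Folder backed by a meta-data query: matches every message carrying the tag. */
    public static Predicate<Mail> tagged(String tag) {
        return m -> m.tags.contains(tag);
    }

    /** Lists the ids of the messages a folder shows, in store order. */
    public static List<String> list(Predicate<Mail> folder, List<Mail> store) {
        List<String> ids = new ArrayList<>();
        for (Mail m : store) {
            if (folder.test(m)) {
                ids.add(m.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sample data is made up for the example.
        List<Mail> store = List.of(
            new Mail("m1", "lists/jackrabbit", Set.of("jcr")),
            new Mail("m2", "lists/james", Set.of("openpgp", "jcr")),
            new Mail("m3", "personal", Set.<String>of()));

        System.out.println(list(primary("lists"), store));  // hierarchical folder
        System.out.println(list(tagged("openpgp"), store)); // query-backed folder
    }
}
```

a message lives in exactly one primary folder but can show up in any number of query-backed folders, which is roughly the split suggested above. in a real content repository the predicates would presumably become repository queries rather than in-memory filters.
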
> > 2 is there anyone already working in this space?
>
> I had similar thoughts two years ago with RDBMs, but I never took the
> idea further than creating the data model and some initial email
> importers. I looked at Apache James as a possible tool to work with,
> but the barrier of entry was too high for my limited amount of time at
> that time.

i've promised noel that i'll help out with openpgp on james so i'm learning james anyway...

> Later on I went with GMail and most of the itch was gone.

i have 5G of email archives stored in IMAP but i'm fed up of having to port the hundreds of filters i use to organise my emails when i change clients. this limits my ability to use different machines for development and my mobility.

for a long while, i thought i needed better filtering. but i'm really after something more general than delivering mail into particular folders. it's really about being able to run rules to associate meta-data with emails. some of these tags would be a little bit like gmail labels. others would be the usual standard headers and so on that come with the emails.

> See http://yukatan.sourceforge.net/sql/yukatan.html for the relational
> data model I created.

i'm having problems accessing some of the content but i see the general idea :-)
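
as a rough illustration of what "running rules to associate meta-data with emails" might look like (as opposed to filters that only deliver into a folder), here is a small plain-java sketch. everything in it is an assumption made up for the example: the `Rule` class, the `headerContains` helper, the header values and the tag names. it is not taken from james, jackrabbit or the yukatan data model.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

public class TagRules {
    /** Hypothetical rule: when the headers match, attach a tag to the message. */
    public static final class Rule {
        final Predicate<Map<String, String>> matches;
        final String tag;

        public Rule(Predicate<Map<String, String>> matches, String tag) {
            this.matches = matches;
            this.tag = tag;
        }
    }

    /** Builds a rule that fires when the named header contains the text (case-insensitive). */
    public static Rule headerContains(String header, String needle, String tag) {
        String wanted = needle.toLowerCase(Locale.ROOT);
        return new Rule(
            headers -> headers.getOrDefault(header, "").toLowerCase(Locale.ROOT).contains(wanted),
            tag);
    }

    /** Runs every rule; a message collects all matching tags instead of landing in one folder. */
    public static Set<String> tagsFor(Map<String, String> headers, List<Rule> rules) {
        Set<String> tags = new LinkedHashSet<>();
        for (Rule rule : rules) {
            if (rule.matches.test(headers)) {
                tags.add(rule.tag);
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        // Rules and headers below are invented sample data.
        List<Rule> rules = List.of(
            headerContains("List-Id", "jackrabbit", "jackrabbit"),
            headerContains("Subject", "openpgp", "crypto"));

        Map<String, String> headers = Map.of(
            "List-Id", "<dev.jackrabbit.example.org>",
            "Subject", "OpenPGP support for mail storage");

        System.out.println(tagsFor(headers, rules));
    }
}
```

since the rules live with the store rather than with the mail client, they would not need porting when switching clients, which is the problem described above.
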

> > 3 any interest in developing these ideas further?
>
> Yes. I'm not totally happy with GMail, and every now and then I've
> been playing with the idea of porting the relational model to JCR node
> types

great :-)

i'd be very glad to help out if you want to develop the port on list but i'd like to try to triangulate data types with james developers (probably when i'm a little further on) and the ietf working group. that sound ok to you?

> and the import tools to use the JCR API.

which import tools?

- robert