On 8/11/07, Zsombor <[EMAIL PROTECTED]> wrote:
>
> On 8/10/07, Robert Burrell Donkin <[EMAIL PROTECTED]> wrote:
> > On 8/10/07, Zsombor <[EMAIL PROTECTED]> wrote:
<snip>

> > (i'm interested in JDO and would be much more likely to contribute to
> > an OpenJPA implementation than to a plain hibernate one. the OpenJPA
> > team is also very friendly so i'm sure that they'd be happy to help
> > out with architecture.)
> >
> > > I have some questions about the current IMAP code in the trunk. I know
> > > it's highly experimental code, and never released, but I'm curious to
> > > ask whether you think that the current API with the
> > > MailboxManagerProvider/MailboxManager/ImapMailboxSession will see
> > > revolutionary, or evolutionary changes ? I mean, do you plan to rewrite
> > > it from scratch, or will only minor method additions/deletions occur? I
> > > know, you can't promise anything, but I don't want to waste my time, if
> > > someone totally rewrote the backend interface in the last few days, and
> > > intends to commit it in the near future :)
> >
> > (i had it in mind to move the mailbox code out from core into its own
> > module. we've also talked about renaming some of the interfaces. i'll
> > start a thread on this today.)
> >
> > i've given up trying to make IMAP perform with the torque mailbox
> > implementation. this is partly down to an inefficient table structure
> > but mostly down to inefficiencies baked into the API (common IMAP
> > operations take numerous database calls to execute and bulky message
> > data is too often fetched).
>
> What do you think, which operations are the most common ones?
the IMAP protocol has a lot of redundancy built in so this depends on
the client :-/

i can produce good statistics about Evolution but i know that other
clients are quite different

here's a typical use case

User opens email client which performs:
* LOGIN to IMAP server
* SELECT on a folder
* FETCH basic meta-data for all messages
* FETCH structural data for all new messages

User reads a seen part of a message:
* FETCH basic meta-data (client has already cached mail content)

User reads an unseen part of a MIME multipart message:
* FETCH unseen part of message (and cache)

User reads an unseen single part message:
* FETCH message

User reads an unseen MIME multipart message:
* FETCH meta-data and one part

User moves a message into the folder from INBOX:
* APPEND message

User clicks on another mailbox:
* SELECT on a folder
* FETCH basic meta-data for all messages
* FETCH structural data for all new messages

in reality, most clients do a lot more than this but conceptually, this
is reasonably accurate

IMHO there are three groups of calls which are critical to user
perception of performance. the first is mailbox selection, the second
is message meta-data, the third is message content. IMAP is an unusual
protocol and creates challenges in all three areas

> Currently I'm
> trying to figure out what is the main difference between the UID, MSN and
> KEY value, which is unique to the mailbox, and which to the whole
> repository.

IMHO the mailbox design suffers from being a compromise between a good
IMAP API and a good general API. the interface and implementation are
over-complicated but there are many features which are likely to be
IMAP specific in what was intended to be a general API.

MSN and UID are governed by the IMAP specification. KEY is general.

one of the challenges is that IMAP specifies two unique indexes: UID
and MSN. both UID and MSN are unique only within a mailbox.
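
to make the UID/MSN distinction concrete, here is a minimal Java sketch.
it relies on two rules from the IMAP specification: UIDs within a
mailbox are assigned in strictly ascending order, and the MSN of a
message is its 1-based position in the mailbox. the class name
SelectedMailboxView and all of its methods are hypothetical, invented
for this illustration; they are not part of the mailbox API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the James mailbox API.
// Holds the UIDs of one mailbox in ascending order; the MSN of a
// message is simply its 1-based position in that list.
class SelectedMailboxView {
    private final List<Long> uids = new ArrayList<>();

    // New messages receive a higher UID than any existing message,
    // so appending keeps the list sorted.
    void added(long uid) {
        if (!uids.isEmpty() && uid <= uids.get(uids.size() - 1)) {
            throw new IllegalArgumentException("UIDs must be strictly ascending");
        }
        uids.add(uid);
    }

    // Removes the message and returns the MSN it had before removal,
    // or -1 if the UID is not in this mailbox. Every message after it
    // implicitly moves down by one MSN; its UID does not change.
    int expunged(long uid) {
        int index = Collections.binarySearch(uids, Long.valueOf(uid));
        if (index < 0) {
            return -1;
        }
        uids.remove(index); // index is an int, so this removes by position
        return index + 1;
    }

    // Returns the current MSN for a UID, or -1 if the UID is unknown.
    int msnOf(long uid) {
        int index = Collections.binarySearch(uids, Long.valueOf(uid));
        return index < 0 ? -1 : index + 1;
    }

    // Returns the UID currently at the given MSN (1-based).
    long uidOf(int msn) {
        return uids.get(msn - 1);
    }
}
```

with UIDs 10, 11 and 15 in a mailbox, UID 15 has MSN 3. once UID 11 is
expunged, UID 15 becomes MSN 2 while its UID stays the same, which is
why the two indexes cannot be treated as interchangeable keys.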
it is tempting to use UID as a primary key but this would limit the
size of mailboxes to less than the specification allows. it may be
possible to use a computed PK by using byte arithmetic for mailbox and
UID.

> My main concern of the current torque backend is that it
> currently tries to check the modifications in-VM, and delivers the
> notification synchronously, instead of a polling db-thread, which i
> think should be better.

(using a JCR allows registration for events rather than polling but
that's just a detail)

the synchronous notification stuff ATM is a side channel for data
changed by the current client. in other words, it is used to inform
the client when it has altered data. the session-based design does not
allow the client to be informed about changes made by other clients.

i would much prefer an event driven approach to asynchronous
notifications

> (So multiple IMAP frontends can be deployed for one backend
> database).

this is not trivial for IMAP. for example, maintaining message numbers
is challenging. one possible approach would be to avoid storing message
numbers in the database and maintain them in each IMAP frontend.

> > i'm interested in JCP backends (rather than DB) so i have no plans to
> > fix these problems. my plan for the experimental code is to introduce a
> > new API and provide an adapter for the mailbox API (rather than
> > rewrite it). i hope that this would allow mailbox implementations
> > which wish to optimise themselves for IMAP to do so, while
> > a slow but working IMAP would be possible even with a plain
> > implementation.
> >
> > that isn't to say that the mailbox API is fixed. it's in need of a
> > review. there's a lot of unintuitive naming and lots is not javadoc'd.
> > i suspect that as you implement you'll find issues
> > and inefficiencies.
>
> Yeah, i found some. For example the
> getMailboxManager(user).createInbox(user) call. Or
> something like that.

a good example is IMAP SELECT.
this requires 8 separate calls to the mailbox API each of which makes
multiple calls to the database. several of these calls require multiple
table joins. SELECT is very slow.

i would prefer an API with a single call which returns meta-data about
the mailbox using a fetch group to specify what data is required.

- robert
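
as a rough illustration of the single-call idea for SELECT described
above, here is a minimal Java sketch. every name in it (MailboxField,
MailboxMetaData, MailboxMetaDataSource, InMemorySource) is hypothetical
and does not exist in the current code; the stored values are
placeholders. the fields mirror the data an IMAP SELECT response
reports (EXISTS, RECENT, first unseen, UIDVALIDITY, UIDNEXT). the
in-memory class only stands in for a real backend so the round-trip
count can be observed.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Set;

// Hypothetical fetch group members, one per item in a SELECT response.
enum MailboxField { EXISTS, RECENT, FIRST_UNSEEN, UID_VALIDITY, UID_NEXT }

// Result object: contains only the fields that were requested.
final class MailboxMetaData {
    private final Map<MailboxField, Long> values = new EnumMap<>(MailboxField.class);

    MailboxMetaData(Map<MailboxField, Long> fetched) {
        values.putAll(fetched);
    }

    boolean has(MailboxField field) {
        return values.containsKey(field);
    }

    long get(MailboxField field) {
        Long value = values.get(field);
        if (value == null) {
            throw new IllegalStateException(field + " was not in the fetch group");
        }
        return value;
    }
}

// Hypothetical API: one call, the caller states what it needs.
interface MailboxMetaDataSource {
    MailboxMetaData select(String mailboxName, Set<MailboxField> fetchGroup);
}

// In-memory stand-in for a backend; a real one could map the fetch
// group onto a single query. The mailbox name is ignored here.
final class InMemorySource implements MailboxMetaDataSource {
    private final Map<MailboxField, Long> stored = new EnumMap<>(MailboxField.class);
    int roundTrips = 0;

    InMemorySource() {
        // Placeholder values for illustration only.
        stored.put(MailboxField.EXISTS, 3L);
        stored.put(MailboxField.RECENT, 1L);
        stored.put(MailboxField.FIRST_UNSEEN, 2L);
        stored.put(MailboxField.UID_VALIDITY, 1L);
        stored.put(MailboxField.UID_NEXT, 16L);
    }

    @Override
    public MailboxMetaData select(String mailboxName, Set<MailboxField> fetchGroup) {
        roundTrips++; // one backend round trip per SELECT
        Map<MailboxField, Long> result = new EnumMap<>(MailboxField.class);
        for (MailboxField field : fetchGroup) {
            result.put(field, stored.get(field));
        }
        return new MailboxMetaData(result);
    }
}
```

a caller that only needs the message count and next UID would pass
EnumSet.of(MailboxField.EXISTS, MailboxField.UID_NEXT) and get both
back from one call; fields left out of the fetch group are never
loaded.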
