Paul J Stevens <[EMAIL PROTECTED]> said: >>>> This structure will make it very easy to query all the headers from a >>>> given message or find all the messages with a given header, or a >>>> given header value. It also leaves our current structure intact >>>> which will make it easier to phase in. > > It will work, but you still require a lookup on an extra table to do > header searches. There has to be hidden cost there. Also the added > db-interaction at insertion time is very significant.
We could probably do it in one query: SELECT header, header_idnr FROM dbmail_headers WHERE header IN ('To', 'From', 'Subject', etc...) [snip] > preparse certain headers for common attributes: the in-reply-to and > references headers will be used for threading. These headers contain > one or more message-id header values, which are stored separately. > With this approach building message-threads can be done with fully > indexed, single-table queries! Of course a union with the msgids from > the opened mailbox is still required, but you can't beat such a setup > wrt threading. Neato! [snip] >>>> What do you think? I don't think we need to special case any headers >>>> not even sendername or subject. > > The more I think about this, the more convinced I am that we do need to > treat common headers differently in the end. And your suggestion to > preseed the header_names table and hardcode their ids tells me we're > agreed here. I'm skeptical. If we can remain generic at similar cost, let's go generic. Performance numbers showing that special casing is *significantly* faster should be a hard requirement here. [snip] > If I want to store mime-part headers the same as message-headers, first > thing I have to do is change the physmessage_id references in the header > tables to messageblk_idnr references. Next I would have to store mime-part > headers same as message-headers. Is that going to be smart enough to know *which* mime part is in which messageblk? I'm still only getting a mental picture of knowing which messageblks correspond to which physmessage and not the association of mime parts to messageblks. Wouldn't we need one more layer of relation table? > Finally I would have to change the messageblks model > > from: > > block[0]: message-header > block[1]: message-body[slice1] > block[N]: message-body[sliceN] > > to > > block[0]: raw message-headers > block[1]: mime-part[0] > block[N]: mime-part[N] block[0]: raw message-headers block[1]: mime-part[0][slice1] block[N]: etc. If we decide to keep the slicing model for large mime parts. > Where the only really tricky part is converting an existing dbmail > storage to this modified approach. Still, way cool stuff. Reparse everything! It'll be fun to write the migration tool :-P Aaron