Eric Soroos wrote:


I believe your choice is better, in that, it requires many less changes to the underlying db_ methods, so I'll back mine out and proceed with yours. I
may need one new db_ method to add a user mailbox to an existing message.


I think there's a way to do this entirely in the the database layer, at least for postgresql. Mysql would have to wait for 5.0 I think.

Essentially, fingerprint each message block with a crypto hash, and store that as a unique key in a table. If there's duplicate on insert, the rule would just swallow the row and return the key. A view could be put together for the current messageblks table, but for most of the access, its not really a useful construct. Or maybe it is depending on how well pg pushes the query clauses down into the view definition.

snip

There are a few keys here:
    there is a copy of the message flags row for every message*recipients.
there is a copy of the mapping between the message_idnr and the blocks for every message*recipient.
    there is one copy of any identical block.

So this would support different header blocks attached to cc/bcc'd messages where the body is the same. It would support multiple independent delivery. It would support fully identical delivery as well.

eric


Silly question,
Why can't you store the msg_block fingerprint in the msg_block table? Like (with mysql syntax)
CREATE TABLE messageblks (
        messageblk_idnr bigint(21) NOT NULL auto_increment,
        physmessage_id bigint(21) NOT NULL default '0',
        messageblk longtext NOT NULL,
        blocksize bigint(21) NOT NULL default '0',
===> fingerprint bigint(21),
        PRIMARY KEY  (messageblk_idnr),
        KEY physmsg_index (physmessage_id),
===> KEY (fingerprint),
        FOREIGN KEY (`physmessage_id`) REFERENCES `physmessage`
                (`id`) ON DELETE CASCADE
===> UNIQUE (messageblock, fingerprint),
        ) TYPE=InnoDB;

The above is from http://cvs.dbmail.org/cgi-bin/cvsweb.cgi/~checkout~/dbmail/sql/mysql/create_tables_innoDB.mysql?rev=1.4.4.4&content-type=text/plain
I have marked my additional lines with "===>"
Is it more efficient to make searches in a table with nonvariant record sizes?

/Magnus



Reply via email to