I don't even know where to begin in terms of designing the delivery chain
around transactions. Could we do it as simply as adding functions...
void db_begin_transaction(void);
void db_flush_transaction(void); (or db_commit_...?)
and then calling these functions before and after each major section of
database code? For the delivery chain, we could do it inside of
insert_message(). For dbmail-smtp, this basically means that the execution of
the whole program is within one transaction. For dbmail-lmtpd, it means that
each message is delivered within a transaction but the miscellaneous queries
before the main message delivery chain are not transacted. For MySQL, these
functions would be noops.
Thing that might work?
Aaron
Thomas Mueller <[EMAIL PROTECTED]> said:
> Hi Aaron,
>
> > Do you have any way of narrowing this down to specific queries that are
> > taking the longest and/or are being executed the most? That would
> > identify which low-level database functions are being called, then we
> > can just trace our way up the call chain to see who's misbehaving or
> > acting on a flawed design. Also, if you could run similar tests against
> > the latest 1.2, it would help to give a frame of reference, particularly
> > for my delivery chain design.
>
> That's simple: the main design flaw (actually that's no design flaw I
> think, that's because the so called database MySQL couldn't do
> transactions in the past) is that dbmail doesn't use transaction.
>
> Because of that AutoCommit is used and whenever dbmail does anySqlQuery
> Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is terribly
> slow. To ensure the Durability in ACID the database has to fflush()
> every transaction to stable storage!
> That's why there is only one solution: we have to use transaction.
>
> With transactions we could remove the integrity checks of
> dbmail-maintenance too, because the database guarantees integrity.
>
> Anyway, I did a trace of all SQL queries when a mail is copied using
> IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
> one mail?).
>
> When searching for the sequential scan I found something quite
> interesting in the docs: the planer decides for every scan if a seqScan
> is cheaper that an index scan, and does a seqScan even if an index
> exists:
>
> dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr=2;
> QUERY PLAN
> ---------------------------------------------------------
> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> Filter: (owner_idnr = 2)
> (2 rows)
>
> The table has an index on owner_idnr.
>
> So I should repeat this test with a database with several hundred to
> thousand user, several dozen mailboxes for each user and several dozen
> mails in each mailbox to find out if all required indizes are there.
> Did anyone write a script to create such a database?
>
> But I found a strange query:
> SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> owner_idnr = '2'
> mailbox_idnr is the primary key so that could be optimized to:
> SELECT 4
> ;-)
>
>
> I don't have a 1.2 installation, I'm sorry.
>
>
> --
> MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
> _______________________________________________
> Dbmail-dev mailing list
> [email protected]
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>
--