Geo Carncross wrote:

> The dbmail_messages_physmessage_idx isn't the best index to use.
> 
> You have several options:
> 
> put extra columns in dbmail_physmessage that reference the message_idnr
> and mailbox_idnr (and index them WITH the dbmail_physmessage.id) -- THEN
> adapt the query to look for these fields (e.g. AND pm.message_idnr
> BETWEEN 1 AND 417966 AND msg.message_idnr BETWEEN 1 AND 417966)
> 
> This is ugly. But it's the fastest for Pg.

Very ugly yes, I prefer normalized dbs (especially when we don't use
transactions for changes :-/).

> Another option (probably would help other dbs too) is to use two
> subselects - one on dbmail_messages and one on dbmail_physmessage. This
> way, the best index for dbmail_messages can be used-
> 
> CREATE INDEX dbmail_messages_wide ON dbmail_messages
> (mailbox_idnr,message_idnr,status);
> CREATE INDEX dbmail_messages_wider ON dbmail_messages
> (mailbox_idnr,message_idnr,status,physmessage_id);
> 
> I'd have to experiment to find out which one would generate the better
> query plan -- only one of those two indexes should actually be used.
> 
> The query then should look like this:
[..]

That one is impressive too! More than 3 times faster than the original
query!
It doesn't matter if I add one of the indizes, the plan is always the same:

 Nested Loop IN Join  (cost=10637.41..16500.58 rows=37 width=36) (actual
time=169.007..13173.373 rows=1827 loops=1)
   ->  Merge Join  (cost=10637.41..14704.79 rows=2748 width=52) (actual
time=156.031..5950.658 rows=1827 loops=1)
         Merge Cond: ("outer".id = "inner".id)
         ->  Index Scan using dbmail_physmessage_pkey on
dbmail_physmessage  (cost=0.00..3842.59 rows=73428 width=24) (actual
time=12.297..5427.278 rows=103804 loops=1)
         ->  Sort  (cost=10637.41..10644.27 rows=2747 width=28) (actual
time=62.109..70.968 rows=1827 loops=1)
               Sort Key: a.id
               ->  Subquery Scan a  (cost=10446.17..10480.50 rows=2747
width=28) (actual time=31.461..53.653 rows=1827 loops=1)
                     ->  Sort  (cost=10446.17..10453.03 rows=2747
width=28) (actual time=31.441..37.874 rows=1827 loops=1)
                           Sort Key: message_idnr
                           ->  Index Scan using
dbmail_messages_tmm_status_recent_idx on dbmail_messages
(cost=0.00..10289.26 rows=2747 width=28) (actual time=0.084..21.449
rows=1827 loops=1)
                                 Index Cond: (mailbox_idnr = 81::bigint)
                                 Filter: ((message_idnr >= 1::bigint)
AND (message_idnr <= 417966::bigint) AND ((status = 0::smallint) OR
(status = 1::smallint)))
   ->  Index Scan using dbmail_messages_physmessage_idx on
dbmail_messages  (cost=0.00..4.35 rows=1 width=8) (actual
time=3.904..3.904 rows=1 loops=1827)
         Index Cond: ("outer".id = dbmail_messages.physmessage_id)
         Filter: ((message_idnr >= 1::bigint) AND (message_idnr <=
417966::bigint) AND (mailbox_idnr = 81::bigint) AND ((status =
0::smallint) OR (status = 1::smallint)))
 Total runtime: 13183.619 ms

dbmail_messages_tmm_status_recent_idx is the index I added yesterday
because of your improved query.

dbmail=# \d dbmail_messages_tmm_status_recent_idx
Index "public.dbmail_messages_tmm_status_recent_idx"
    Column    |   Type
--------------+----------
 mailbox_idnr | bigint
 status       | smallint
 recent_flag  | smallint
btree, for table "public.dbmail_messages"

That is much better but still very slow - I don't understand why
PostgreSQL is that slow here?


Thomas
-- 
http://www.tmueller.com for pgp key (95702B3B)

Reply via email to