I saw a post that looked similar to a PostgreSQL IMAP performance issue I have
experienced. I have a relatively small system (< 15 GB of mail) and there is
one query that makes moving messages between IMAP take several seconds each.
I'm in the process of moving several thousand messages so I'm not sure it'll
really finish.
The original query looks like this:
SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON
m.physmessage_id=p.id JOIN dbmail_headervalue v ON v.physmessage_id=p.id JOIN
dbmail_headername n ON v.headername_id=n.id WHERE mailbox_idnr = 1329 AND
status IN (0,1) AND headername ILIKE 'MESSAGE-ID' AND
SUBSTRING(headervalue,0,255) ILIKE '%<secretstuff>%' ORDER BY message_idnr;
Even though there are indexes on header value and header name they cannot be
used due to the ILIKE clause. The query plan looks like this:
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=35461.02..35461.02 rows=1 width=8) (actual time=1494.757..1494.757
rows=0 loops=1)
Sort Key: m.message_idnr
-> Nested Loop (cost=4.02..35461.01 rows=1 width=8) (actual
time=1494.741..1494.741 rows=0 loops=1)
-> Nested Loop (cost=4.02..35454.99 rows=1 width=24) (actual
time=1494.738..1494.738 rows=0 loops=1)
-> Nested Loop (cost=4.02..35449.05 rows=1 width=32) (actual
time=1494.736..1494.736 rows=0 loops=1)
Join Filter: ("outer".physmessage_id =
"inner".physmessage_id)
-> Seq Scan on dbmail_headervalue v (cost=0.00..35429.65
rows=1 width=16) (actual time=1415.541..1493.264 rows=1 loops=1)
Filter: ("substring"(headervalue, 0, 255) ~~*
'%<secretstuff>%'::text)
-> Bitmap Heap Scan on dbmail_messages m
(cost=4.02..19.35 rows=4 width=16) (actual time=0.264..1.169 rows=501 loops=1)
Recheck Cond: (((mailbox_idnr = 1329) AND (status =
0)) OR ((mailbox_idnr = 1329) AND (status = 1)))
-> BitmapOr (cost=4.02..4.02 rows=4 width=0)
(actual time=0.229..0.229 rows=0 loops=1)
-> Bitmap Index Scan on dbmail_messages_8
(cost=0.00..2.01 rows=2 width=0) (actual time=0.217..0.217 rows=501 loops=1)
Index Cond: ((mailbox_idnr = 1329) AND
(status = 0))
-> Bitmap Index Scan on dbmail_messages_8
(cost=0.00..2.01 rows=2 width=0) (actual time=0.009..0.009 rows=0 loops=1)
Index Cond: ((mailbox_idnr = 1329) AND
(status = 1))
-> Index Scan using dbmail_headername_pkey on dbmail_headername
n (cost=0.00..5.92 rows=1 width=8) (never executed)
Index Cond: ("outer".headername_id = n.id)
Filter: ((headername)::text ~~* 'MESSAGE-ID'::text)
-> Index Scan using dbmail_physmessage_pkey on dbmail_physmessage p
(cost=0.00..6.01 rows=1 width=8) (never executed)
Index Cond: ("outer".physmessage_id = p.id)
Total runtime: 1494.879 ms
(21 rows)
I modified the query and performance is stellar but I'm unsure of any
unforeseen consequences. Here's what I came up with:
SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON
m.physmessage_id=p.id JOIN dbmail_headervalue v ON v.physmessage_id=p.id JOIN
dbmail_headername n ON v.headername_id=n.id WHERE mailbox_idnr = 1329 AND
status IN (0,1) AND lower(headername) = 'message-id' AND lower(headervalue) =
lower('<secretstuff>') ORDER BY message_idnr;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1173.20..1173.20 rows=1 width=8) (actual time=15.222..15.223
rows=1 loops=1)
Sort Key: m.message_idnr
-> Nested Loop (cost=197.55..1173.19 rows=1 width=8) (actual
time=4.852..15.213 rows=1 loops=1)
-> Hash Join (cost=197.55..1167.16 rows=1 width=24) (actual
time=4.784..15.143 rows=1 loops=1)
Hash Cond: ("outer".headername_id = "inner".id)
-> Nested Loop (cost=176.04..1145.08 rows=114 width=32)
(actual time=4.667..15.024 rows=1 loops=1)
-> Bitmap Heap Scan on dbmail_messages m
(cost=4.02..19.35 rows=4 width=16) (actual time=0.219..1.283 rows=501 loops=1)
Recheck Cond: (((mailbox_idnr = 1329) AND (status =
0)) OR ((mailbox_idnr = 1329) AND (status = 1)))
-> BitmapOr (cost=4.02..4.02 rows=4 width=0)
(actual time=0.192..0.192 rows=0 loops=1)
-> Bitmap Index Scan on dbmail_messages_8
(cost=0.00..2.01 rows=2 width=0) (actual time=0.183..0.183 rows=501 loops=1)
Index Cond: ((mailbox_idnr = 1329) AND
(status = 0))
-> Bitmap Index Scan on dbmail_messages_8
(cost=0.00..2.01 rows=2 width=0) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: ((mailbox_idnr = 1329) AND
(status = 1))
-> Bitmap Heap Scan on dbmail_headervalue v
(cost=172.02..281.08 rows=28 width=16) (actual time=0.025..0.025 rows=0
loops=501)
Recheck Cond: ((v.physmessage_id =
"outer".physmessage_id) AND (lower(v.headervalue) = '<secretstuff>'::text))
-> BitmapAnd (cost=172.02..172.02 rows=28 width=0)
(actual time=0.024..0.024 rows=0 loops=501)
-> Bitmap Index Scan on dbmail_headervalue_2
(cost=0.00..48.88 rows=5681 width=0) (actual time=0.013..0.013 rows=36
loops=501)
Index Cond: (v.physmessage_id =
"outer".physmessage_id)
-> Bitmap Index Scan on dbmail_headervalue_3
(cost=0.00..122.89 rows=5681 width=0) (actual time=0.009..0.009 rows=1
loops=501)
Index Cond: (lower(headervalue) =
'<secretstuff>'::text)
-> Hash (cost=21.47..21.47 rows=13 width=8) (actual
time=0.106..0.106 rows=4 loops=1)
-> Bitmap Heap Scan on dbmail_headername n
(cost=2.05..21.47 rows=13 width=8) (actual time=0.083..0.098 rows=4 loops=1)
Recheck Cond: (lower((headername)::text) =
'message-id'::text)
-> Bitmap Index Scan on dbmail_headername_2
(cost=0.00..2.05 rows=13 width=0) (actual time=0.065..0.065 rows=4 loops=1)
Index Cond: (lower((headername)::text) =
'message-id'::text)
-> Index Scan using dbmail_physmessage_pkey on dbmail_physmessage p
(cost=0.00..6.01 rows=1 width=8) (actual time=0.065..0.067 rows=1 loops=1)
Index Cond: ("outer".physmessage_id = p.id)
Total runtime: 15.337 ms
(28 rows)
My primary concerns:
- Are there any reasons why we need to only look at the first 255 characters
of the header name?
- Are there any reasons why we would be looking for the <secretstuff> string
in the middle of a header value?
If those are unfounded then I think this update makes a lot of sense. Any
thoughts on whether I can make this change safely?
Thanks,
Tim Mattison
[email protected]
_______________________________________________
Dbmail-dev mailing list
[email protected]
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev