I'm trying to port our application from MS-SQL to Postgres. We have
implemented all of our rather complicated application security in the
database. The query that follows takes a half of a second or less on
MS-SQL server and around 5 seconds on Postgres. My concern is that this
data set is rather "small" by our applications standards. It is not
unusual for the da_answer table to have 2-4 million records. I'm
worried that if this very small data set is taking 5 seconds, then a
"regular sized" data set will take far too long.
I originally thought the NOT EXISTS on the
"da_data_restrict_except_open" table was killing performance, but the
query took the exact same amount of time after I deleted all rows from
this table. Note that the hard-coded 999999999.0, and 4000 parameters,
as well as the parameter to svp_getparentproviders are the three
variables that change from one run of this query to the next.
I'm using Postgres 7.4.5 as packaged in Debian. shared_buffers is set
to 57344 and sort_mem=4096.
The machine has an AMD 1.8+ and ` gig of RAM. Here are some relevant
performance statistics:
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmmax
536870912
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmall
536870912
richard:/home/richard# hdparm -tT /dev/hda
Timing cached reads: 1112 MB in 2.00 seconds = 556.00 MB/sec
Timing buffered disk reads: 176 MB in 3.02 seconds = 58.28 MB/sec
I have included an EXPLAIN ANALYZE, relevant table counts, and relevant
indexing information. If anyone has any suggestions on how to improve
performance.... TIA!
SELECT tab.answer_id, client_id, question_id, recordset_id,
date_effective, virt_field_name
FROM
(
SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
date_effective
FROM da_answer a
WHERE a.date_effective <= 9999999999.0
AND a.inactive != 1
AND
(
5000 = 4000
OR
(EXISTS (SELECT * FROM svp_getparentproviderids(1) WHERE
svp_getparentproviderids = a.provider_id))
)
UNION
SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
date_effective
FROM da_answer a,
(
SELECT main_id
FROM da_data_restrict
WHERE type_id = 2
AND (provider_id IN (SELECT * FROM svp_getparentproviderids(1)))
UNION
SELECT sa.uid AS main_id
FROM da_answer sa
JOIN da_data_restrict_except_closed dr ON dr.main_id =
sa.uid AND dr.type_id = 2 AND dr.except_provider_id = 1
WHERE (restricted = 1)
AND (restricted_closed_except = 1)
AND sa.covered_by_roi = 1
UNION
SELECT sa.uid AS main_id
FROM da_answer sa
WHERE (restricted = 0)
AND (restricted_open_except = 1)
AND (NOT EXISTS (SELECT dr.main_id FROM
da_data_restrict_except_open dr WHERE (dr.main_id = sa.uid) AND
(dr.type_id = 2) AND (dr.except_provider_id in (select * from
svp_getparentproviderids(1)))))
AND sa.covered_by_roi = 1
UNION
SELECT sa.uid AS main_id FROM da_answer sa WHERE (restricted
= 0) AND (restricted_open_except = 0)
AND sa.covered_by_roi = 1
) sec
WHERE a.covered_by_roi = 1
AND a.date_effective <= 9999999999.0
AND a.inactive != 1
AND a.uid = sec.main_id
AND 5000 > 4000
) tab, da_question q
WHERE tab.question_id = q.uid AND (min_access_level <= 4000 OR
min_access_level IS NULL)
Table counts from relevant tables
da_question 1095
da_answer 21117
da_question 1095
da_data_restrict_except_closed 3087
da_data_restrict_except_open 13391
svp_getparentproviderids(1) 1
Relevant Index
create index in_da_data_restrict_provider_id on
da_data_restrict(provider_id);
create index in_da_data_restrict_main_id on da_data_restrict(main_id);
create index in_da_data_restrict_type_id on da_data_restrict(type_id);
create index in_da_data_restrict_client_id on
da_data_restrict(client_id);
create index in_da_dr_type_provider on
da_data_restrict(type_id,provider_id);
create index in_da_data_rec_provider_id ON
da_data_restrict_except_closed(provider_id);
create index in_da_data_rec_type_id ON
da_data_restrict_except_closed(type_id);
create index in_da_data_rec_main_id ON
da_data_restrict_except_closed(main_id);
create index in_da_data_rec_except_provider_id ON
da_data_restrict_except_closed(except_provider_id);
create index in_da_data_reo_provider_id ON
da_data_restrict_except_open(provider_id);
create index in_da_data_reo_type_id ON
da_data_restrict_except_open(type_id);
create index in_da_data_reo_main_id ON
da_data_restrict_except_open(main_id);
create index in_da_data_reo_except_provider_id ON
da_data_restrict_except_open(except_provider_id);
create index in_da_answer_client_id ON da_answer(client_id);
create index in_da_answer_provider_id ON da_answer(provider_id);
create index in_da_answer_question_id ON da_answer(question_id);
create index in_da_answer_recordset_id ON da_answer(recordset_id);
create index in_da_answer_restricted ON da_answer(restricted);
create index in_da_answer_restricted_open_except ON
da_answer(restricted_open_except);
create index in_da_answer_restricted_closed_except ON
da_answer(restricted_closed_except);
create index in_da_answer_date_effective ON da_answer(date_effective);
create index in_da_answer_inactive ON da_answer(inactive);
create index in_da_answer_covered_by_roi ON da_answer(covered_by_roi);
create index in_da_ed_inactive_roi ON da_answer(date_effective,inactive,
covered_by_roi);
create index in_da_question_mal ON da_question(min_access_level);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1054186.23..1054631.36 rows=4496 width=67) (actual
time=4902.250..4979.060 rows=7653 loops=1)
Hash Cond: ("outer".question_id = "inner".uid)
-> Subquery Scan tab (cost=1054123.62..1054457.09 rows=13339 width=24)
(actual time=4896.963..4933.896 rows=7657 loops=1)
-> Unique (cost=1054123.62..1054323.70 rows=13339 width=24) (actual
time=4896.948..4915.498 rows=7657 loops=1)
-> Sort (cost=1054123.62..1054156.96 rows=13339 width=24)
(actual time=4896.944..4903.402 rows=7717 loops=1)
Sort Key: answer_id, client_id, question_id, recordset_id,
date_effective
-> Append (cost=0.00..1053209.67 rows=13339 width=24)
(actual time=279.091..4841.605 rows=7717 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=0.00..64034.15
rows=10540 width=24) (actual time=279.089..4419.371 rows=161 loops=1)
-> Seq Scan on da_answer a
(cost=0.00..63928.75 rows=10540 width=24) (actual time=279.080..4418.808
rows=161 loops=1)
Filter: ((date_effective <=
9999999999::double precision) AND (inactive <> 1) AND (subplan))
SubPlan
-> Function Scan on
svp_getparentproviderids (cost=0.00..15.00 rows=5 width=4) (actual
time=0.203..0.203 rows=0 loops=21089)
Filter:
(svp_getparentproviderids = $1)
-> Subquery Scan "*SELECT* 2"
(cost=988627.58..989175.52 rows=2799 width=24) (actual time=290.730..417.720
rows=7556 loops=1)
-> Hash Join (cost=988627.58..989147.53
rows=2799 width=24) (actual time=290.722..395.739 rows=7556 loops=1)
Hash Cond: ("outer".main_id =
"inner".uid)
-> Subquery Scan sec
(cost=987913.23..988002.59 rows=5957 width=4) (actual time=203.862..225.462
rows=7567 loops=1)
-> Unique
(cost=987913.23..987943.02 rows=5957 width=4) (actual time=203.851..215.834
rows=7567 loops=1)
-> Sort
(cost=987913.23..987928.12 rows=5957 width=4) (actual time=203.843..207.273
rows=7567 loops=1)
Sort Key: main_id
-> Append
(cost=160.61..987539.72 rows=5957 width=4) (actual time=35.798..187.293
rows=7567 loops=1)
-> Subquery
Scan "*SELECT* 1" (cost=160.61..164.53 rows=46 width=4) (actual
time=35.796..35.923 rows=35 loops=1)
-> Hash
Join (cost=160.61..164.07 rows=46 width=4) (actual time=35.791..35.868 rows=35
loops=1)
Hash
Cond: ("outer".svp_getparentproviderids = "inner".provider_id)
->
HashAggregate (cost=15.00..15.00 rows=200 width=4) (actual time=0.491..0.492
rows=1 loops=1)
-> Function Scan on svp_getparentproviderids (cost=0.00..12.50 rows=1000
width=4) (actual time=0.473..0.474 rows=1 loops=1)
->
Hash (cost=145.50..145.50 rows=46 width=8) (actual time=35.256..35.256 rows=0
loops=1)
-> Index Scan using in_da_dr_type_provider on da_data_restrict
(cost=0.00..145.50 rows=46 width=8) (actual time=0.041..26.627 rows=7280
loops=1)
Index Cond: (type_id = 2)
-> Subquery
Scan "*SELECT* 2" (cost=0.00..53.69 rows=1 width=4) (actual time=0.030..0.030
rows=0 loops=1)
-> Nested
Loop (cost=0.00..53.68 rows=1 width=4) (actual time=0.028..0.028 rows=0
loops=1)
->
Index Scan using in_da_data_rec_except_provider_id on
da_data_restrict_except_closed dr (cost=0.00..50.65 rows=1 width=4) (actual
time=0.026..0.026 rows=0 loops=1)
Index Cond: (except_provider_id = 1)
Filter: (type_id = 2)
->
Index Scan using da_answer_pkey on da_answer sa (cost=0.00..3.02 rows=1
width=4) (never executed)
Index Cond: ("outer".main_id = sa.uid)
Filter: ((restricted = 1) AND (restricted_closed_except = 1) AND
(covered_by_roi = 1))
-> Subquery
Scan "*SELECT* 3" (cost=0.00..986638.62 rows=678 width=4) (actual
time=0.346..77.393 rows=1841 loops=1)
-> Index
Scan using in_da_answer_restricted_open_except on da_answer sa
(cost=0.00..986631.84 rows=678 width=4) (actual time=0.342..74.614 rows=1841
loops=1)
Index Cond: (restricted_open_except = 1)
Filter: ((restricted = 0) AND (covered_by_roi = 1) AND (NOT (subplan)))
SubPlan
->
Nested Loop IN Join (cost=0.00..227.09 rows=1 width=4) (actual
time=0.026..0.026 rows=0 loops=1841)
Join Filter: ("outer".except_provider_id = "inner".svp_getparentproviderids)
-> Index Scan using in_da_data_reo_main_id on da_data_restrict_except_open
dr (cost=0.00..212.09 rows=1 width=8) (actual time=0.009..0.014 rows=2
loops=1841)
Index Cond: (main_id = $0)
Filter: (type_id = 2)
-> Function Scan on svp_getparentproviderids (cost=0.00..12.50 rows=1000
width=4) (actual time=0.002..0.003 rows=1 loops=3793)
-> Subquery
Scan "*SELECT* 4" (cost=0.00..682.87 rows=5232 width=4) (actual
time=0.064..69.107 rows=5691 loops=1)
-> Seq
Scan on da_answer sa (cost=0.00..630.55 rows=5232 width=4) (actual
time=0.059..61.052 rows=5691 loops=1)
Filter: ((restricted = 0) AND (restricted_open_except = 0) AND (covered_by_roi
= 1))
-> Hash (cost=630.55..630.55 rows=9922
width=24) (actual time=86.699..86.699 rows=0 loops=1)
-> Seq Scan on da_answer a
(cost=0.00..630.55 rows=9922 width=24) (actual time=0.043..73.232 rows=10062
loops=1)
Filter: ((covered_by_roi =
1) AND (date_effective <= 9999999999::double precision) AND (inactive <> 1))
-> Hash (cost=61.69..61.69 rows=369 width=47) (actual time=5.241..5.241
rows=0 loops=1)
-> Seq Scan on da_question q (cost=0.00..61.69 rows=369 width=47)
(actual time=0.026..4.071 rows=1087 loops=1)
Filter: ((min_access_level <= 4000) OR (min_access_level IS
NULL))
Total runtime: 4986.508 ms
(58 rows)
count
-------
21117
(1 row)
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match