Fabian Kreitner <[EMAIL PROTECTED]> writes: > That is what I read too and is why Im confused that the index is indeed > executing faster. Can this be a problem with the hardware and/or postgress > installation?
I think the actual issue here is that you are executing the EXISTS subplan over and over, once for each outer row. The planner's cost estimate for EXISTS is based on the assumption that you do it once ... in which scenario the seqscan very possibly is cheaper. However, when you do the EXISTS subplan over and over for many outer rows, you get a savings from the fact that the index and table pages soon get cached in memory. The seqscan plan gets a savings too, since the table is small enough to fit in memory, but once everything is in memory the indexscan plan is faster.
There's been some discussion on pghackers about how to teach the planner to account for repeated executions of subplans, but we have not come up with a good solution yet.
For the moment, what people tend to do if they know their database is small enough to mostly stay in memory is to reduce random_page_cost to make the planner favor indexscans. If you know the database is entirely cached then the theoretically correct value of random_page_cost is 1.0 (since fetching any page will cost the same, if it's all in RAM). I'd recommend against adopting that as a default, but a lot of people find that setting it to 2.0 or so seems to model their situation better than the out-of-the-box 4.0.
Thanks for the explanation :)
However .... :(
perg_1097=# vacuum analyze; VACUUM perg_1097=# set random_page_cost to 1.0; SET VARIABLE perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ perg_1097-# from notiz_objekt a perg_1097-# where not exists perg_1097-# ( perg_1097(# select 1 perg_1097(# from notiz_gelesen b perg_1097(# where ma_id = 2001 perg_1097(# and ma_pid = 1097 perg_1097(# and a.notiz_id = b.notiz_id perg_1097(# ) perg_1097-# ; NOTICE: QUERY PLAN:
Seq Scan on notiz_objekt a (cost=0.00..56125.80 rows=15561 width=12) (actual time=0.27..2299.09 rows=31122 loops=1)
SubPlan
-> Seq Scan on notiz_gelesen b (cost=0.00..1.79 rows=1 width=0) (actual time=0.07..0.07 rows=0 loops=31122)
Total runtime: 2328.05 msec
EXPLAIN perg_1097=#
...
perg_1097=# set enable_seqscan to false; SET VARIABLE perg_1097=# set random_page_cost to 1.0; SET VARIABLE perg_1097=# EXPLAIN ANALYZE select notiz_id, obj_id, obj_typ perg_1097-# from notiz_objekt a perg_1097-# where not exists perg_1097-# ( perg_1097(# select 1 perg_1097(# from notiz_gelesen b perg_1097(# where ma_id = 2001 perg_1097(# and ma_pid = 1097 perg_1097(# and a.notiz_id = b.notiz_id perg_1097(# ) perg_1097-# ; NOTICE: QUERY PLAN:
Seq Scan on notiz_objekt a (cost=100000000.00..100093380.36 rows=15561 width=12) (actual time=0.07..550.07 rows=31122 loops=1)
SubPlan
-> Index Scan using idx_notiz_gelesen_2 on notiz_gelesen b (cost=0.00..2.98 rows=1 width=0) (actual time=0.01..0.01 rows=0 loops=31122)
Total runtime: 582.90 msec
EXPLAIN perg_1097=#
Even with a random page cost of 1 it thinks using the index should/could take significantly longer which it doesnt for some reason :-/
---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives?
http://archives.postgresql.org