At 20:12 17.07.2003, Tom Lane wrote:
Fabian Kreitner <[EMAIL PROTECTED]> writes:
> That is what I read too and is why Im confused that the index is indeed
> executing faster. Can this be a problem with the hardware and/or postgress
> installation?

I think the actual issue here is that you are executing the EXISTS
subplan over and over, once for each outer row.  The planner's cost
estimate for EXISTS is based on the assumption that you do it once
... in which scenario the seqscan very possibly is cheaper.  However,
when you do the EXISTS subplan over and over for many outer rows, you
get a savings from the fact that the index and table pages soon get
cached in memory.  The seqscan plan gets a savings too, since the table
is small enough to fit in memory, but once everything is in memory the
indexscan plan is faster.

There's been some discussion on pghackers about how to teach the planner
to account for repeated executions of subplans, but we have not come up
with a good solution yet.

For the moment, what people tend to do if they know their database is
small enough to mostly stay in memory is to reduce random_page_cost to
make the planner favor indexscans.  If you know the database is entirely
cached then the theoretically correct value of random_page_cost is 1.0
(since fetching any page will cost the same, if it's all in RAM).  I'd
recommend against adopting that as a default, but a lot of people find
that setting it to 2.0 or so seems to model their situation better than
the out-of-the-box 4.0.

Thanks for the explanation :)



However .... :(


perg_1097=# vacuum analyze;
VACUUM
perg_1097=# set random_page_cost to 1.0;
SET VARIABLE
perg_1097=#  EXPLAIN ANALYZE    select  notiz_id, obj_id, obj_typ
perg_1097-#       from    notiz_objekt a
perg_1097-#       where not exists
perg_1097-#      (
perg_1097(#        select  1
perg_1097(#        from    notiz_gelesen b
perg_1097(#        where   ma_id  = 2001
perg_1097(#        and     ma_pid = 1097
perg_1097(#        and     a.notiz_id = b.notiz_id
perg_1097(#      )
perg_1097-#   ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a (cost=0.00..56125.80 rows=15561 width=12) (actual time=0.27..2299.09 rows=31122 loops=1)
SubPlan
-> Seq Scan on notiz_gelesen b (cost=0.00..1.79 rows=1 width=0) (actual time=0.07..0.07 rows=0 loops=31122)
Total runtime: 2328.05 msec


EXPLAIN
perg_1097=#

...

perg_1097=# set enable_seqscan to false;
SET VARIABLE
perg_1097=# set random_page_cost to 1.0;
SET VARIABLE
perg_1097=#  EXPLAIN ANALYZE    select  notiz_id, obj_id, obj_typ
perg_1097-#       from    notiz_objekt a
perg_1097-#       where not exists
perg_1097-#      (
perg_1097(#        select  1
perg_1097(#        from    notiz_gelesen b
perg_1097(#        where   ma_id  = 2001
perg_1097(#        and     ma_pid = 1097
perg_1097(#        and     a.notiz_id = b.notiz_id
perg_1097(#      )
perg_1097-#   ;
NOTICE:  QUERY PLAN:

Seq Scan on notiz_objekt a (cost=100000000.00..100093380.36 rows=15561 width=12) (actual time=0.07..550.07 rows=31122 loops=1)
SubPlan
-> Index Scan using idx_notiz_gelesen_2 on notiz_gelesen b (cost=0.00..2.98 rows=1 width=0) (actual time=0.01..0.01 rows=0 loops=31122)
Total runtime: 582.90 msec


EXPLAIN
perg_1097=#


Even with a random page cost of 1 it thinks using the index should/could take significantly longer which it doesnt for some reason :-/



---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Reply via email to