[ADMIN] optimizer behavior in the case of highly updated tables

Mark Rostron Tue, 08 Jun 2010 20:19:53 -0700

I am looking for some specific information regarding optimizer behavior.
We recently experienced a situation where a query that was previously using a 
btree lookup (efficient) SWITCHED to using seqscan/hash lookup.


My questions would be:


-          Under what circumstances is the optimizer likely to CHANGE behavior 
from using a btree index lookup to using a seq scan/hash lookup?

-          What are the critical decision factors that would feed into the 
optimizer making such a change?

-          Is it possible to measure any metrics in a way that would enable a 
prediction of such a change?

Platform


-          8.3.10 (64bit) on RHEL5.

-          Linux xxxxx 2.6.18-164.10.1.el5xen #1 SMP Thu Jan 7 20:28:30 EST 
2010 x86_64 x86_64 x86_64 GNU/Linux

Application
The table in question is:

-          30m rows, variable length (contains varchar fields), rowlength avg 
about 120B

-          Approx. 3m unique values in the index column

Activity on the table would be, per row:  "Insert, multiple updates, delete 
after 90 days"

We vacuum analyze this table once/weekly.
No partitions are used.

Our experience which prompts this question was as follows:

-          If the table is not "vacuum analyze'd" at least once/week, the query 
plans become unpredictable as to whether they will use btree or seqscan/hash 
lookup

-          Until last week, "vacuum analyze" was sufficient

-          Friday evening of last week, the query plan for selected queries 
against this index changed again, but "vacuum analyze" was insufficient

-          Rebuilding index on primary key and on the column index was 
insufficient

-          It was necessary to take a site outage and perform a "vacuum full 
analyze" on the table

-          Following this, the query plan reverted to the more efficient btree 
lookup

Clearly, the garbage buildup resulting from transaction activity on the table 
is the villain here.

-          Is it possible to calculate expected space usage given row count and 
average row size

-          At what point might the ratio of "expected"/"actual" space usage be 
able to indicate the need to perform "full vacuum", or similar maintenance

Any observations/comments that anyone would care to make are welcome.
Thanks in advance for your time
Mr

[ADMIN] optimizer behavior in the case of highly updated tables

Reply via email to