Re: document the need to analyze partitioned tables

Andrey Lepikhov Wed, 05 Oct 2022 23:02:21 -0700

On 10/5/22 13:37, Laurenz Albe wrote:

On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).


I'd like to suggest an improvement to this.  The current wording could
be read to mean that dead tuples won't get cleaned up in partitioned tables.


By the way, where are the statistics of a partitioned tables used?  The actual
tables scanned are always the partitions, and in the execution plans that
I have seen, the optimizer always used the statistics of the partitions.

For example, it is used to estimate selectivity of join clause:

CREATE TABLE test (id integer, val integer) PARTITION BY hash (id);
CREATE TABLE test_0 PARTITION OF test
  FOR VALUES WITH (modulus 2, remainder 0);
CREATE TABLE test_1 PARTITION OF test
  FOR VALUES WITH (modulus 2, remainder 1);

INSERT INTO test (SELECT q, q FROM generate_series(1,10) AS q);
VACUUM ANALYZE test;
INSERT INTO test (SELECT q, q%2 FROM generate_series(11,200) AS q);
VACUUM ANALYZE test_0,test_1;

EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;
VACUUM ANALYZE test;
EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;

Here without actual statistics on parent table we make wrong prediction.

--
Regards
Andrey Lepikhov
Postgres Professional

Re: document the need to analyze partitioned tables

Reply via email to